| author | Juan Marín Noguera <juan@mnpi.eu> | 2025-05-16 22:18:44 +0200 |
|---|---|---|
| committer | Juan Marín Noguera <juan@mnpi.eu> | 2025-05-16 22:18:44 +0200 |
| commit | 4f670b750af5c11e1eac16d9cd8556455f89f46a (patch) | |
| tree | e0f8d7b33df2727d89150f799ee8628821fda80a /3.3.1.lyx | |
| parent | 16ccda6c459c0fd7ca2081e9d541124c28b0c556 (diff) | |
Changed layout for more manageable volumes
Diffstat (limited to '3.3.1.lyx')
| -rw-r--r-- | 3.3.1.lyx | 608 |
1 files changed, 0 insertions, 608 deletions
diff --git a/3.3.1.lyx b/3.3.1.lyx deleted file mode 100644 index 6fdc64d..0000000 --- a/3.3.1.lyx +++ /dev/null @@ -1,608 +0,0 @@ -#LyX 2.4 created this file. For more info see https://www.lyx.org/ -\lyxformat 620 -\begin_document -\begin_header -\save_transient_properties true -\origin unavailable -\textclass book -\begin_preamble -\input defs -\end_preamble -\use_default_options true -\maintain_unincluded_children no -\language english -\language_package default -\inputencoding utf8 -\fontencoding auto -\font_roman "default" "default" -\font_sans "default" "default" -\font_typewriter "default" "default" -\font_math "auto" "auto" -\font_default_family default -\use_non_tex_fonts false -\font_sc false -\font_roman_osf false -\font_sans_osf false -\font_typewriter_osf false -\font_sf_scale 100 100 -\font_tt_scale 100 100 -\use_microtype false -\use_dash_ligatures true -\graphics default -\default_output_format default -\output_sync 0 -\bibtex_command default -\index_command default -\float_placement class -\float_alignment class -\paperfontsize default -\spacing single -\use_hyperref false -\papersize default -\use_geometry false -\use_package amsmath 1 -\use_package amssymb 1 -\use_package cancel 1 -\use_package esint 1 -\use_package mathdots 1 -\use_package mathtools 1 -\use_package mhchem 1 -\use_package stackrel 1 -\use_package stmaryrd 1 -\use_package undertilde 1 -\cite_engine basic -\cite_engine_type default -\biblio_style plain -\use_bibtopic false -\use_indices false -\paperorientation portrait -\suppress_date false -\justification true -\use_refstyle 1 -\use_formatted_ref 0 -\use_minted 0 -\use_lineno 0 -\index Index -\shortcut idx -\color #008000 -\end_index -\secnumdepth 3 -\tocdepth 3 -\paragraph_separation indent -\paragraph_indentation default -\is_math_indent 0 -\math_numbering_side default -\quotes_style english -\dynamic_quotes 0 -\papercolumns 1 -\papersides 1 -\paperpagestyle default -\tablestyle default -\tracking_changes false -\output_changes 
false -\change_bars false -\postpone_fragile_content false -\html_math_output 0 -\html_css_as_file 0 -\html_be_strict false -\docbook_table_output 0 -\docbook_mathml_prefix 1 -\end_header - -\begin_body - -\begin_layout Standard -\begin_inset Note Note -status open - -\begin_layout Plain Layout -TODO 1, - 3, - 4, - 8, - 9, - 10, - 11 (2 pp., - 1:11) -\end_layout - -\end_inset - - -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -exerc1[00] -\end_layout - -\end_inset - -What line of the chi-square table should be used to check whether or not the value -\begin_inset Formula $V=7\frac{7}{48}$ -\end_inset - - of Eq. - (5) is improbably high? -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -answer -\end_layout - -\end_inset - -Row -\begin_inset Formula $\nu=k-1=11-1=10$ -\end_inset - -. -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -rexerc3[23] -\end_layout - -\end_inset - -Some dice that were loaded as described in the previous exercise were rolled 144 times, - and the following values were observed: -\begin_inset Formula -\[ -\begin{array}{rrrrrrrrrrrr} -\text{value of }s= & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12\\ -\text{observed number, }Y_{s}= & 2 & 6 & 10 & 16 & 18 & 32 & 20 & 13 & 16 & 9 & 2 -\end{array} -\] - -\end_inset - -Apply the chi-square test to -\emph on -these -\emph default - values, - using the probabilities in (1), - pretending that the dice are not in fact known to be faulty. - Does the chi-square test detect the bad dice? - If not, - explain why not. 
-\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -answer -\end_layout - -\end_inset - -We take the values -\begin_inset Formula $np_{s}$ -\end_inset - - from (2), - to get: -\begin_inset Formula -\begin{align*} -V & =\sum_{s=2}^{12}\frac{(Y_{s}-np_{s})^{2}}{np_{s}}=\frac{4}{4}+\frac{4}{8}+\frac{4}{12}+\frac{0}{16}+\frac{4}{20}+\frac{64}{24}+\frac{0}{20}+\frac{9}{16}+\frac{16}{12}+\frac{1}{8}+\frac{4}{4}\\ - & =1+\frac{1}{2}+\frac{1}{3}+\frac{1}{5}+\frac{8}{3}+\frac{9}{16}+\frac{4}{3}+\frac{1}{8}+1=2+\frac{13}{3}+\frac{19}{16}+\frac{1}{5}\\ - & =7+\frac{80+45+48}{240}=7+\frac{173}{240}. -\end{align*} - -\end_inset - -Using -\begin_inset Formula $\nu=10$ -\end_inset - - we get a probability between -\begin_inset Formula $.25$ -\end_inset - - and -\begin_inset Formula $.5$ -\end_inset - -, - which is not suspect. - This seems to be because the bias of one die compensates for that of the other, - smoothing out the probability differences. - The difference could be discovered with a large enough value of -\begin_inset Formula $n$ -\end_inset - -. -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -rexerc4[23] -\end_layout - -\end_inset - -The author actually obtained the data in experiment 1 of (9) by simulating dice in which one was normal, - the other was loaded so that it always turned up 1 or 6. - (The latter two possibilities were equally probable.) Compute the probabilities that replace (1) in this case, - and by using a chi-square test decide if the results of that experiment are consistent with the dice being loaded in this way. 
-\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -answer -\end_layout - -\end_inset - -We compute the table with the sum of the two dice: -\begin_inset Formula -\[ -\begin{array}{r|rrrrrr} - & 1 & 2 & 3 & 4 & 5 & 6\\ -\hline 1 & 2 & 3 & 4 & 5 & 6 & 7\\ -6 & 7 & 8 & 9 & 10 & 11 & 12 -\end{array} -\] - -\end_inset - -This gives us the following table of probabilities: -\begin_inset Formula -\[ -\begin{array}{rrrrrrrrrrrr} -s= & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12\\ -144p_{s}= & 12 & 12 & 12 & 12 & 12 & 24 & 12 & 12 & 12 & 12 & 12 -\end{array} -\] - -\end_inset - -Thus, -\begin_inset Formula -\begin{align*} -V & =\frac{1}{12}\left(8^{2}+2^{2}+2^{2}+1^{2}+8^{2}+\frac{6^{2}}{2}+6^{2}+1^{2}+1^{2}+2^{2}+1^{2}\right)\\ - & =\frac{1}{12}(64+4+4+1+64+18+36+1+1+4+1)=\frac{198}{12}=16+\frac{1}{2}. -\end{align*} - -\end_inset - -With -\begin_inset Formula $\nu=10$ -\end_inset - -, - this falls between the table entries for -\begin_inset Formula $p=.75$ -\end_inset - - and -\begin_inset Formula $p=.95$ -\end_inset - -, - which is consistent. -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -exerc8[00] -\end_layout - -\end_inset - -The text describes an experiment in which 20 values of the statistic -\begin_inset Formula $K_{10}^{+}$ -\end_inset - - were obtained in the study of a random sequence. - These values were plotted, - to obtain Fig. - 4, - and a KS statistic was computed from the resulting graph. - Why were the table entries for -\begin_inset Formula $n=20$ -\end_inset - - used to study the resulting statistic, - instead of the table entries for -\begin_inset Formula $n=10$ -\end_inset - -? 
-\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -answer -\end_layout - -\end_inset - -Because the value of -\begin_inset Formula $n$ -\end_inset - - to use is determined not by the underlying probability distribution (which can be an arbitrary real-valued one, - not just -\begin_inset Formula $K_{n}^{+}$ -\end_inset - - or -\begin_inset Formula $K_{n}^{-}$ -\end_inset - -), - but by the number of observations we make of this distribution, - which is 20. -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -rexerc9[20] -\end_layout - -\end_inset - -The experiment described in the text consisted of plotting 20 values of -\begin_inset Formula $K_{10}^{+}$ -\end_inset - -, - computed from the maximum-of-5 test applied to different parts of a random sequence. - We could have computed also the corresponding 20 values of -\begin_inset Formula $K_{10}^{-}$ -\end_inset - -; - since -\begin_inset Formula $K_{10}^{-}$ -\end_inset - - has the same distribution as -\begin_inset Formula $K_{10}^{+}$ -\end_inset - -, - we could lump together the 40 values thus obtained (that is, - 20 of the -\begin_inset Formula $K_{10}^{+}$ -\end_inset - -'s and 20 of the -\begin_inset Formula $K_{10}^{-}$ -\end_inset - -'s), - and a KS test could be applied so that we would get new values -\begin_inset Formula $K_{40}^{+}$ -\end_inset - -, - -\begin_inset Formula $K_{40}^{-}$ -\end_inset - -. - Discuss the merits of this idea. -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -answer -\end_layout - -\end_inset - -The issue here is that the 40 points would not be independent; - the two statistics in each pair are computed from the same 10 observations, - so a high value of one tends to go with a low value of the other, - whereas the KS test requires independent observations. 
-\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -rexerc10[20] -\end_layout - -\end_inset - -Suppose a chi-square test is done by making -\begin_inset Formula $n$ -\end_inset - - observations, - and the value -\begin_inset Formula $V$ -\end_inset - - is obtained. - Now we repeat the test on these same -\begin_inset Formula $n$ -\end_inset - - observations over again (getting, - of course, - the same results), - and we put together the data from both tests, - regarding it as a single chi-square test with -\begin_inset Formula $2n$ -\end_inset - - observations. - (This procedure violates the text's stipulation that all of the observations must be independent of one another.) How is the second value of -\begin_inset Formula $V$ -\end_inset - - related to the first one? -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -answer -\end_layout - -\end_inset - -Let -\begin_inset Formula $Y'_{s}=2Y_{s}$ -\end_inset - - be the number of observations of category -\begin_inset Formula $s$ -\end_inset - - in the combined test. - Then the second value of -\begin_inset Formula $V$ -\end_inset - - is -\begin_inset Formula -\[ -V'=\sum_{s=1}^{k}\frac{(Y'_{s}-2np_{s})^{2}}{2np_{s}}=\sum_{s=1}^{k}\frac{(2Y_{s}-2np_{s})^{2}}{2np_{s}}=\sum_{s=1}^{k}\frac{4(Y_{s}-np_{s})^{2}}{2np_{s}}=2V. -\] - -\end_inset - - -\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -exerc11[10] -\end_layout - -\end_inset - -Solve exercise 10 substituting the KS test for the chi-square test. 
-\end_layout - -\begin_layout Standard -\begin_inset ERT -status open - -\begin_layout Plain Layout - - -\backslash -answer -\end_layout - -\end_inset - -This time, - after sorting the -\begin_inset Formula $2n$ -\end_inset - - observations -\begin_inset Formula $X'_{1},\dots,X'_{2n}$ -\end_inset - -, - we have -\begin_inset Formula $X_{j}=X'_{2j-1}=X'_{2j}$ -\end_inset - -, - so -\begin_inset Formula -\[ -K_{2n}^{+}=\sqrt{2n}\max_{1\leq j\leq2n}\left(\frac{j}{2n}-F(X'_{j})\right)=\sqrt{2n}\max_{1\leq j\leq n}\left(\frac{2j}{2n}-F(X_{j})\right)=\sqrt{2}K_{n}^{+}, -\] - -\end_inset - -and similarly, -\begin_inset Formula -\[ -K_{2n}^{-}=\sqrt{2n}\max_{1\leq j\leq2n}\left(F(X'_{j})-\frac{j-1}{2n}\right)=\sqrt{2n}\max_{1\leq j\leq n}\left(F(X_{j})-\frac{2j-2}{2n}\right)=\sqrt{2}K_{n}^{-}. -\] - -\end_inset - - -\end_layout - -\end_body -\end_document |
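
The chi-square sums in the answers to exercises 3, 4 and 10 of the deleted file can be re-evaluated with exact rational arithmetic. The following is a minimal sketch and not part of the commit: the helper name `chi_square` and the variable names are hypothetical, the exercise 3 counts are copied from the arrays in the diff, and for exercise 4 only the absolute deviations that appear in the first line of that answer's sum are used, since the raw counts of experiment 1 are not reproduced in the file.

```python
from fractions import Fraction


def chi_square(observed, expected):
    """Return V = sum((Y_s - n*p_s)^2 / (n*p_s)) as an exact fraction."""
    return sum(Fraction((y - e) ** 2, e) for y, e in zip(observed, expected))


# Exercise 3: observed counts Y_s and expected counts n*p_s for s = 2..12, n = 144.
observed_3 = [2, 6, 10, 16, 18, 32, 20, 13, 16, 9, 2]
expected_3 = [4, 8, 12, 16, 20, 24, 20, 16, 12, 8, 4]
v3 = chi_square(observed_3, expected_3)

# Exercise 4: absolute deviations |Y_s - n*p_s| as listed in the answer,
# with expected counts 12 for every sum except s = 7, which has 24.
deviations_4 = [8, 2, 2, 1, 8, 6, 6, 1, 1, 2, 1]
expected_4 = [12] * 5 + [24] + [12] * 5
v4 = sum(Fraction(d * d, e) for d, e in zip(deviations_4, expected_4))

# Exercise 10: counting every observation twice doubles both Y_s and n*p_s.
v3_doubled = chi_square([2 * y for y in observed_3], [2 * e for e in expected_3])

print("exercise 3: V =", v3, "=", float(v3))
print("exercise 4: V =", v4, "=", float(v4))
print("exercise 10: V' / V =", v3_doubled / v3)
```

Using `Fraction` rather than floating point keeps the result directly comparable with the hand-computed mixed fractions in the answers.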
