Chapter 8

Distribution of sample statistics and confidence intervals

This chapter is devoted to the distribution of sample statistics (mainly of the sample mean and of an observed proportion), which are themselves random quantities when one considers their variation across different random samples. You will learn how the variability of the sample statistics, and hence the precision of statements about the population, depends on the sample size and on the variation of the underlying variable. The idea of describing the accuracy of a sample estimate with an interval leads to the central statistical concept of the "confidence interval".

Educational objectives

You know the essential properties of sample means and other sample statistics. You are familiar with the term "standard error" and you are able to explain it in your own words. You know the law by which the standard error decreases with increasing sample size and how the distribution of the means of a quantitative variable across many large random samples from the same population can be described.

You are familiar with the important concept of the "\(95\%\)-confidence interval" and can explain it in your own words. You are able to calculate \(95\%\)-confidence intervals yourself (which are valid for large sample sizes) based on sample statistics and their standard errors. You know how to construct exact confidence intervals (i.e. which are also valid for small sample sizes) for proportions (relative frequencies).

Key words: sampling distribution, standard error, square root of n law, approximate normal distribution, confidence interval

Previous knowledge: population (chap. 1), sample statistics (chap. 3/4), normal distribution (chap. 6), binomial distribution (chap. 7)

Central questions: How can we make inferences about a population parameter based on the respective statistic from a random sample (e.g. the sample mean of the body height of men or the observed proportion of persons with hay fever) and make precise statements despite the randomness of the sample?

Figure 8.1. Scheme of chapter 8

8.1. Introduction

In statistics, the goal is not only to describe the data of a survey or trial in a clear and comprehensible way (descriptive statistics), but also to draw conclusions from the collected data about the underlying population. This is the role of so-called "inferential statistics". One of its main tasks is to provide valid statements on population parameters based on the corresponding sample statistics. As we already know from chapter 4, this is only possible if the data come from a random sample.

If we make inferences about a population parameter based on a sample statistic, we have to keep in mind that these two values do not agree in general. If we take the true population parameter as a benchmark, each of its estimates from a random sample is affected by a random error. This random error is defined as the difference \[ \text{random error} = \text{observed statistic} - \text{population parameter} \] It is thus important to know the distribution of these random errors across different random samples of the same size from the given population. This distribution has exactly the same shape as the distribution of the statistic itself, which is referred to as the "sampling distribution" of the statistic. We could identify the sampling distribution of a statistic by drawing a large number of random samples of the same size from the given population and calculating the statistic in each of them. In practice, however, this is hardly feasible. Therefore we make use of probability theory or of artificially generated (so-called simulated) random samples from a virtual population.
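The last idea is easy to try out yourself. The following minimal Python sketch (the virtual population of body heights and all numeric values in it are illustrative assumptions, not data from this chapter) approximates the sampling distribution of the mean by simulation:

```python
# Minimal sketch: approximate the sampling distribution of the mean by drawing
# many random samples of the same size from a virtual population.
# All population values here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(seed=1)
population = rng.normal(loc=178.0, scale=7.0, size=1_000_000)  # virtual body heights (cm)
mu = population.mean()                                          # population mean

n, n_samples = 25, 10_000
# Index-based sampling with replacement; for a tiny sampling fraction this is
# practically the same as sampling without replacement.
idx = rng.integers(0, population.size, size=(n_samples, n))
sample_means = population[idx].mean(axis=1)

random_errors = sample_means - mu   # random error = observed statistic - parameter
print(f"mean of the sample means: {sample_means.mean():.2f} (population mean {mu:.2f})")
print(f"SD of the sample means:   {sample_means.std(ddof=1):.3f}")
print(f"mean random error:        {random_errors.mean():+.3f}")
```

The histogram of `sample_means` would be an empirical picture of the sampling distribution; its standard deviation is the quantity studied in the next section.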

8.2. The variability of the sample mean

You may ask yourself why "sample mean" appears in its singular form in this title. This is not a linguistic error, but should emphasise that we are talking about the sample characteristic "mean value" which varies from one sample to another, much the same as body height varies from one person to another. In the case of body height, the observational units are persons, whereas the observational units in the case of the sample mean are different random samples of the same size from a given population. We are thus interested in the distribution of the mean across all possible random samples of a certain size from a given population. In this chapter, the observational units are thus no longer single individuals, but random samples.

With the applet "Square root of n law" you can observe how the variability of the sample mean decreases with increasing sample size.

Figure 8.2. Applet "Law of the square root of n"

Exercise:

Choose \( S = 100 \) (\( \rightarrow \) entry field "number of samples") and start with \( n = 1 \) (\( \rightarrow \) entry field "sample size"). Samples of size 1 are individual observations. Therefore, the resulting figure illustrates the distribution of individual values of the respective variable. Next choose \( n = 4 \), followed by \( n = 8 \), \( n = 16 \), \( n = 32 \) and \( n = 64 \), and observe what happens to the variability of the points (now representing sample means and no longer individual values within a sample).

Question Nr. 8.2.1 How does the standard deviation of the points (now representing sample means) change if you vary the sample size?
The standard deviation decreases with increasing sample size.
The standard deviation decreases to about its half with a doubling of the sample size.
The sample size must be about four times as big, in order to reduce the standard deviation by half.
The standard deviation increases with increasing sample size.
The standard deviation doubles if the sample size is decreased to a fourth.

Synopsis 8.2.1

The means of random samples of the same size n from a given population have the following properties:

a) The means of random samples vary around the population mean \( \mu \).

b) The variability of the sample means decreases with increasing sample size \( n \).

c) If the sample only represents a small fraction of the population, the standard deviation of the sample means is inversely proportional to \(\sqrt{n}\) (square root of \(n\) law).

d) The standard deviation of the sample means, referred to as "standard error" (abbr. SE), satisfies \[ \text{SE} = \sigma / \sqrt{n} \] where \( \sigma \) denotes the standard deviation of the variable in the population and \( n \) the sample size.

Statements b), c) and d) apply to all quantitative variables whose values cannot become arbitrarily large.

The abbreviation SE is derived from the term "standard error". Sometimes the abbreviation SEM (standard error of the mean) is used for the standard error of the sample mean. The denominator of the formula under d) reflects the preceding statement c). Furthermore, the following applies: the bigger the variability of the individual values, the bigger the variability of the sample means calculated from them. It is thus plausible that the standard deviation of the variable itself appears in the numerator. In practice, \( \sigma \) is generally unknown (just like the population mean \( \mu \) itself) and is therefore estimated by the standard deviation \( s \) of the variable in the observed sample.
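Statement d) can also be checked by simulation. The following sketch (the population mean, the standard deviation \( \sigma = 7 \) and the sample sizes are illustrative assumptions) compares the empirical standard deviation of many simulated sample means with \( \sigma/\sqrt{n} \):

```python
# Sketch checking the square root of n law by simulation.
# The population values and the sample sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(seed=2)
sigma, n_samples = 7.0, 20_000

for n in [4, 16, 64, 256]:
    # draw n_samples random samples of size n and compute their means
    means = rng.normal(loc=178.0, scale=sigma, size=(n_samples, n)).mean(axis=1)
    print(f"n = {n:4d}:  simulated SE = {means.std(ddof=1):.3f},  "
          f"sigma/sqrt(n) = {sigma / np.sqrt(n):.3f}")
```

For each quadrupling of \( n \), both columns shrink to about half their previous value, in line with the square root of \( n \) law.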

8.3. Convergence to the normal distribution

In the preceding section you learned how the spread of the sampling distribution of the mean decreases with increasing sample size. In the following section you will learn how the shape of the sampling distribution of the mean changes with increasing sample size. Open the applet "Central limit theorem", which illustrates an important law of probability theory.

Figure 8.3. Applet "Central limit theorem"

Question Nr. 8.3.1 What can you observe?
First choose the initial distribution 1), as described in the instructions of the applet and increase the sample size gradually from 1 to 5, 10, 20, 40, 80.
How does the Q-Q plot change with increasing sample size? What does this mean? Note down your observations.
Repeat these steps for distributions 2) and 3).
How fast does the shape of the Q-Q plot change? Are there differences compared to what you observed in the first step?

We can summarise these observations in the following synopsis.

Synopsis 8.3.1

As long as the observed samples only represent a small fraction of the population, the following statements apply:

a) The sampling distribution of the mean approaches a Gaussian bell curve with increasing sample size.

b) This "convergence" of the sampling distribution of the mean to the normal distribution generally occurs relatively fast (i.e. already with relatively small sample sizes) for variables with an approximately symmetrical distribution and a Q-Q plot that does not have a pronounced S-shape. For variables with a skewed distribution or a Q-Q plot with a pronounced S-shape, larger sample sizes are required.

These statements hold for all quantitative variables which cannot take arbitrarily large values.

Statement a) summarizes one of the central laws of probability theory, which is known under the name "central limit theorem".
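As a small numerical illustration of statement a), the following sketch draws sample means from a markedly skewed distribution (an exponential distribution, chosen here purely for illustration) and tracks how their skewness shrinks toward the value 0 of a normal distribution:

```python
# Sketch of the central limit theorem for a skewed variable: the skewness of
# the sample means approaches 0 (the skewness of a normal distribution) as the
# sample size grows. The exponential distribution is an illustrative choice.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)
for n in [1, 5, 10, 40]:
    means = rng.exponential(scale=1.0, size=(20_000, n)).mean(axis=1)
    print(f"n = {n:3d}:  skewness of sample means = {stats.skew(means):+.2f}")
```

In line with statement b), the convergence would be visibly faster if you replaced the exponential distribution with a symmetric one.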

8.4. The confidence interval

In chapter 6, we saw that \(95\%\) of all observations of a normally distributed variable lie in the range \( \mu \pm 1.96 \times \sigma \), where \( \mu \) and \( \sigma \) denote the mean and the standard deviation, respectively, of the variable in the underlying population. We now know that sample means are approximately normally distributed for sufficiently large sample sizes \( n \) and have a standard deviation of \[ SE = \sigma/\sqrt{n} \,\,\text{(standard error)}.\] Hence, the following holds true for approx. \(95\%\) of all sample means \( \bar{x} \) from random samples of size \( n \): \[ \bar{x} \, \text{lies in the interval} \, [\mu - 1.96 \times SE, \, \mu + 1.96 \times SE ] .\]

If we now substitute the unknown population mean \( \mu \) in the interval above with one of the concrete sample means \( \bar{x} \), we get an interval of equal width \[ [ \bar{x} - 1.96 \times SE, \, \bar{x} + 1.96 \times SE ] ,\] which varies with \( \bar{x} \) from one sample to another. This interval has the important property that it contains the unknown population mean \( \mu \) in \(95\%\) of all samples. In other words: we can be \(95\%\) confident that this interval contains the true population mean \( \mu \). Therefore this interval is called the "\(95\%\)-confidence interval" for the population mean \( \mu \). It may not be immediately evident how this property of the confidence interval follows from the statement above about the distribution of the sample means \( \bar{x} \) around \( \mu \).

Let's thus take a look at the following analogy: we compare the samples whose mean values \( \bar{x} \) lie in the interval \( (\mu - 1.96 \times SE, \, \mu + 1.96 \times SE) \) with the group of all people whose place of domicile (\( \sim \bar{x} \)) lies within 100 km of their place of birth (\( \sim \mu \)). Exactly this group of people can say that their place of birth (\( \sim \mu \)) lies within 100 km of their place of domicile (\( \sim \bar{x} \)), i.e., that it lies within a radius of 100 km around the place of domicile (\( \sim \) confidence interval).

For maths enthusiasts we also give the respective formal derivation: The statement \[ \bar{x} \, \text{lies in} \, [ \mu - 1.96 \times \sigma/\sqrt{n}, \, \mu + 1.96 \times \sigma/\sqrt{n} ] \] is equivalent to \[ \mu - 1.96 \times \sigma/\sqrt{n} \leq \bar{x} \leq \mu + 1.96 \times \sigma/\sqrt{n} .\] This inequality is now transformed as follows: We first subtract \( \mu \) from each of the three parts of the inequality and get \[ -1.96 \times \sigma/\sqrt{n} \leq \bar{x} - \mu \leq + 1.96 \times \sigma/\sqrt{n} .\] Next, we multiply each of the three parts by \( -1 \), which inverts the inequalities: \[ 1.96 \times \sigma/\sqrt{n} \geq \mu - \bar{x} \geq -1.96 \times \sigma/\sqrt{n} .\] Now we add \( \bar{x} \) and obtain \[ \bar{x} + 1.96 \times \sigma/\sqrt{n} \geq \mu \geq \bar{x} - 1.96 \times \sigma/\sqrt{n} .\] Expressing this with \( \leq \)-relations, we finally get \[ \bar{x} - 1.96 \times \sigma/\sqrt{n} \leq \mu \leq \bar{x} + 1.96 \times \sigma/\sqrt{n} .\] This means that the population mean \( \mu \) lies between the two limits \[ \bar{x} - 1.96 \times \sigma/\sqrt{n} \,\, \text{and} \,\, \bar{x} + 1.96 \times \sigma/\sqrt{n} .\] Therefore, the interval \[ [ \bar{x} - 1.96 \times \sigma/\sqrt{n}, \, \bar{x} + 1.96 \times \sigma/\sqrt{n} ] \] covers the population mean \( \mu \) in \(95\%\) of all random samples of size \( n \).

Definition 8.4.1

The interval \( \bar{x} \pm 1.96 \times SE \), which contains the unknown population mean \( \mu \) in \(95\%\) of all random samples, is called the "\(95\%\)-confidence interval" for the population mean \( \mu \).

Synopsis 8.4.1

The (unknown) population mean \( \mu \) lies in the \(95\%\)-confidence interval \[ [ \bar{x} - 1.96 \times \frac{\sigma}{\sqrt{n}}, \, \bar{x} + 1.96 \times \frac{\sigma}{\sqrt{n}} ] \] in approx. \(95\%\) of all random samples, provided that the sample size \( n \) is large enough. For each individual interval we can be \(95\%\) confident that it contains the population mean value \( \mu \). This statement applies to all quantitative variables which cannot take arbitrarily large values.
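This coverage property can also be illustrated with a short simulation sketch (the population values \( \mu = 178 \), \( \sigma = 7 \) and the sample size are illustrative assumptions; for simplicity, \( \sigma \) is treated as known here):

```python
# Sketch of the coverage property: in roughly 95% of simulated random samples,
# the interval x_bar ± 1.96 × SE contains the population mean mu.
# mu, sigma and n are illustrative assumptions; sigma is treated as known.
import numpy as np

rng = np.random.default_rng(seed=4)
mu, sigma, n, n_samples = 178.0, 7.0, 50, 10_000

means = rng.normal(mu, sigma, size=(n_samples, n)).mean(axis=1)
se = sigma / np.sqrt(n)
covered = (means - 1.96 * se <= mu) & (mu <= means + 1.96 * se)
print(f"fraction of intervals containing mu: {covered.mean():.3f}")  # close to 0.95
```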

The applet "Confidence intervals of sample means" illustrates this important property of the confidence interval.

Figure 8.4. Applet "Confidence intervals of sample means"

This applet shows that, on average, approx. \(95\) of \(100\) samples provide a \(95\%\)-confidence interval which contains the true population mean \( \mu \). This value, which is unknown in reality, is represented by a horizontal line. Conversely, \(5\%\) of the \(95\%\)-confidence intervals do not include \( \mu \), on average. The applet also allows other confidence levels to be chosen (namely \( 0.9 \) and \( 0.99 \)). Naturally, the confidence intervals get wider with increasing and narrower with decreasing confidence level. As an alternative to the classical \(95\%\)-confidence interval, the following confidence intervals are also encountered:

    \(90\%\)-confidence interval = sample mean value \( \pm 1.645 \times SE \)
    \(99\%\)-confidence interval = sample mean value \( \pm 2.576 \times SE \)
    \(99.9\%\)-confidence interval = sample mean value \( \pm 3.291 \times SE \)

The factors used with the standard error are the \(95\)th, \(99.5\)th and \(99.95\)th percentiles, respectively, of the standard normal distribution.
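If you want to verify these factors yourself, they can be recomputed as percentiles of the standard normal distribution, e.g. with a short scipy check:

```python
# Recomputing the factors above as percentiles of the standard normal distribution.
from scipy import stats

for level, q in [(0.90, 0.95), (0.95, 0.975), (0.99, 0.995), (0.999, 0.9995)]:
    print(f"{100 * level:5.1f}%-confidence interval: factor = {stats.norm.ppf(q):.3f}")
```

This prints the factors 1.645, 1.960, 2.576 and 3.291 quoted above.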

Question Nr. 8.4.1.1 A random sample of 200 adult men resulted in a mean body height of 176.8 cm and a standard deviation of 7.4 cm. Calculate the confidence interval for a confidence level of 95%.
(175.9 cm, 177.7 cm)
(174.8 cm, 178.8 cm)
(176.3 cm, 177.3 cm)
(175.8 cm, 177.8 cm)
(162.3 cm, 191.3 cm)

Question Nr. 8.4.1.2 A random sample of 200 adult men resulted in a mean body height of 176.8 cm and a standard deviation of 7.4 cm. Calculate the confidence interval for a confidence level of 90%.
(175.9 cm, 177.7 cm)
(164.6 cm, 189.0 cm)
(176.1 cm, 177.5 cm)
(175.8 cm, 177.8 cm)
(172.5 cm, 181.1 cm)

Question Nr. 8.4.1.3 A random sample of 200 adult men resulted in a mean body height of 176.8 cm and a standard deviation of 7.4 cm. Calculate the confidence interval for a confidence level of 99%.
(175.6 cm, 178.0 cm)
(157.7 cm, 195.9 cm)
(175.5 cm, 178.1 cm)
(159.6 cm, 194.0 cm)
(176.4 cm, 177.2 cm)

What has been said for the sample mean also applies, under relatively weak additional conditions, to many other sample statistics (median, percentiles, standard deviation, variance, interquartile range, etc.). In these cases, however, the standard error can seldom be calculated in a simple way. Note also that there are sample statistics for which statements b) to e) of the following synopsis do not hold (e.g. maximum, minimum and range).

Synopsis 8.4.2

In the following the symbol \( \theta \) will be used to denote the population parameter and the symbol \( \hat{\theta} \) to denote the respective sample statistic. The following statements apply:

a) If different random samples are drawn, the statistic \( \hat{\theta} \) varies around the respective population parameter \( \theta \).

b) The standard error SE of many statistics \( \hat{\theta} \) decreases with increasing sample size \( n \).

As long as the observed samples only represent a small fraction of the population, the following additional statements apply under relatively weak conditions:

c) The standard error \( SE \) of many statistics \( \hat{\theta} \) is essentially inversely proportional to the square root of the sample size \( n \).

d) The sampling distribution of many statistics \( \hat{\theta} \) approaches a Gaussian bell curve around the respective population parameter \( \theta \) with increasing sample size \( n \).

Hence the following applies to such sample statistics \( \hat{\theta} \) in sufficiently large samples:

e) The \(95\%\)-confidence interval \[ [\hat{\theta} - 1.96 \times SE, \, \hat{\theta} + 1.96 \times SE] \] contains the population parameter \( \theta \) in approx. \(95\%\) of all random samples.

Important: These statements only hold true for random samples.

Question Nr. 8.4.2 A statistics program provides you with the following information:
• sample median of body height = 176.6 cm
• standard error of the median = 1.75 cm
• n = 25.
Which are the limits of the 95%-confidence interval for the median of body height in the population from which the random sample was drawn?
(174.85 cm, 178.35 cm)
(175.9 cm, 177.3 cm)
(173.1 cm, 180.1 cm)

The confidence interval of the sample mean can be calculated even more precisely

The calculation rule for confidence intervals of a sample mean is only accurate for large sample sizes \( n \). The standard error is generally unknown and must be estimated from the sample data using the formula \( s / \sqrt{n} \), where \( s \) denotes the standard deviation of the variable \( X \) in the sample. As \( s \) also varies from sample to sample, another random error creeps in (in addition to the one of the mean). This random error has a considerable impact in smaller samples, but loses importance with increasing sample size. For samples of size > 120 it becomes negligible. For sample sizes \( \leq 120 \) it is advisable to use a factor larger than 1.96 in the formula of the \(95\%\)-confidence interval. This factor is derived from the so-called "\(t\)-distribution" with \( n-1 \) degrees of freedom, where \( n \) again denotes the sample size. You can memorise that the number of degrees of freedom is equal to the denominator \( n-1 \) of the sample variance. For the \(95\%\)-confidence interval, this factor equals the \(97.5\)th percentile of the respective \(t\)-distribution. The smaller \( n \), the wider the respective \(t\)-distribution and the larger the factor to be used.

The \(95\%\)-confidence interval for the population mean \( \mu \) is then given by \[ \bar{x} \pm t_{0.975,n-1} \times s/\sqrt{n} \] where \( t_{0.975,n-1} \) denotes the \(97.5\)th percentile of the \(t\)-distribution with \( n-1 \) degrees of freedom.

The following table gives the values of \( t_{0.975,n-1} \) for different degrees of freedom.

Table 8.1: Some t-quantiles

df (number of degrees of freedom)    97.5th percentile (\( t_{0.975,df} \))
5                                    2.571
10                                   2.228
20                                   2.086
50                                   2.009
100                                  1.984

For instance, in a random sample of \(6\) men for which the mean value and the standard deviation of the body height are \(178\) cm and \(7.5\) cm, the value \(2.571\) must be used to multiply the standard error. The \(95\%\)-confidence interval of the mean body height in the population from which this sample was drawn can be calculated based on the above formula as follows: \[ 178 \pm 2.571 \times 7.5 / \sqrt{6} = [170.1, 185.9] \]

The confidence intervals in the applet "Confidence intervals of sample means" are calculated based on this formula. Since the standard deviation now also varies from one sample to another, not only the centres but also the widths of the confidence intervals display random fluctuations. This can be clearly seen if the sample size \( n \) is small. Statistics programs also use \(t\)-quantiles to calculate confidence intervals for population means. In the absence of a statistics program, you can use Excel to calculate \(t\)-quantiles via the function T.INV(q; df), where \( q = 0.975 \) for the \(95\%\)-confidence interval, \( q = 0.95 \) for the \(90\%\)-confidence interval and \( q = 0.995 \) for the \(99\%\)-confidence interval.
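As an alternative to the table or Excel, the worked example with \( n = 6 \) can be reproduced in a few lines of Python (a sketch using scipy's \(t\)-distribution):

```python
# Sketch reproducing the worked example above (n = 6, mean 178 cm, SD 7.5 cm)
# with the t-quantile taken from scipy instead of the table or Excel.
import numpy as np
from scipy import stats

x_bar, s, n = 178.0, 7.5, 6
t_factor = stats.t.ppf(0.975, df=n - 1)   # 97.5th percentile, 5 degrees of freedom
half_width = t_factor * s / np.sqrt(n)
print(f"t factor: {t_factor:.3f}")        # about 2.571
print(f"95%-CI:   [{x_bar - half_width:.1f}, {x_bar + half_width:.1f}]")  # about [170.1, 185.9]
```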

8.5. Approximate confidence intervals for probabilities

Dr. Frank N. Stein has read in a medical journal that a new medicine, which was recommended to him by a pharmaceutical representative, caused adverse reactions in \(18\) out of \(200\) patients during a treatment of one month. Based on this result, he estimates the probability of adverse reactions of this medicine during a one month treatment to be \(18/200 = 0.09\) or \(9\%\). He considers this value to be too high but he would consider a probability of adverse reactions of \(5\%\) as acceptable. Because he is aware of the fact that sample results are affected by random errors, he wants to calculate the \(95\%\)-confidence interval for the probability in question. To do so, he needs to know the standard error of the observed proportion of adverse reactions. (In this section, we will mostly use the term "proportion" for a relative frequency). In a statistics textbook he finds the following formula for the standard error of a proportion: \[ SE = \sqrt{\frac{p \ (1-p)} {n}} \] where \( n \) denotes the sample size and \( p \) the observed proportion.

In his case, \( n = 200 \) and \( p = 0.09 \). By inserting these values into the formula he gets \( SE = 0.0202 \). Dr. Frank N. Stein now calculates the \(95\%\)-confidence interval for the unknown probability as \( 0.09 \pm 1.96 \times 0.0202 \). This interval has a lower limit of \(0.0503\) and an upper limit of \(0.1297\). Dr. Stein can thus be \(95\%\) confident that the probability of adverse reactions lies in the interval \( [0.0503, 0.1297] \). If the probability of adverse reactions were in reality only \(5\%\) or even lower, this would represent one of the rare instances (occurring in only \(5\%\) of all cases) in which the \(95\%\)-confidence interval does not include the true population parameter. Consequently, Dr. Stein cannot believe that adverse reactions occur in only \(5\%\) of all patients.

For maths enthusiasts:

You might have noticed the similarity of this formula with the one of the standard deviation of the binomial distribution. An event which occurs with a probability \( \pi \) will occur on average \( n \times \pi \) times in a series of \( n \) observations, with a standard deviation of \( \sqrt{n \times \pi \times (1 - \pi)} \). The observed proportion \( p \) is obtained by dividing the number of events by \( n \). Therefore, the observed proportion will on average be \( \frac{n \times \pi}{n} = \pi \) and the standard deviation of \( p \) will be \[ \frac{ \sqrt{n \times \pi \times (1 - \pi)}} {n} = \sqrt{\frac{n \times \pi \times (1 - \pi)}{n^2} } = \sqrt{\frac{\pi \times (1 - \pi)}{n} } \] As we don't know \( \pi \), we have to replace it by its sample estimate \( p \), which yields the formula for the standard error that Dr. Stein used.

Synopsis 8.5.1

If a specific outcome occurs with a relative frequency \( p \) in a random sample of size \( n \), if the sample only represents a small fraction of the population and if \( n \times p \times (1 - p) \geq 10 \) holds, then

a) the standard error of \( p \) can be estimated by \[ \sqrt{\frac{p \times (1 - p)}{n}} \]

b) the \(95\%\)-confidence interval for the probability \( \pi \) of this outcome in the population can be approximated by \[ [ p - 1.96 \times \sqrt{\frac{p \times (1 - p)}{n}} , \, p + 1.96 \times \sqrt{\frac{p \times (1 - p)}{n}} ] .\]
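As a sketch, Dr. Stein's calculation from the beginning of this section can be reproduced directly from the formulas in this synopsis:

```python
# Sketch reproducing Dr. Frank N. Stein's calculation: 18 adverse reactions
# among n = 200 patients, approximate 95%-confidence interval for pi.
import numpy as np

m, n = 18, 200
p = m / n
se = np.sqrt(p * (1 - p) / n)                    # standard error of the proportion
lower, upper = p - 1.96 * se, p + 1.96 * se
print(f"p = {p:.2f}, SE = {se:.4f}")             # 0.09, 0.0202
print(f"approximate 95%-CI: [{lower:.4f}, {upper:.4f}]")  # about [0.0503, 0.1297]
```

Note that the validity condition is satisfied here: \( n \times p \times (1 - p) = 200 \times 0.09 \times 0.91 \approx 16.4 \geq 10 \).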

8.6. Exact confidence intervals for probabilities

What can be done if the approximate confidence interval of an observed proportion \( p \) is not accurate enough, which is the case if \( n \times p \times (1 - p) < 10 \) holds (cf. chapter 7)?

With the "Applet to the genotype example" you will learn how exact confidence intervals for an unknown probability can be constructed. To this end, we consider the following example:

A certain genotype \( G \), which is known to be associated with an increased risk of a disease \( D \), was found six times in a random sample of \(30\) people from a given population. The observed proportion of this genotype was thus \(0.2\) or \(20\%\).

Question Nr. 8.6.1 A certain genotype G, which is associated with an elevated risk of disease D, was found six times in a random sample of 30 persons from population P. Thus, the sample frequency of this genotype was 0.2 or 20%.
What is the proportion π of this genotype in the underlying population P? (Is it higher or lower than 0.2, or exactly 20%?)
The proportion is likely to be higher than 20%.
This question cannot be answered.
The relative frequency of the genotype in population P is exactly 20%.
The proportion is likely to be lower than 20%.

With the applet "Applet to the genotype example" you can find an exact \(95\%\)-confidence interval for \( \pi \).

Figure 8.5. Applet to the genotype example

First examine what happens if you assume a value of \(0.4\) for \( \pi \).

Question Nr. 8.6.2 Let us now assume that the proportion π of carriers of the genotype in the underlying population P equals 0.4.
How many carriers of G would we then expect on average in a random sample of 30 persons from population P?
exactly 12 carriers.
about 12 carriers.
There is not enough information for answering this question.
9 carriers.

Question Nr. 8.6.3 Assume again that the true proportion π of carriers of the genotype G in the underlying population P equals 0.4.
With which probability would we then expect 6 or less carriers of G in the sample?
with a probability of 0.983.
with a probability of 0.83
with a probability of 0.17
with a probability of 0.017.

Now suppose that \( \pi \) equals \(0.1\).

Question Nr. 8.6.4 Let us now assume that the proportion π of carriers of the genotype in the underlying population P equals 0.1.
How many carriers of G would we then expect on average in a random sample of 30 persons from population P?
about 3.
exactly 3.5.
exactly 3.
about 3.5.

Question Nr. 8.6.5 Let us still assume that the proportion π of carriers of the genotype in the underlying population P equals 0.1.
With which probability would we expect 6 or more carriers of G in the sample under this hypothesis?
0.091.
0.073.
0.927
0.909.

Question Nr. 8.6.6 Now try to construct the 95%-confidence interval for the true value of π.
Note: Each of the limits of the interval should be chosen in such a way that, if π is hypothesized to coincide with the respective limit, the observed or an even less likely frequency "on the side of the observed frequency" would occur with a probability ≤ 2.5%.
The limits of the interval are 0.091 and 0.357.
The limits of the interval are 0.057 and 0.343.
The limits of the interval are 0.077 and 0.385.
The limits are 0.1 and 0.4.

Next you can use the applet "Exact and approximate confidence intervals of proportions" to see how the number \( n \) of trials (or observational units) and the number \( m \) of observed events (or outcomes of interest) influence the shape of the exact \(95\%\)-confidence interval for the unknown probability \( \pi \) of the event (or outcome of interest) in the underlying population.

Figure 8.6. Applet "Exact binomial confidence intervals"
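For readers who prefer code to the applet, the construction rule stated in question 8.6.6 (often called the "Clopper-Pearson" construction) can be carried out as a numerical root search on the two binomial tail probabilities. The following sketch uses the genotype example with \( m = 6 \) carriers in \( n = 30 \) persons:

```python
# Sketch of the exact confidence interval construction from question 8.6.6:
# each limit is the value of pi at which the observed count, or a more extreme
# one, has probability alpha/2 = 2.5%. Genotype example: m = 6, n = 30.
from scipy import stats
from scipy.optimize import brentq

m, n, alpha = 6, 30, 0.05

# lower limit: P(X >= m | pi) = alpha/2
lower = brentq(lambda pi: (1 - stats.binom.cdf(m - 1, n, pi)) - alpha / 2,
               1e-9, 1 - 1e-9)
# upper limit: P(X <= m | pi) = alpha/2
upper = brentq(lambda pi: stats.binom.cdf(m, n, pi) - alpha / 2,
               1e-9, 1 - 1e-9)

print(f"exact 95%-CI for pi: [{lower:.3f}, {upper:.3f}]")  # roughly [0.077, 0.386]
```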

Question Nr. 8.6.7 In a first stage, leave the value p of the observed frequency at 0.2.
Insert different values for n and observe how the approximate and the exact interval and the difference between the two intervals change with increasing n (but do not exceed 200, because the calculation time may become too long).
What do you observe?
The approximate confidence interval (black) gets closer and closer to the exact one (red) as the sample size n increases.
The exact confidence interval contains the approximate one.
Both confidence intervals get shorter as the sample size n increases.
As compared to the approximate confidence interval, the exact one is slightly shifted toward the center.

In a second step, you can also vary the proportion of observed events. In this case it is best to fix the number of trials \( n \) and to vary the number \( m \) of observed events. Ideally, \( n \) should be at least \( 40 \), since otherwise \( n \times \frac{m}{n} \times (1 - \frac{m}{n}) \) would be smaller than \( 10 \) for all possible choices of \( m \).

Question Nr. 8.6.8 How do the shape and the location of the exact 95%-confidence interval change for a fixed sample size n, if you vary the observed relative frequency p? And what happens to the two intervals if you choose p close to 0 or 1?
The confidence intervals get larger if p approaches 0.5.
The confidence interval gets wider as p increases.
The agreement between the approximate and the exact confidence interval gets better and better as p approaches 0.5.
As p approaches 0.5, the approximate confidence interval becomes more and more symmetrical, too.

8.7. Summary

In this chapter, the observational units were no longer individual persons or objects, but random samples of persons or objects. The measures used to describe the individual samples are called sample statistics. They are used to estimate the respective unknown population parameters.

Due to chance, the sample statistics change from one sample to another. The so-called standard error (SE) is used as a measure of this variation. The size of the standard error depends on the sample size: the larger the sample size, the smaller the standard error. In fact, the standard error of a sample mean or a proportion is inversely proportional to the square root of the sample size, and this essentially also holds for many other sample statistics.

The sample size also influences the distribution of the sample statistics (i.e., the sampling distribution). For many statistics, the sampling distribution approaches a Gaussian bell curve (convergence to the normal distribution), as the sample size increases. This leads to the important concept of the confidence interval. A confidence interval for a specific parameter of the population is constructed around the respective sample statistic. Its limits are defined in such a way that they include the unknown population parameter in a pre-defined percentage of samples, usually \(95\%\). For many sample statistics, the \(95\%\)-confidence interval is computed according to the formula "\(\text{sample statistic} \pm 1.96 \times SE \)".

Figure 8.7. Scheme of chapter 8 with gaps