A point estimate is a single value given as the estimate of a population parameter of interest, for example, the mean of some quantity. An interval estimate instead specifies a range within which the parameter is estimated to lie. Confidence intervals are commonly reported in tables or graphs along with point estimates of the same parameters, to show the reliability of the estimates.
Suppose we want to estimate an actual population mean μ. As you know, we can only obtain \( \overline{x} , \) the mean of a sample randomly selected from the population of interest. We can use \( \overline{x} \) to find a range of values:
\[ \mbox{Lower value} < \mbox{population mean } \mu < \mbox{Upper value} \] that we can be really confident contains the population mean μ. This range of values is called a confidence interval. The general form of most confidence intervals is \[ \mbox{Sample estimate} \pm \mbox{margin of error} . \] That is, \[ \mbox{the lower limit } L \mbox{ of the interval } = \mbox{estimate} - \mbox{margin of error} , \] and \[ \mbox{the upper limit } U \mbox{ of the interval } = \mbox{estimate} + \mbox{margin of error} . \] Once we have obtained the interval, we can claim to be really confident that the value of the population parameter lies somewhere between the value of L and the value of U. The number we add to and subtract from the point estimate is called the margin of error. The question arises: what number should we subtract from and add to a point estimate to obtain an interval estimate? The answer depends on two considerations. First, the margin of error depends on how variable the point estimate is, that is, on the standard deviation of its sampling distribution: the larger that standard deviation, the larger the quantity must be.
Second, the quantity subtracted and added must be larger if we want to have higher confidence in our interval. It is customary to attach a probabilistic statement to the interval estimate; this probabilistic statement is given by the confidence level. An interval constructed based on this confidence level is called a confidence interval. The confidence interval is given as
\[ \mbox{point estimate } \pm \mbox{margin of error} . \] The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter. The confidence level is denoted by \( (1- \alpha )\,100\% , \) where α is the Greek letter alpha. When expressed as a probability, it is called the confidence coefficient and is denoted by 1 − α. The quantity α is called the significance level. More generally and more precisely, we can say that for 100(1−α)% of all samples of size n, the following interval contains the population mean μ:
\[ \left[ \overline{x} - z_{\alpha /2} \cdot \frac{\sigma}{\sqrt{n}} , \ \overline{x} + z_{\alpha /2} \cdot \frac{\sigma}{\sqrt{n}} \right] , \] where \( z_{\alpha /2} \) is the value of the standard normal distribution that cuts off an upper-tail area of α/2, that is, \[ \frac{1}{\sqrt{2\pi}} \, \int_{z_{\alpha /2}}^{\infty} e^{-t^2 /2} \,{\text d}t = \frac{\alpha}{2} . \] R has a special command, qnorm(), to calculate z-values. Note that we assume that the standard deviation σ of the total population is known. The above z-interval procedure works reasonably well even when the variable is not normally distributed and the sample size is small or moderate, provided the variable is not too far from being normally distributed. Thus, we say that the z-interval procedure is robust to moderate violations of the normality assumption.
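For example, qnorm() returns quantiles of the standard normal distribution, so the two-sided critical value for a given α is obtained as follows (shown here for a 95% confidence level, i.e., α = 0.05):

```r
alpha <- 0.05               # significance level for a 95% confidence level
z <- qnorm(1 - alpha / 2)   # upper-tail critical value z_{alpha/2}
z                           # 1.959964
```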
Example: Consider the weights of hockey players in the NHL during the 2017–2018 season, which have mean 173.5 lbs with standard deviation 13.39 lbs (according to the official NHL web data). Now we take a sample of five players from the Washington Capitals:
Player | Weight |
---|---|
Alexander Ovechkin | 236 |
Nicklas Backstrom | 214 |
Jay Beagle | 216 |
Brooks Orpik | 220 |
Dmitry Orlov | 209 |
Therefore, this sample has mean 219 with standard deviation 10.29563. We know that the sample mean \( \overline{x} = 219 \) and the sample variance s2 are unbiased estimators of the population mean μ = 173.5 and the population variance \( \sigma^2 = 13.39^2 \approx 179.2921 , \) respectively. However, the sample standard deviation s is a biased estimator of the corresponding population parameter (in our case, the standard deviation σ of the population).
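These summary statistics are easy to reproduce in R from the weights listed in the table above:

```r
capitals <- c(236, 214, 216, 220, 209)  # Washington Capitals sample
mean(capitals)   # 219
sd(capitals)     # 10.29563
```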
Now we take another sample from the Boston Bruins:
Player | Weight |
---|---|
Brad Marchand | 181 |
Patrice Bergeron | 195 |
David Pastrňák | 188 |
Torey Krug | 186 |
Brandon Carlo | 208 |
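As a sketch of the z-interval formula above, we can compute a 95% confidence interval for this second sample by hand, using the known population standard deviation σ = 13.39 (the interval bounds below are computed here for illustration, not quoted from the NHL data):

```r
bruins <- c(181, 195, 188, 186, 208)  # Boston Bruins sample
sigma  <- 13.39                       # known population standard deviation
n      <- length(bruins)
z      <- qnorm(0.975)                # critical value for 95% confidence
xbar   <- mean(bruins)                # sample mean: 191.6
moe    <- z * sigma / sqrt(n)         # margin of error
c(lower = xbar - moe, upper = xbar + moe)  # approximately (179.86, 203.34)
```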
You can find the confidence interval using R. However, you first need to install a few packages (the last one will be used for proportions).
```r
install.packages("Rmisc", lib = "/data/Rpackages/")
install.packages("lattice", lib = "/data/Rpackages/")
install.packages("plyr", lib = "/data/Rpackages/")
install.packages("PropCIs", lib = "/data/Rpackages/")
```
```r
lizard = c(6.2, 6.6, 7.1, 7.4, 7.6, 7.9, 8, 8.3, 8.4, 8.5, 8.6,
           8.8, 8.8, 9.1, 9.2, 9.4, 9.4, 9.7, 9.9, 10.2, 10.4, 10.8,
           11.3, 11.9)
```

If we use the t.test command listing only the data name, we get a 95% confidence interval for the mean along with the significance test. The simulation below draws 100 samples of size 24 from a normal distribution with mean 9, computes a 95% confidence interval from each, counts how many of the intervals cover the true mean, and plots all the intervals:

```r
n.draw = 100                    # number of simulated samples
mu = 9                          # true population mean
n = 24                          # sample size
SD = sd(lizard)                 # use the lizard data's standard deviation
draws = matrix(rnorm(n.draw * n, mu, SD), n)
get.conf.int = function(x) t.test(x)$conf.int
conf.int = apply(draws, 2, get.conf.int)
# how many of the 100 intervals contain the true mean?
sum(conf.int[1, ] <= mu & conf.int[2, ] >= mu)
plot(range(conf.int), c(0, 1 + n.draw), type = "n",
     xlab = "mean tail length", ylab = "sample run")
for (i in 1:n.draw) lines(conf.int[, i], rep(i, 2), lwd = 2)
abline(v = 9, lwd = 2, lty = 2)
```
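For the lizard data itself, listing only the data name in t.test() prints the test output together with the 95% confidence interval for the mean, which can also be extracted directly from the result:

```r
lizard <- c(6.2, 6.6, 7.1, 7.4, 7.6, 7.9, 8, 8.3, 8.4, 8.5, 8.6,
            8.8, 8.8, 9.1, 9.2, 9.4, 9.4, 9.7, 9.9, 10.2, 10.4, 10.8,
            11.3, 11.9)
ci <- t.test(lizard)$conf.int   # 95% confidence interval for the mean
ci
```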