Confidence Intervals and Sample Size

Introduction

Sampling is the essence of statistics. Normally a researcher cannot get access to an entire population and must make do with a sample, even if a sample is not exactly the same as the population. There is going to be some difference between the results of a sample and the truth, and if another sample is taken, the results would be different again. Ideally, the results of the sample would be relatively close to the truth. Remember that the real objective in statistics can best be summarized in the definition of inferential statistics – using a sample to make inferences about a population.

Confidence Intervals

When statisticians make inferences, they are estimating. The value of a population mean is rarely known, but the sample mean usually is. Because the sample mean by itself is unlikely to match the population mean exactly and may be too large or too small, an interval is used as the estimate. The confidence interval is the sample mean plus or minus a margin of error. The margin of error is the z-score (determined by the level of confidence) times the standard deviation divided by the square root of the sample size. The confidence interval says, in effect, that we are X% confident that the true population mean lies within the interval.

The z-score corresponds to the standard normal distribution. A 95% confidence interval captures the middle 95% of the normal curve, so the z-score is 1.96 since the area between -1.96 and +1.96 equals 95%. A 90% confidence interval captures a smaller area and has a z-score of 1.645. A 99% confidence interval captures a larger area and has a z-score of 2.576. Thus, the larger the level of confidence, the larger the margin of error.
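The z-scores quoted above can be reproduced without a printed table. The following is a minimal sketch using Python's standard library; the function name z_critical is chosen here for illustration and is not part of the original discussion.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical z-score for a confidence level such as 0.95."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # The middle `confidence` of the curve leaves half of the remainder in each tail.
    tail_area = (1 - confidence) / 2
    # inv_cdf returns the z-value with the given area to its left.
    return NormalDist().inv_cdf(1 - tail_area)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_critical(level):.3f}")
```

Run as written, this should print z-scores of 1.645, 1.960 and 2.576, matching the values given in the text.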

The rest of the confidence interval formula shows that the larger the sample size, the smaller the margin of error. Additionally, the larger the standard deviation, the larger the margin of error. The standard deviation cannot be controlled (it is what it is), but the level of confidence and the sample size are choices the researcher makes.
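To see how the sample size drives the margin of error, here is a small sketch of the z-based interval for a mean. The numbers (a sample mean of 100 and a standard deviation of 15) and the function name are invented for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (low, high, margin) for a mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error: z-score times standard deviation over root of sample size.
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


# Hypothetical data: sample mean 100, standard deviation 15.
for n in (25, 100, 400):
    low, high, margin = mean_confidence_interval(100, 15, n)
    print(f"n = {n:3d}: {low:.2f} to {high:.2f} (margin {margin:.2f})")
```

Because the sample size sits under a square root, quadrupling the sample only cuts the margin of error in half: roughly 5.88, 2.94 and 1.47 for the three sample sizes above.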

To help illustrate confidence intervals, imagine throwing a dart at the bull’s-eye on a dartboard from 12 feet away. If only one throw is available to hit the target, the confidence in that happening would be low. If the target were expanded to include the ring around the bull’s-eye, confidence in hitting the target would grow. If the target were expanded further to include the entire dartboard, confidence would grow more. Essentially, the larger the margin of error, the greater the confidence of hitting the target.

Now aim for the ring around the bull’s-eye again and assume several practice throws. If the target is hit 8 out of 10 times, the next dart could be expected to land within a relatively small area. The expected target is probably much smaller than it was prior to the first throw. Essentially, the larger the sample size, the smaller the margin of error.

There are two formulas for computing the confidence interval of the mean. The only difference between the two has to do with one's knowledge of the population standard deviation. If it is known, the formula uses the z-table; if it is unknown, the formula uses the sample standard deviation and a t-table. In Excel, the table lookups are handled behind the scenes, but there are two different menu items for the two situations.
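Written out in standard notation, the two formulas look like this. The symbols are the conventional ones and are assumptions here, since the text does not name them: x̄ is the sample mean, σ the population standard deviation, s the sample standard deviation, n the sample size, and α equals one minus the level of confidence.

```latex
% Population standard deviation known: z-based interval
\bar{x} \pm z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}

% Population standard deviation unknown: t-based interval with n - 1 degrees of freedom
\bar{x} \pm t_{\alpha/2,\, n-1} \, \frac{s}{\sqrt{n}}
```

The two have the same shape; only the critical value and the source of the standard deviation change.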

Confidence intervals can also be computed for proportions. They are essentially the same in that there is still a confidence level (translated into a z-score) and a sample size, but no separate standard deviation is needed because the variability is computed from the proportion itself. An election poll that reports a margin of error is using a confidence interval. For example, if one candidate is shown to have 52% of the vote with a 3-point margin of error, the estimate is that the candidate's true share is between 49% and 55% of the vote. The 52% is based on a sample, but the confidence interval implies that if the entire population were polled, the result would be in that range. The certainty that the truth is in that range is the level of confidence. Notice how often the newspapers report the margin of error but not the level of confidence. A range of 49% to 55% means that the candidate might win or might lose, and the race should be labeled as undecided. Drawing a conclusion because the estimate is close is a mistake. If the confidence interval were 51% to 55%, then it could be stated that the candidate is winning, at X% confidence.
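The poll example can be sketched in a few lines. The sample size of 1,000 and the 95% level of confidence are assumptions made for this illustration, since the example in the text states neither; the function name is also invented here.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (low, high) for a sample proportion p_hat from n responses."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # For proportions the spread comes from p_hat itself: sqrt(p(1 - p) / n).
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Assumed for illustration: 1,000 respondents, 95% confidence.
low, high = proportion_confidence_interval(0.52, 1000)
print(f"Estimated share: {low:.1%} to {high:.1%}")

# If 50% falls inside the interval, the poll cannot call the race.
print("Undecided" if low <= 0.5 <= high else "Leader identified")
```

Under these assumptions the margin of error comes out at about 3 points, so the interval straddles 50% and the race is reported as undecided.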

Sample Sizes

Probably the most common question asked of a statistician is "how large a sample do I need to take…" To understand the answer, it is important to look back a bit. When data are normally distributed, probabilities can be computed from the mean and standard deviation alone, and life is good. However, what if the population mean is unknown? In that case, the best estimate of a population mean is a sample mean. So take a sample and compute the mean to produce an estimate. How good is that estimate? That is what margins of error are for, based on the level of confidence. But what if the confidence interval is computed and it is just too wide (after all, what good is it to estimate that the presidential candidate has 52% of the vote with a margin of error of 20%)? If the confidence interval is too wide, a larger sample can be taken to tighten it, but how much is enough? It would be nice to know exactly how large a sample is needed to get the confidence interval desired. Thanks to the sample size formula, that is possible.

To compute the minimum sample size, two questions need to be answered: how much error will be tolerated, and how much confidence is wanted that the truth is within that margin of error. The more confidence is wanted, the larger the sample must be. The greater the margin of error tolerated, the smaller the needed sample. Normally the answer given by those not in the know is that they want 0% error and 100% confidence, but that would require sampling the entire population. Anything less than the population involves incurring a degree of error, so the question is "how much is acceptable?" Once a margin of error and confidence level are decided on, the sample size can be computed. But what if this sample is 2,000 people? In a phone survey, up to 20,000 people may have to be called just to get 2,000 responses, but if the sample size is cut, either the confidence drops or the margin of error increases. It is truly a balancing act.
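The trade-off can be made concrete by rearranging the margin-of-error formula to solve for the sample size. The following is a minimal sketch with invented function names. For proportions it assumes p = 0.5 when no prior estimate is available, which is the most conservative choice because it gives the largest required sample.

```python
from math import ceil
from statistics import NormalDist


def _z(confidence):
    """Two-sided critical z-score for the given level of confidence."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Minimum n so that z * sigma / sqrt(n) does not exceed margin."""
    return ceil((_z(confidence) * sigma / margin) ** 2)


def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Minimum n for a proportion; p = 0.5 is the worst case."""
    return ceil(_z(confidence) ** 2 * p * (1 - p) / margin ** 2)


# Always round up: a fractional respondent cannot be surveyed.
print(sample_size_for_proportion(0.05))        # 5-point margin, 95% confidence
print(sample_size_for_proportion(0.03))        # 3-point margin, 95% confidence
print(sample_size_for_proportion(0.03, 0.99))  # 3-point margin, 99% confidence
```

These calls should give 385, 1,068 and 1,844 respondents, which shows both effects described above: tightening the margin or raising the confidence drives the sample size up quickly.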

Here is another interesting sampling misconception. Would a sample of 400 people from a neighborhood be a good sample? It might, but what about 400 people from a city? What about 400 people from a state or even 400 people from the United States? Unbelievably, they are equally good. The size of the population has essentially nothing to do with the quality of a sample. A sample of 400 is just as good whether it comes from a population of 100,000 or 100,000,000. The only issue is that the sample must be representative of the population being studied (i.e., do not sample 400 from a city and use it to draw conclusions about the U.S.A.). The sample must be randomly chosen too.

To help illustrate, imagine making chili for a family in which the mother-in-law likes it mild while everyone else likes it spicy, so one quart of mild chili and 2 gallons of spicy chili are made. When it is time for a taste test, a spoonful of the mild chili is tasted and declared perfect. How many spoonfuls of the larger pot need to be tasted? Since it is 8 times larger, does that mean 8 spoonfuls? Of course not; one spoonful of the larger pot reveals just as much as one spoonful of the smaller pot, provided the pots are stirred first. The idea is that the spoonful is representative of the pot and holds a representative sample of molecules of chili from the population known as the chili pot. Therefore, the ideal sample size is not a percentage of the population; population size only comes into play when the population is small, in which case the sample size can be smaller.
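One common way to account for a small population is the finite population correction, which is a standard adjustment but is not named in the discussion above. The sketch below applies it to a starting sample size of 385 (the usual figure for a 5-point margin at 95% confidence); the function name and the population sizes are chosen for illustration.

```python
from math import ceil


def adjust_for_population(n0, population_size):
    """Shrink an initial sample size n0 for a finite population.

    Uses the finite population correction n0 / (1 + (n0 - 1) / N).
    """
    return ceil(n0 / (1 + (n0 - 1) / population_size))


# n0 = 385 corresponds to a 5-point margin at 95% confidence.
for population in (500, 100_000, 100_000_000):
    print(f"N = {population:>11,}: sample of {adjust_for_population(385, population)}")
```

For a population of 500 the required sample drops to 218, but for 100,000 and 100,000,000 it is 384 and 385: once the population is large, its size is practically irrelevant.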

Conclusion

Surveys are often misinterpreted. Knowledge of confidence intervals and sample sizes (what a margin of error truly means, why a small error is preferred, and how larger sample sizes allow higher levels of confidence with smaller margins of error) represents an insider's knowledge of survey interpretation and will help to minimize those misinterpretations.