Enter the Z-Score, proportion, and margin of error into the calculator to determine the sample size using Cochran’s formula.
Cochran’s Sample Size Formula
Cochran’s formula calculates the minimum number of responses needed to estimate a population proportion at a chosen confidence level and margin of error:
n_0 = \frac{Z^2 \cdot p \cdot (1 - p)}{e^2}
Variables:
- n0 = required sample size (for an infinite population)
- Z = Z-score corresponding to the desired confidence level
- p = estimated proportion of the attribute in the population (use 0.5 if unknown)
- e = margin of error (desired precision)
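As a minimal sketch, the formula can be written as a small Python function. The name `cochran_n0` and its argument names are illustrative choices made here, not part of any library.

```python
import math


def cochran_n0(z: float, p: float, e: float) -> float:
    """Return the unrounded Cochran sample size for an infinite population."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if e <= 0:
        raise ValueError("e must be positive")
    return (z ** 2) * p * (1 - p) / (e ** 2)


# Example: 95% confidence (Z = 1.96), unknown proportion (p = 0.5), 5% margin.
raw = cochran_n0(1.96, 0.5, 0.05)
required = math.ceil(raw)  # round up: a partial respondent is not possible
print(round(raw, 2), required)  # 384.16 385
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.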
Z-Score Reference by Confidence Level
The Z-score is drawn from the standard normal distribution and maps directly to the probability that the true population parameter falls within your margin of error. The most commonly used values in survey research are:
| Confidence Level | Z-Score | Meaning |
|---|---|---|
| 80% | 1.282 | 1 in 5 chance the true value lies outside the interval |
| 85% | 1.440 | 1 in ~7 chance |
| 90% | 1.645 | 1 in 10 chance |
| 95% | 1.960 | 1 in 20 chance (most common in research) |
| 99% | 2.576 | 1 in 100 chance |
| 99.9% | 3.291 | 1 in 1,000 chance |
A 95% confidence level (Z = 1.96) is the standard in most academic and government survey work. Medical and pharmaceutical research often requires 99% confidence.
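For confidence levels not listed in the table, the Z-score can be derived from the inverse of the standard normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper name `z_for_confidence` is hypothetical, and a two-sided interval is assumed.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence  # total probability left in both tails
    return NormalDist().inv_cdf(1 - alpha / 2)


# Print the Z-score for each confidence level in the table, to three decimals.
for level in (0.80, 0.85, 0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%}  Z = {z_for_confidence(level):.3f}")
```

Half of the excluded probability goes in each tail, which is why 95% confidence maps to the 97.5th percentile of the normal distribution.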
Why p = 0.5 Gives the Largest Sample Size
The expression p(1 – p) in the numerator of Cochran’s formula reaches its mathematical maximum when p = 0.5, yielding 0.25. At p = 0.3, the product is 0.21. At p = 0.1, just 0.09. Because 0.5 produces the largest possible sample size for any given Z and e, researchers use it as a conservative default when no prior estimate of the proportion exists. If a pilot study or previous research suggests a proportion far from 0.5, using the actual estimate will reduce the required sample size and lower data collection costs.
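The maximum can be confirmed with a short calculus check. This is a worked sketch; f is simply a label introduced here for the product p(1 - p).

```latex
f(p) = p(1 - p) = p - p^2, \qquad
f'(p) = 1 - 2p = 0 \;\Rightarrow\; p = \tfrac{1}{2}, \qquad
f''(p) = -2 < 0, \qquad
f\!\left(\tfrac{1}{2}\right) = \tfrac{1}{4}
```

Because the second derivative is negative everywhere, p = 0.5 is the single maximum, and the product falls off symmetrically on either side of it.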
Pre-Computed Sample Sizes (Infinite Population)
The table below shows minimum sample sizes calculated using Cochran’s formula with p = 0.5 (maximum variability). All values are rounded up to the nearest whole number.
| Margin of Error | 90% Confidence (Z=1.645) | 95% Confidence (Z=1.96) | 99% Confidence (Z=2.576) |
|---|---|---|---|
| 1% | 6,766 | 9,604 | 16,590 |
| 2% | 1,692 | 2,401 | 4,148 |
| 3% | 752 | 1,068 | 1,844 |
| 5% | 271 | 385 | 664 |
| 7% | 139 | 196 | 339 |
| 10% | 68 | 97 | 166 |
At 95% confidence and 5% margin of error, the required sample is 385 regardless of whether the population is 50,000 or 50 million. This is one of the counterintuitive properties of proportion-based sampling: once the population is large enough, absolute sample size matters far more than the sampling fraction.
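A table like this can be recomputed with a few lines of Python. The sketch below is one possible approach, not the calculator's own code: it uses exact rational arithmetic (`fractions.Fraction`) so that rounding up is not disturbed by floating-point noise, and the helper name `cochran_n0_exact` is hypothetical.

```python
import math
from fractions import Fraction


def cochran_n0_exact(z: str, p: str, e: str) -> Fraction:
    """Cochran's n0 computed exactly from decimal strings such as '1.96'."""
    z_, p_, e_ = Fraction(z), Fraction(p), Fraction(e)
    return z_ ** 2 * p_ * (1 - p_) / e_ ** 2


z_scores = {"90%": "1.645", "95%": "1.96", "99%": "2.576"}
margins = ["0.01", "0.02", "0.03", "0.05", "0.07", "0.10"]

# One row per margin of error, one rounded-up sample size per confidence level.
for e in margins:
    row = [math.ceil(cochran_n0_exact(z, "0.5", e)) for z in z_scores.values()]
    print(f"{float(e):.0%}", row)
```

Exact fractions matter for cases such as Z = 1.96 with a 1% margin, where the true result is exactly 9,604 and a tiny floating-point overshoot could otherwise round it up to 9,605.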
Finite Population Correction (FPC)
Cochran’s base formula assumes an infinite (or very large) population. When the population size N is known and the initial sample n0 exceeds roughly 5% of N, the estimate can be tightened using the finite population correction:
n = \frac{n_0}{1 + \frac{n_0 - 1}{N}}
where N is the total population and n0 is the sample size from the base formula. For example, with n0 = 385 and N = 5,000, the adjusted sample is 385 / (1 + 384/5000) ≈ 358. The correction becomes negligible as N grows large relative to n0.
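The correction is a one-line calculation in code. The following is a minimal sketch; the function name `finite_population_correction` and its parameter names are illustrative.

```python
def finite_population_correction(n0: float, population: int) -> float:
    """Adjust an infinite-population sample size n0 for a finite population N."""
    if population <= 0:
        raise ValueError("population must be positive")
    return n0 / (1 + (n0 - 1) / population)


# Worked example from the text: n0 = 385, N = 5,000.
adjusted = finite_population_correction(385, 5000)
print(round(adjusted, 2))  # 357.54, reported as 358
```

The function returns the unrounded value so the caller can decide whether to round to the nearest integer or round up for a strict minimum.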
Finite Population Correction Examples
Using a base sample of n0 = 385 (95% confidence, 5% margin, p = 0.5), with adjusted values rounded to the nearest whole number:
| Population (N) | Adjusted Sample (n) | Reduction from n0 |
|---|---|---|
| 500 | 218 | 43% |
| 1,000 | 278 | 28% |
| 2,000 | 323 | 16% |
| 5,000 | 358 | 7% |
| 10,000 | 371 | 4% |
| 50,000 | 382 | 1% |
| 1,000,000+ | 385 | ~0% |
The practical takeaway: for populations under about 5,000 to 10,000, the finite correction saves meaningful resources. Above that range, the adjustment barely changes the required count.
Cochran’s Formula vs. Yamane and Slovin
Cochran’s 1977 formula is the most flexible of the three major sample size approaches used in survey research. Yamane’s formula (1967) is effectively a special case of Cochran’s: it fixes the confidence level at 95% (with Z rounded to 2) and the proportion at 0.5, and folds in a finite population correction. Slovin’s formula, widely cited in social science theses, is mathematically identical to Yamane’s but is often applied without checking whether those fixed assumptions hold.
| Feature | Cochran (1977) | Yamane (1967) | Slovin |
|---|---|---|---|
| Formula | n0 = Z²p(1-p)/e² | n = N/(1+Ne²) | n = N/(1+Ne²) |
| Adjustable confidence level | Yes | No (fixed at 95%) | No (fixed at 95%) |
| Adjustable proportion | Yes | No (fixed at 0.5) | No (fixed at 0.5) |
| Requires known population | No (optional via FPC) | Yes | Yes |
| Best for | Any survey or experiment | Known finite populations | Quick estimates |
| Academic rigor | High | Moderate | Low (often misapplied) |
Cochran’s formula is the preferred choice in peer-reviewed research because it allows the researcher to specify all three parameters independently. Yamane and Slovin produce identical outputs and are best treated as shorthand for Cochran’s formula under specific conditions, not as separate methods.
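The "shorthand" relationship can be shown algebraically. The derivation below is a sketch under two stated assumptions: Z is rounded to 2 for 95% confidence, and the simplified correction n = n_0 / (1 + n_0/N) is used in place of the (n_0 - 1)/N form given earlier.

```latex
n_0 = \frac{Z^2 \cdot p \cdot (1 - p)}{e^2}
    = \frac{2^2 \cdot 0.5 \cdot 0.5}{e^2}
    = \frac{1}{e^2},
\qquad
n = \frac{n_0}{1 + \frac{n_0}{N}}
  = \frac{1/e^2}{1 + \frac{1}{N e^2}}
  = \frac{N}{1 + N e^2}
```

Because Z = 2 is slightly larger than 1.96, the Yamane/Slovin result comes out a little above the Cochran figure with finite population correction for the same N and e.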
Assumptions and Limitations
Cochran’s formula rests on several assumptions that must be satisfied for the result to be valid. First, the sampling method must be simple random sampling or a design that can be approximated as such. Complex designs (stratified, cluster, or multistage) call for a design effect multiplier (DEFF) applied to the Cochran sample size; values between 1.5 and 2.5 are typical where clustering reduces efficiency. Second, the formula estimates sample size for a single proportion. Studies measuring means instead of proportions should use the alternative formula n = Z²s²/e², where s is the estimated standard deviation. Third, the formula does not account for non-response. If the expected response rate is 70%, the researcher should inflate the calculated n by dividing by 0.70, so a base of 385 becomes 550. Fourth, the formula assumes a single binary outcome. Studies with multiple outcome variables should calculate n for the variable requiring the largest sample and use that as the overall target.
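The design-effect and non-response adjustments can be combined in one step. The sketch below is illustrative only: `adjust_sample_size` is a hypothetical helper, and the DEFF of 1.5 in the second example is an assumed value, not one taken from a real survey.

```python
import math


def adjust_sample_size(n: float, deff: float = 1.0, response_rate: float = 1.0) -> int:
    """Inflate a base sample size for design effect and expected non-response."""
    if deff <= 0:
        raise ValueError("deff must be positive")
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    inflated = n * deff / response_rate
    # Round to 9 decimals first so floating-point noise cannot bump an
    # exact result (such as 550.0) up by one when taking the ceiling.
    return math.ceil(round(inflated, 9))


# Base n = 385 with a 70% expected response rate and no design effect.
print(adjust_sample_size(385, response_rate=0.70))  # 550
# Same base with an assumed design effect of 1.5.
print(adjust_sample_size(385, deff=1.5, response_rate=0.70))  # 825
```

Multiplying by DEFF before dividing by the response rate gives the number of people to contact, not the number of completed responses expected.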
Origin and History
William Gemmell Cochran (1909 to 1980) was a Scottish-born statistician who spent most of his career in the United States. He studied mathematics at the University of Glasgow and Cambridge before joining Rothamsted Experimental Station in 1934, where he worked alongside Ronald Fisher on the foundations of experimental design. After moving to the U.S. in 1939, Cochran held positions at Iowa State, Johns Hopkins, and Harvard. His textbook “Sampling Techniques,” first published in 1953 and revised in its third edition in 1977, remains one of the definitive references on survey sampling methodology. The sample size formula that bears his name appears in Chapter 4 of that text and has become the standard starting point for sample size determination in fields ranging from public health to market research.
Common Applications
Cochran’s formula is used across a wide range of disciplines. In public health, it determines how many individuals to screen when estimating disease prevalence. In market research, it sets the minimum number of consumer surveys needed to gauge brand preference within a stated margin. Political polling organizations use it to establish baseline sample sizes before applying design effects for stratification and clustering. In quality control, it calculates the number of units to inspect when estimating a defect rate. Academic dissertations in education, social sciences, and business administration routinely cite Cochran’s formula when justifying sample size for survey-based studies. The formula also appears in environmental monitoring, agricultural field trials, and election auditing.