Confidence interval for a proportion

The confidence interval for a proportion estimates the range of plausible values for the true population proportion \(p\). The standard Wald interval works well for large samples with \(p\) not too extreme, but the Wilson score interval is more reliable in general.

The Wald interval

The most common formula for a \(100(1-\alpha)\%\) confidence interval (CI) for a proportion is the Wald interval:

\[\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]

where \(\hat{p} = x/n\) is the sample proportion and \(z_{\alpha/2}\) is the standard normal critical value (1.96 for 95%).

The margin of error is \(z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\).
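The Wald formula translates directly into a few lines of code. The following is a minimal sketch using only the Python standard library; the function name `wald_interval` and its `confidence` parameter are illustrative choices, not part of any established API.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, confidence=0.95):
    """Wald interval for a proportion: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    # Two-sided critical value z_{alpha/2} (about 1.96 for 95% confidence)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Counts taken from the manufacturing example later in this section
low, high = wald_interval(12, 150)
print(f"{low:.3f} to {high:.3f}")
```

With 12 successes out of 150, this should reproduce the interval (0.037, 0.123) derived by hand in Example 1. Note that the function does not clip the bounds to \([0, 1]\), so the lower bound can be negative for small \(\hat{p}\).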

Figure: Sampling distribution of the sample proportion, with the 95% confidence interval highlighted.

The Wilson score interval

The Wilson score interval performs better than the Wald interval when \(n\) is small or \(p\) is close to 0 or 1. Its formula is:

\[\frac{\hat{p} + \frac{z^2}{2n} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}}{1 + \frac{z^2}{n}}\]

where \(z = z_{\alpha/2}\). For 95% confidence, \(z = 1.96\).
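Because the Wilson formula has several nested terms, it can help to see it split into a center and a half-width. The sketch below assumes only the Python standard library; the name `wilson_interval` and the intermediate variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(x, n, confidence=0.95):
    """Wilson score interval for a proportion, written as center +/- half-width."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z**2 / n
    # Center is pulled from p_hat toward 0.5
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the clinical trial example later in this section
low, high = wilson_interval(28, 80)
print(f"{low:.3f} to {high:.3f}")
```

Unlike the Wald interval, the result always stays within \([0, 1]\) (up to floating-point rounding) and has positive width even when \(x = 0\) or \(x = n\).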

⚠️ The Wald interval can fail badly for small n or extreme p

A common rule of thumb for the Wald interval is \(n\hat{p} \geq 10\) and \(n(1-\hat{p}) \geq 10\), so that the normal approximation is adequate. When these conditions are violated:

  • For \(\hat{p} = 0\) or \(\hat{p} = 1\), the Wald interval collapses to a single point with zero width, which is clearly wrong.
  • For small \(n\) and \(p\) near 0 or 1, the actual coverage probability can be much lower than the nominal 95%.

The Wilson score interval avoids these problems and is recommended as the default in most applied settings. For very small \(n\), the exact Clopper-Pearson interval (based on the binomial distribution) guarantees coverage of at least the nominal level, at the cost of wider, conservative intervals.
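The coverage claim can be checked exactly for a fixed \(n\) by summing binomial probabilities over all outcomes whose interval contains the true \(p\). The sketch below does this for \(n = 20\); the helper names (`wald`, `wilson`, `coverage`) and the chosen values of \(p\) are illustrative assumptions, and \(z\) is fixed at 1.96 as in the text.

```python
from math import comb, sqrt

Z = 1.96  # 95% critical value, as used in the text


def wald(x, n):
    p_hat = x / n
    margin = Z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def wilson(x, n):
    p_hat = x / n
    denom = 1 + Z**2 / n
    center = (p_hat + Z**2 / (2 * n)) / denom
    half_width = Z * sqrt(p_hat * (1 - p_hat) / n + Z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def coverage(interval, n, p):
    """Exact probability that the interval contains p when X ~ Binomial(n, p)."""
    total = 0.0
    for x in range(n + 1):
        low, high = interval(x, n)
        if low <= p <= high:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


n = 20
print(wald(0, n))  # zero-width interval when p_hat = 0
for p in (0.05, 0.10, 0.50):
    print(p, round(coverage(wald, n, p), 3), round(coverage(wilson, n, p), 3))
```

For \(n = 20\) and \(p = 0.05\), the Wald coverage falls far below 95%, largely because the outcome \(x = 0\) yields the degenerate interval \((0, 0)\), while the Wilson coverage stays much closer to the nominal level.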

Figure: Comparison of Wald and Wilson score intervals for different values of \(p\) and \(n = 20\).

Step-by-step examples

Example 1: defect rate in manufacturing

A factory inspects 150 components and finds 12 defective. Construct a 95% CI for the defect rate.

\[\hat{p} = \frac{12}{150} = 0.080, \qquad n = 150\]

Check: \(n\hat{p} = 12 \geq 10\) and \(n(1-\hat{p}) = 138 \geq 10\), so the Wald interval is adequate.

\[\text{SE} = \sqrt{\frac{0.080 \times 0.920}{150}} = \sqrt{0.000491} \approx 0.0221\]

\[\text{CI} = 0.080 \pm 1.96 \times 0.0221 = 0.080 \pm 0.043 = (0.037,\; 0.123)\]

The defect rate is estimated at 8.0% (95% CI: 3.7% to 12.3%).

Example 2: clinical trial response rate

In a trial with 80 patients, 28 respond to treatment. Construct a 95% CI using both methods.

\[\hat{p} = 28/80 = 0.350, \qquad n = 80\]

Wald interval:

\[\text{SE} = \sqrt{\frac{0.35 \times 0.65}{80}} \approx 0.0533\]

\[\text{CI}_{\text{Wald}} = 0.350 \pm 1.96 \times 0.0533 = (0.245,\; 0.455)\]

Wilson score interval:

\[\text{center} = \frac{0.350 + 1.96^2/160}{1 + 1.96^2/80} = \frac{0.374}{1.048} \approx 0.357\]

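\[\text{half-width} = \frac{1.96\sqrt{\frac{0.35 \times 0.65}{80} + \frac{1.96^2}{4 \times 80^2}}}{1 + 1.96^2/80} = \frac{1.96 \times 0.0547}{1.048} \approx 0.102\]

\[\text{CI}_{\text{Wilson}} = 0.357 \pm 0.102 \approx (0.255,\; 0.459)\]

For this moderate \(n\) and \(\hat{p}\), the two intervals are close. The Wilson interval is slightly narrower and its center is shifted toward 0.5.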

Sample size planning

To achieve a margin of error of at most \(d\) with confidence \(1-\alpha\), solve for \(n\):

\[n \geq \frac{z_{\alpha/2}^2\, \hat{p}(1-\hat{p})}{d^2}\]

If no prior estimate of \(p\) is available, use \(\hat{p} = 0.5\), which maximizes \(\hat{p}(1-\hat{p}) = 0.25\) and gives the most conservative (largest) sample size:

\[n \geq \frac{z_{\alpha/2}^2 \times 0.25}{d^2}\]

Sample size calculation

A polling organization wants to estimate a voting intention proportion with a margin of error of \(\pm 3\%\) at 95% confidence. No prior estimate of \(p\) is available.

\[n \geq \frac{1.96^2 \times 0.25}{0.03^2} = \frac{3.8416 \times 0.25}{0.0009} = \frac{0.9604}{0.0009} \approx 1{,}067.1\]

Rounding up, a sample of at least 1,068 people is needed. If a prior poll suggested \(p \approx 0.35\):

\[n \geq \frac{1.96^2 \times 0.35 \times 0.65}{0.03^2} = \frac{3.8416 \times 0.2275}{0.0009} \approx 971.1\]

Rounding up gives 972, so using the prior estimate saves about 100 interviews.
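The sample size formula is easy to wrap in a helper that rounds up to the next whole respondent. This is a minimal sketch assuming the Wald-based margin of error used above; the function name `required_sample_size` and its parameter names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin, p=0.5, confidence=0.95):
    """Smallest n with z^2 * p * (1 - p) / n <= margin^2 (Wald margin of error)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up: n must be a whole number that meets or exceeds the bound
    return ceil(z**2 * p * (1 - p) / margin**2)


print(required_sample_size(0.03))          # no prior estimate, p = 0.5
print(required_sample_size(0.03, p=0.35))  # prior estimate p = 0.35
```

The default `p=0.5` encodes the conservative choice for the case with no prior estimate. Because the bound is a minimum, the result is always rounded up rather than to the nearest integer.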

💡 Practical guidelines for proportion CIs

  • The Wald interval is acceptable when \(n\hat{p} \geq 10\) and \(n(1-\hat{p}) \geq 10\).
  • Use the Wilson score interval as the default: it behaves well for small \(n\) and extreme \(\hat{p}\), and it approaches the Wald interval as \(n\) grows.
  • Use the exact Clopper-Pearson interval for very small \(n\) or when \(\hat{p} = 0\) or \(\hat{p} = 1\).
  • For sample size planning with no prior estimate, use \(p = 0.5\).
  • Report as: “\(\hat{p} = 0.35\) (95% CI: 0.25 to 0.46, \(n = 80\))”.