45 Z-test of Sample Proportions

45.1 One-Sample Z-test

The \(z\)-test of proportions is one approach used to look for evidence that a population proportion differs from a hypothesized (or previously observed) value. It relies on the normal approximation to the Binomial distribution.

45.1.1 Z-Statistic

The \(z\)-statistic is a standardized measure of the magnitude of difference between a sample’s proportion and some known, non-random constant.

45.1.2 Definitions and Terminology

Let \(p\) be the proportion observed in a sample of size \(n\), and let \(\pi_0\) be a hypothesized constant. \(z\) is defined as: \[z = \frac{p - \pi_0}{\sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}} \]
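
For example, with hypothetical values \(p = 0.55\), \(\pi_0 = 0.50\), and \(n = 100\):

\[ z = \frac{0.55 - 0.50}{\sqrt{\frac{0.50 \cdot (1 - 0.50)}{100}}} = \frac{0.05}{0.05} = 1 \]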

45.1.3 Hypotheses

The hypotheses for this test take the following forms:

For a two-sided test: \[ \begin{aligned} H_0: \pi &= \pi_0\\ H_a: \pi &\neq \pi_0 \end{aligned} \]

For a one-sided test: \[ \begin{aligned} H_0: \pi &\leq \pi_0\\ H_a: \pi &> \pi_0 \end{aligned} \]

or \[ \begin{aligned} H_0: \pi &\geq \pi_0\\ H_a: \pi &< \pi_0 \end{aligned} \]

To compare a sample \((X_1, \ldots, X_n)\) against the hypothesized value, a Z-statistic is calculated in the form:

\[Z = \frac{p - \pi_0}{\sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}}\]

where \(p = \frac{1}{n}\sum_{i=1}^{n} X_i\) is the sample proportion of the Bernoulli observations.
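
As a minimal sketch of this calculation in R (hypothetical data: 58 successes in 100 trials, tested against \(\pi_0 = 0.5\)):

```r
# Hypothetical example: 58 successes out of n = 100 trials, H0: pi = 0.5
x    <- 58
n    <- 100
pi_0 <- 0.5

p_hat <- x / n                                        # sample proportion
z     <- (p_hat - pi_0) / sqrt(pi_0 * (1 - pi_0) / n)
z                                                     # 1.6
```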

45.1.4 Decision Rule

The decision to reject the null hypothesis is made when the observed Z-value lies in a critical region, that is, a region where such an observation would be unlikely if the null hypothesis were true. The critical region is determined by \(\alpha\), the upper bound we are willing to accept for the probability of a Type I error.

In the two-sided test, \(\alpha\) is split equally between the two tails. The rejection regions for the most common values of \(\alpha\) are depicted in the figure below, with the sum of the shaded areas on both sides equaling the corresponding \(\alpha\). It follows, then, that the decision rule is:

Reject \(H_0\) when \(Z \leq z_{\alpha/2}\) or when \(Z \geq z_{1-\alpha/2}\).

By taking advantage of the symmetry of the Z-distribution, we can simplify the decision rule to:

Reject \(H_0\) when \(|Z| \geq z_{1-\alpha/2}\)
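
Continuing the hypothetical example above, a sketch of the two-sided decision at \(\alpha = 0.05\):

```r
# Two-sided decision at alpha = 0.05 for the hypothetical z = 1.6
z      <- (58 / 100 - 0.5) / sqrt(0.5 * 0.5 / 100)  # 1.6
alpha  <- 0.05
z_crit <- qnorm(1 - alpha / 2)                      # roughly 1.96

abs(z) >= z_crit     # FALSE: do not reject H0
2 * pnorm(-abs(z))   # two-sided p-value, roughly 0.11
```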

Figure 45.1: Rejection regions for the Z-test of proportions

In the one-sided test, \(\alpha\) is placed entirely in one tail. The rejection regions for the most common values of \(\alpha\) are depicted in the figure below. In each case, \(\alpha\) is the area of the single shaded tail. It follows, then, that the decision rule for a lower tailed test is:

Reject \(H_0\) when \(Z \leq z_{\alpha}\).

For an upper tailed test, the decision rule is:

Reject \(H_0\) when \(Z \geq z_{1-\alpha}\).

Using the symmetry of the Z-distribution (\(z_{\alpha} = -z_{1-\alpha}\)), the lower tailed rule can also be written:

Reject \(H_0\) when \(Z \leq -z_{1-\alpha}\).

Figure 45.2: Rejection regions for one-tailed Z-test

The decision rule can also be written in terms of \(p\):

Reject \(H_0\) when \(p \leq \pi_0 - z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\) or \(p \geq \pi_0 + z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\).

This change can be justified by:

\[ \begin{aligned} |Z| &\geq z_{1-\alpha}\\ \Big|\frac{p - \pi_0}{\sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}}\Big| &\geq z_{1-\alpha} \end{aligned} \]

\[ \begin{aligned} -\Big(\frac{p - \pi_0}{\sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}}\Big) &\geq z_{1-\alpha} & \frac{p - \pi_0}{\sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}} &\geq z_{1-\alpha}\\ p - \pi_0 &\leq - z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}} & p - \pi_0 &\geq z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\\ p &\leq \pi_0 - z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}} & p &\geq \pi_0 + z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}} \end{aligned} \]

For a two-sided test, both conditions apply (with \(z_{1-\alpha/2}\) in place of \(z_{1-\alpha}\)). The left-hand condition alone is the rule for a lower tailed test, and the right-hand condition alone is the rule for an upper tailed test.
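
A short sketch (same hypothetical numbers) showing that, for an upper tailed test at \(\alpha = 0.05\), the rule stated in terms of \(p\) gives the same decision as the rule stated in terms of \(Z\):

```r
p_hat <- 0.58
pi_0  <- 0.5
n     <- 100
alpha <- 0.05
se_0  <- sqrt(pi_0 * (1 - pi_0) / n)

# Rule in terms of Z (upper tailed)
z <- (p_hat - pi_0) / se_0
z >= qnorm(1 - alpha)                     # FALSE: do not reject H0

# Equivalent rule in terms of p
p_hat >= pi_0 + qnorm(1 - alpha) * se_0   # FALSE: same decision
```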

45.1.5 Power

The derivations below make use of the following symbols:

  • \(p\): The sample proportion
  • \(n\): The sample size
  • \(\pi_0\): The value of the population proportion under the null hypothesis
  • \(\pi_a\): The value of the population proportion under the alternative hypothesis.
  • \(\alpha\): The significance level
  • \(\gamma(\pi_a)\): The power of the test when the true proportion is \(\pi_a\).
  • \(z_{\alpha}\): A quantile of the Standard Normal distribution for a probability, \(\alpha\).
  • \(Z\): A calculated value to be compared against a Standard Normal distribution.
  • \(C\): The critical region (rejection region) of the test.

Two-Sided Test

\[ \begin{aligned} \gamma(\pi_a) &= P_{\pi_a}(p \in C)\\ &= P_{\pi_a}\Big(p \leq \pi_0 - z_{1-\alpha/2} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\Big) + P_{\pi_a}\Big(p \geq \pi_0 + z_{1-\alpha/2} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\Big)\\ &= P_{\pi_a}\Big(p - \pi_a \leq \pi_0 - \pi_a - z_{1-\alpha/2} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\Big) + \\ & \ \ \ \ \ P_{\pi_a}\Big(p - \pi_a \geq \pi_0 - \pi_a + z_{1-\alpha/2} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\Big)\\ &= P_{\pi_a}\Big(\frac{p - \pi_a}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}} \leq \frac{\pi_0 - \pi_a - z_{1-\alpha/2} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}}\Big) +\\ & \ \ \ \ \ P_{\pi_a}\Big(\frac{p - \pi_a}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}} \geq \frac{\pi_0 - \pi_a + z_{1-\alpha/2} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}}\Big)\\ &\approx P_{\pi_a}\Big(Z \leq -z_{1-\alpha/2} + \frac{\pi_0 - \pi_a}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}}\Big) + P_{\pi_a}\Big(Z \geq z_{1-\alpha/2} + \frac{\pi_0 - \pi_a}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}}\Big)\\ &= P_{\pi_a}\Big(Z \leq -z_{1-\alpha/2} + \frac{\sqrt{n} \cdot (\pi_0 - \pi_a)}{\sqrt{\pi_a \cdot (1 - \pi_a)}}\Big) + P_{\pi_a}\Big(Z \geq z_{1-\alpha/2} + \frac{\sqrt{n} \cdot (\pi_0 - \pi_a)}{\sqrt{\pi_a \cdot (1 - \pi_a)}}\Big) \end{aligned} \]

Here \(Z = \frac{p - \pi_a}{\sqrt{\pi_a (1 - \pi_a)/n}}\) is approximately Standard Normal under the alternative \(\pi = \pi_a\), and \(z_{1-\alpha/2}\) is the corresponding Standard Normal quantile. The step marked with \(\approx\) uses the simplifying approximation \(\sqrt{\pi_0 \cdot (1 - \pi_0)} \approx \sqrt{\pi_a \cdot (1 - \pi_a)}\) when standardizing the critical values.
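
A minimal sketch of this power formula, with hypothetical values \(\pi_0 = 0.5\), \(\pi_a = 0.6\), \(n = 100\), and \(\alpha = 0.05\):

```r
# Approximate power of the two-sided one-sample Z-test of a proportion
power_two_sided <- function(pi_0, pi_a, n, alpha = 0.05) {
  shift <- sqrt(n) * (pi_0 - pi_a) / sqrt(pi_a * (1 - pi_a))
  pnorm(-qnorm(1 - alpha / 2) + shift) +        # lower tail term
    (1 - pnorm(qnorm(1 - alpha / 2) + shift))   # upper tail term
}

power_two_sided(pi_0 = 0.5, pi_a = 0.6, n = 100)   # roughly 0.53
```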

One-Sided Test

For convenience, the power of only the upper tailed test is derived here; the power of the lower tailed test follows by the symmetry of the Z-distribution.

\[ \begin{aligned} \gamma(\pi_a) &= P_{\pi_a}(p \in C)\\ &= P_{\pi_a}\big(p \geq \pi_0 + z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\big)\\ &= P_{\pi_a}\big(p - \pi_a \geq \pi_0 - \pi_a + z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}\big)\\ &= P_{\pi_a}\Big(\frac{p - \pi_a}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}} \geq \frac{\pi_0 - \pi_a + z_{1-\alpha} \cdot \sqrt{\frac{\pi_0 \cdot (1 - \pi_0)}{n}}}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}}\Big)\\ &\approx P_{\pi_a}\Big(Z \geq z_{1-\alpha} + \frac{\pi_0 - \pi_a}{\sqrt{\frac{\pi_a \cdot (1 - \pi_a)}{n}}}\Big)\\ &= P_{\pi_a}\Big(Z \geq z_{1-\alpha} + \frac{\sqrt{n} \cdot (\pi_0 - \pi_a)}{\sqrt{\pi_a \cdot (1 - \pi_a)}}\Big) \end{aligned} \]

Here \(Z = \frac{p - \pi_a}{\sqrt{\pi_a (1 - \pi_a)/n}}\) is approximately Standard Normal under the alternative \(\pi = \pi_a\), and the step marked with \(\approx\) uses the same approximation \(\sqrt{\pi_0 \cdot (1 - \pi_0)} \approx \sqrt{\pi_a \cdot (1 - \pi_a)}\) as in the two-sided case.
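
The same calculation for the upper tailed test, again with hypothetical values \(\pi_0 = 0.5\), \(\pi_a = 0.6\), \(n = 100\), and \(\alpha = 0.05\):

```r
# Approximate power of the upper tailed one-sample Z-test of a proportion
power_upper <- function(pi_0, pi_a, n, alpha = 0.05) {
  shift <- sqrt(n) * (pi_0 - pi_a) / sqrt(pi_a * (1 - pi_a))
  1 - pnorm(qnorm(1 - alpha) + shift)
}

power_upper(pi_0 = 0.5, pi_a = 0.6, n = 100)   # roughly 0.65
```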

45.1.6 Confidence Interval

The confidence interval for \(\pi\) is written: \[p \pm z_{1-\alpha/2} \cdot \sqrt{\frac{p \cdot (1 - p)}{n}}\]

The value of the expression to the right of the \(\pm\) sign is often referred to as the margin of error, and we will refer to this value as \[E = z_{1-\alpha/2} \cdot \sqrt{\frac{p \cdot (1 - p)}{n}}\]
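
A short sketch computing the interval and the margin of error for hypothetical data (58 successes in 100 trials, 95% confidence):

```r
# Wald confidence interval for a proportion (hypothetical data)
x     <- 58
n     <- 100
alpha <- 0.05

p_hat <- x / n
E     <- qnorm(1 - alpha / 2) * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
c(lower = p_hat - E, upper = p_hat + E)                        # roughly (0.483, 0.677)
```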
