# 22 Hypergeometric Distribution

## 22.1 Probability Mass Function

A random variable $$X$$ is said to follow a Hypergeometric Distribution if its probability mass function is

$p(x) = \left\{ \begin{array}{ll} \frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}}, & x=0,1,2,\ldots\\ 0 & otherwise \end{array} \right.$

where

• $$N$$ is the number of objects available to choose from
• $$n$$ is the number of objects chosen from $$N$$
• $$r$$ is the number of objects in $$N$$ that posses a desired characteristic (successes)
• $$x$$ is the number of objects in $$n$$ that possess the desired characterstic

## 22.2 Cumulative Mass Function

$P(x) = \left\{ \begin{array}{ll} \sum\limits_{i=0}^{x}\frac{{r\choose i}{N-r\choose n-i}}{{N\choose n}}, & x=0,1,2,\ldots\\ 0 & otherwise \end{array} \right.$

## 22.3 Expected Values

\begin{aligned} E(X) &= \sum\limits_{x=0}^{n}x\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} \\ &= \sum\limits_{x=0}^{n}x{r\choose x}\frac{{N-r\choose n-x}}{{N\choose n}} \\ ^{[1]} &= \sum\limits_{x=0}^{n}x\frac{r}{x}{r-1\choose x-1}\frac{{N-r\choose n-x}} {\frac{N}{n}{N-1\choose n-1}} \\ &= \frac{rn}{N}\sum\limits_{x=0}^{n} \frac{{r-1\choose x-1}{N-r\choose n-x}}{{N-1\choose n-1}} \\ ^{[2]} &= \frac{rn}{N}\sum\limits_{y=0}^{n-1} \frac{{r-1\choose y}{N-r\choose n-y-1}}{{N-1\choose n-1}} \frac{rn}{N}\sum\limits_{y=0}^{n-1} \frac{{r-1\choose y}{N-r\choose n-y-1}}{{N-1\choose n-1}} \\ &= \frac{\frac{rn}{N}\cdot n}{N}r\sum\limits_{y=0}^{n-1} \frac{{r-1\choose y}{N-r\choose n-y-1}}{{N-1\choose n-1}} \\ ^{[3]} &= \frac{rn}{N}\cdot 1 \\ &= \frac{rn}{N} \end{aligned}

1. For any integer $$a$$ such that $$0\leq a\leq k,\ {n\choose k}=\frac{n(n-1)\cdots(n-a+1)}{k(k-1)\cdots(k-a+1)}{n-a\choose k-a}$$ (Theorem 10.0.3).
2. Let $$y=x-1$$  $$\Rightarrow x=y+1$$.
3. $$\frac{\sum\limits_{i=1}^{n}{N_1\choose i}{N_2\choose n-i}}{{N_1+N_2\choose n}}=1$$\ with $$N_1=r,\ N_2=N-r,\ i=x$$. (Theorem 6.3.2

\begin{aligned} E[X(X-1)] &= \sum\limits_{x=0}^{n}x(x-1)\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} \\ ^{[1]} &= \sum\limits_{x=0}^{n}\frac{x(x-1)r(r-1)}{x(x-1)}\frac{{r-2\choose x-2} {N-r\choose n-x}}{\frac{N(N-1)}{n(n-1)}{N-2\choose n-2}} \\ &= \frac{r(r-1)n(n-1)}{N(N-1)}\sum\limits_{x=0}^{n} \frac{{r-2\choose x-2}{N-r\choose n-x}}{{N-2\choose n-2}} \\ ^{[2]} &= \frac{r(r-1)n(n-1)}{N(N-1)}\sum\limits_{y=0}^{n-2} \frac{{r-2\choose y}{N-r\choose n-y-2}}{{N-2\choose n-2}} \\ ^{[3]} &= \frac{r(r-1)n(n-1)}{N(N-1)}\cdot 1 \\ &=\frac{r(r-1)n(n-1)}{N(N-1)} \end{aligned}

1. For any integer $$a$$ such that $$0\leq a\leq k,\ {n\choose k}=\frac{n(n-1)\cdots(n-a+1)}{k(k-1)\cdots(k-a+1)}{n-a\choose k-a}$$ (Theorem 10.0.3).
2. Let $$y=x-1$$  $$\Rightarrow x=y+1$$.
3. $$\frac{\sum\limits_{i=1}^{n}{N_1\choose i}{N_2\choose n-i}}{{N_1+N_2\choose n}}=1$$\ with $$N_1=r,\ N_2=N-r,\ i=x$$. (Theorem 6.3.2

\begin{aligned} \mu &= E(X) \\ &= \frac{rn}{N} \\ \\ \\ \sigma^2 &= E(X^2) - E(X)^2 \\ &= E(X^2) - E(X) + E(X) - E(X)^2 \\ &= (E(X^2) - E(X) + E(X) - E(X)^2 \\ &= E(X^2-X) + E(X) - E(X)^2 \\ &= E[X(X-1)] + E(X) - E(X)^2\\ &= \frac{r(r-1)n(n-1)}{N(N-1)} + \frac{rn}{N} - \frac{r^2n^2}{N^2} \\ &= \frac{r(r-1)n(n-1)N}{N^2(N-1)} + \frac{rnN(N-1)}{N^2(N-1)} - \frac{r^2n^2(N-1)}{N^2(N-1)} \\ &= \frac{(r^2-r)(n^2-n)N rn(N^2-N)-r^2n^2(N-1)}{N^2(N-1)} \\ &= \frac{(r^2n^2N-r^2n^2N-rn^2N+rnN+rnN^2-rnN-r^2n^2N+r^2n^2}{N^2(N-1)} \\ &= \frac{-r^2nN-rn^2N+rnN^2+r^2n^2}{N^2(N-1)} \\ &= \frac{nr(-rN-nN+N^2+rn}{N^2(N-1)} \\ &= \frac{nr(N^2-nN-rN+rn}{N^2(N-1)} \\ &= \frac{nr(N-r)(N-n)}{N^2(N-1)} \\ &= \frac{nr(N-r)(N-n)}{N\cdot N(N-1)} \\ &= \frac{nr}{N}\cdot\frac{N-r}{N}\cdot\frac{N-n}{N-1} \end{aligned}

## 22.4 Moment Generating Function

\begin{aligned} M_X(t) &= E(e^{tX}) \\ &= \sum\limits_{x=0}^{n}e^{tx}\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} \\ &= \frac{1}{{N\choose n}}\sum\limits_{x=0}^{n}e^{tx}{r\choose x}{N-r\choose n-x} \\ &= \frac{1}{{N\choose n}}[e^{0t}{r\choose 0}{N-r\choose n-0} + e^{1t}{r\choose 1}{N-r\choose n-1} + e^{2t}{r\choose 2}{N-r\choose n-2} + \cdots + e^{nt}{r\choose n}{N-r\choose n-n}] \\ &= \frac{1}{{N\choose n}}[{N-r\choose n-0}+e^{t}{r\choose 1}{N-r\choose n-1} + e^{2t}{r\choose 2}{N-r\choose n-2}+\cdots+e^{nt}{r\choose n}{N-r\choose n-n}] \end{aligned}

This mgf does not reduce to any form which can be differentiated, and we cannot use it to generate moments for the distribution.

## 22.5 Theorems for the Hypergeometric Distribution

### 22.5.1 Validity of the Distribution

$\sum\limits_{x=0}^{n}\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} = 1$

Proof:

Theorem 6.3.1 states

${N_1\choose 0}{N_2\choose n}+{N_1\choose 2}{N_2\choose n-1}+\cdots +{N_1\choose n-1}{N_2\choose 1}+{N_1\choose n}{N_2\choose 0} \\ = \sum\limits_{x=0}^{n}{N_1\choose x}{N_2\choose n-x} \\ = {N_1+N_2\choose n}$

Using $$N_1 = r$$ and $$N_2 = N-r$$ we have \begin{aligned} \sum\limits_{x=0}^{n}\frac{{r\choose x}{N-r\choose n-x}}{{N\choose n}} &= \frac{{r+N-r\choose n}}{{N\choose n}} \\ &= \frac{{N\choose n}}{{N\choose n}} \\ &= 1 \end{aligned}