4 Binomial Distribution

4.1 Probability Mass Function

A random variable is said to follow a Binomial distribution with parameters \(n\) and \(\pi\) if its probability mass function is:

\[\Pr(X = x)= \left\{ \begin{array}{ll} {n \choose x} \pi^x (1-\pi)^{n-x}, & x=0,1,2,\ldots,n\\ 0 & \mathrm{otherwise} \end{array} \right. \]

where \(n\) is the number of trials performed and \(\pi\) is the probability of a success on each individual trial.
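
As a quick numerical illustration (a sketch, not part of the derivation), the pmf can be evaluated directly in Python; the function name `binom_pmf` and the use of `math.comb` are choices made here for illustration:

```python
from math import comb

def binom_pmf(x, n, pi_):
    """Pr(X = x) for X ~ Binomial(n, pi_); zero outside 0, 1, ..., n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * pi_**x * (1 - pi_)**(n - x)

# Example: the figure's parameters, n = 10 and pi = 0.4
print([round(binom_pmf(x, 10, 0.4), 4) for x in range(11)])
```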

4.2 Cumulative Distribution Function

\[ \Pr(X \leq x)= \left\{ \begin{array}{ll} 0, & x<0\\ \sum\limits_{i=0}^{\lfloor x\rfloor} {n \choose i} \pi^i (1-\pi)^{n-i}, & 0 \leq x < n\\ 1, & x \geq n \end{array} \right. \]

A recursive form of the pmf can be derived and has some usefulness in computer applications: one need only initialize the first value, and each successive probability (and hence each cumulative probability) can be calculated from the one before it. Writing \(p(x) = \Pr(X = x)\), it is derived as follows:

\[\begin{aligned} p(x+1) &= {n\choose x+1} \pi^{x+1} (1-\pi)^{n-(x+1)} \\ &= \frac{n!}{(x+1)!(n-(x+1))!} \pi^{x+1} (1-\pi)^{n-(x+1)} \\ &= \frac{n!}{(x+1)!(n-x-1)!} \pi^{x+1} (1-\pi)^{n-x-1} \\ &= \frac{(n-x)n!}{(x+1)x!(n-x)(n-x-1)!} \pi \cdot \pi^x \frac{(1-\pi)^{n-x}}{1-\pi} \\ &= \frac{(n-x)n!}{(x+1)x!(n-x)!} \cdot \frac{\pi}{1-\pi} \pi^x (1-\pi)^{n-x} \\ &= \frac{\pi}{1-\pi} \cdot \frac{n-x}{x+1} \cdot \frac{n!}{x!(n-x)!} \pi^x (1-\pi)^{n-x} \\ &= \frac{\pi}{1-\pi} \cdot \frac{n-x}{x+1} \cdot {n\choose x} \pi^x (1-\pi)^{n-x} \\ &= \frac{\pi}{1-\pi} \cdot \frac{n-x}{x+1} \cdot p(x) \end{aligned}\]
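
A minimal sketch of this recursion in Python, assuming \(0<\pi<1\): starting from \(p(0)=(1-\pi)^n\), each pmf value is obtained from the previous one and accumulated into the cdf (the function name is hypothetical):

```python
def binom_cdf_recursive(n, pi_):
    """Return [F(0), F(1), ..., F(n)] by accumulating the pmf recursion."""
    p = (1 - pi_) ** n            # initialize with p(0) = (1 - pi)^n
    cdf = [p]
    for x in range(n):
        # p(x + 1) = pi/(1 - pi) * (n - x)/(x + 1) * p(x)
        p *= (pi_ / (1 - pi_)) * (n - x) / (x + 1)
        cdf.append(cdf[-1] + p)
    return cdf

print(binom_cdf_recursive(10, 0.4))   # last entry is 1 up to rounding error
```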

(#fig:Binomial_Distribution) The graphs on the left and right show a Binomial probability mass function and cumulative distribution function, respectively, with \(n=10\) and \(\pi=.4\).

4.3 Expected Values

\[\begin{aligned} E(X) &= \sum\limits_{x=0}^n x \cdot \Pr(X = x) \\ &= \sum\limits_{x=0}^n x {n\choose x} \pi^x (1-\pi)^{n-x} \\ ^{[1]} &= \sum\limits_{x=0}^n x {n\choose x} \pi^x q^{n-x} \\ &= 0 \cdot {n\choose 0}\pi^0q^n+1 \cdot {n\choose 1}\pi^1q^{n-1} + \cdots + n{n\choose n}\pi^nq^{n-n}\\ &= 0 + 1{n\choose 1}\pi^1q^{n-1} + 2{n\choose 2}\pi^2q^{n-2} + \cdots + n{n\choose n}\pi^nq^{n-n}\\ &= n\pi^1 q^{n-1} + n(n-1)\pi^2q^{n-2} + \cdots + n(n-1)\pi^{n-1}q^{n-(n-1)} + n \pi^n\\ &= n\pi [q^{n-1} + (n-1)\pi q^{n-2} + \cdots + \pi^{n-1}]\\ &= n\pi \Big[{n-1\choose 0}\pi^0q^{n-1} + {n-1\choose 1}\pi^1q^{(n-1)-1} + \cdots + {n-1\choose n-1}\pi^{n-1}q^{(n-1)-(n-1)}\Big]\\ &= n\pi \Big(\sum\limits_{x=0}^{n-1}{n-1\choose x}\pi^xq^{(n-1)-x}\Big) \\ ^{[2]} &= n\pi(\pi+q)^{n-1} \\ ^{[1]} &= n\pi(\pi+(1-\pi))^{n-1} \\ &= n\pi(\pi+1-\pi)^{n-1} \\ &= n\pi(1)^{n-1} \\ &= n\pi(1) \\ &= n\pi \end{aligned}\]

  1. Let \(q = (1 - \pi)\)
  2. By the Binomial Theorem (6.1.2), \(\sum\limits_{x=0}^n{n\choose x}a^xb^{n-x}=(a+b)^n\)
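
The closed form \(E(X)=n\pi\) can be sanity-checked by direct summation over the pmf; a brief illustrative sketch using the figure's parameters:

```python
from math import comb

n, pi_ = 10, 0.4
mean = sum(x * comb(n, x) * pi_**x * (1 - pi_)**(n - x) for x in range(n + 1))
print(mean, n * pi_)  # both 4.0, up to floating-point error
```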

\[\begin{aligned} E(X^2) &= \sum\limits_{x=0}^{n} x^2 \Pr(X = x) \\ &= \sum\limits_{x=0}^{n} x^2 {n\choose x} \pi^x (1-\pi)^{n-x} \\ ^{[1]} &= \sum\limits_{x=0}^{n} x^2 {n\choose x} \pi^x q^{n-x} \\ &= 0^2 \frac{n!}{0!(n-0)!} \pi^0q^n + 1^2 \frac{n!}{1!(n-1)!} \pi^1q^{n-1} + \cdots + n^2 \frac{n!}{n!(n-n)!} \pi^nq^{n-n} \\ &= 0 + 1 \cdot \frac{n!}{(n-1)!} \pi q^{n-1} + 2 \cdot \frac{n!}{(n-2)!} \pi^2q^{n-2} + \cdots + n \cdot \frac{n!}{(n-1)!} \pi^n \\ &= n\pi \Big[1 \cdot \frac{(n-1)!}{0!(n-1)!} \pi^0q^{n-1} + 2 \cdot \frac{(n-1)!}{1!(n-2)!} \pi^1 q^{n-2} + \cdots + n \cdot \frac{(n-1)!}{(n-1)!0!} \pi^{n-1}\Big] \\ &= n\pi \Big[1 \cdot \frac{(n-1)!}{(1-1)!((n-1)-(1-1))!} \pi^{1-1} q^{(n-1)-(1-1)} + \cdots + n \cdot \frac{(n-1)!}{(n-1)!((n-1)-(n-1))!} \pi^{n-1} q^{(n-1)-(n-1)}\Big] \\ &= n\pi \sum\limits_{x=1}^{n} x {n-1\choose x-1} \pi^{x-1}q^{(n-1)-(x-1)} \\ ^{[2]} &= n\pi \sum\limits_{y=0}^{m} (y+1) {m \choose y} \pi^yq^{m-y} \\ &= n\pi \Big[ \sum\limits_{y=0}^{m} y {m \choose y} \pi^yq^{m-y} + \sum\limits_{y=0}^{m} {m \choose y} \pi^yq^{m-y}\Big] \\ ^{[3]} &= n\pi(m\pi+1) \\ &= n\pi[(n-1)\pi+1] \\ &=n\pi(n\pi-\pi+1) \\ &=n^2\pi^2 - n\pi^2 + n\pi \end{aligned}\]

  1. \(q = (1 - \pi)\)
  2. Let \(y = x - 1\) and \(n = m + 1\)
    \(\Rightarrow\) \(x = y + 1\) and \(m = n - 1\)
  3. \(\sum\limits_{y=0}^{m}y{m \choose y}\pi^yq^{m-y}\) has the form of the expected value of \(Y\), and \(E(Y)=m\pi=(n-1)\pi\). \(\sum\limits_{y=0}^{m}{m \choose y}\pi^yq^{m-y}\) is the sum of all probabilities over the domain of \(Y\), which is 1.

\[\begin{aligned} \mu &= E(X) \\ &= n\pi \\ \\ \\ \sigma^2 &= E(X^2) - E(X)^2 \\ &= n^2\pi^2 - n\pi^2 + n\pi - n^2\pi^2 \\ &= -n\pi^2 + n\pi \\ &= n\pi(-\pi+1) \\ &= n\pi(1-\pi) \end{aligned}\]
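
Both moments can be checked numerically against the closed forms; an illustrative sketch (parameter values are arbitrary):

```python
from math import comb

n, pi_ = 10, 0.4
pmf = [comb(n, x) * pi_**x * (1 - pi_)**(n - x) for x in range(n + 1)]
ex  = sum(x * p for x, p in enumerate(pmf))        # E(X)
ex2 = sum(x**2 * p for x, p in enumerate(pmf))     # E(X^2)
print(ex2 - ex**2, n * pi_ * (1 - pi_))            # both 2.4, up to rounding
```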

4.4 Moment Generating Function

\[ \begin{aligned} M_X(t) &= E(e^{tX})=\sum\limits_{x=0}^{n}e^{tx}p(x) \\ &= \sum\limits_{x=0}^{n}e^{tx}{n\choose x}\pi^x(1-\pi)^{n-x} \\ &= \sum\limits_{x=0}^{n}{n\choose x}e^{tx}\pi^x(1-\pi)^{n-x} \\ &= \sum\limits_{x=0}^{n}{n\choose x}(\pi e^{t})^x(1-\pi)^{n-x} \\ ^{[1]} &= [(1-\pi)+\pi e^t]^n \end{aligned} \]

  1. By the Binomial Theorem (6.1.2), \(\sum\limits_{x=0}^n{n\choose x}a^xb^{n-x}=(a+b)^n\)

\[ \begin{aligned} M_X^{(1)}(t) &= n[(1 - \pi) + \pi e^t] ^ {n - 1} \pi e^t\\ \\ M_X^{(2)}(t) &= n[(1-\pi) + \pi e^t] ^ {n-1} \pi e^t + n(n-1)[(1-\pi) + \pi e^t] ^ {n-2}(\pi e^t)^2\\ &= n\pi e^t[(1-\pi) + \pi e^t] ^ {n-1} + n(n-1)\pi^2 e^{2t}[(1-\pi) + \pi e^t] ^ {n-2}\\ \\ \\ E(X) &= M_X^{(1)}(0) \\ &= n[(1-\pi)+\pi e^0]^{n-1}\pi e^0 \\ &= n[1-\pi+\pi]^{n-1}\pi \\ &= n(1)^{n-1}\pi \\ &= n\pi\\ \\ \\ E(X^2) &= M_X^{(2)}(0) \\ &= n\pi e^0 [(1-\pi) + \pi e^0]^{n-1} + n(n-1) \pi^2 e^{2\cdot0}[(1-\pi) + \pi e^0]^{n-2} \\ &= n\pi(1-\pi+\pi)^{n-1}+n(n-1)\pi^2(1-\pi+\pi)^{n-2} \\ &= n\pi (1)^{n-1} + n(n-1) \pi^2 (1)^{n-2} \\ &= n\pi+n(n-1)\pi^2 \\ &= n\pi+(n^2-n)\pi^2 \\ &= n\pi + n^2\pi^2 - n\pi^2 \\ \\ \\ \mu &= E(X) \\ &= n\pi \\ \\ \\ \sigma^2 &= E(X^2) - E(X)^2 \\ &= n\pi + n^2\pi^2 - n\pi^2 - n^2\pi^2 \\ &= n\pi - n\pi^2\\ &= n\pi(1-\pi) \end{aligned}\]
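
The differentiation can also be verified symbolically; a sketch using the sympy library (assumed available), with the symbol `pi` standing in for the parameter \(\pi\):

```python
import sympy as sp

t, n, p = sp.symbols('t n pi', positive=True)      # p stands in for pi
M = ((1 - p) + p * sp.exp(t)) ** n                 # the mgf derived above

EX  = sp.simplify(sp.diff(M, t).subs(t, 0))                 # M'(0)  = E(X)
EX2 = sp.expand(sp.simplify(sp.diff(M, t, 2).subs(t, 0)))   # M''(0) = E(X^2)
var = sp.factor(EX2 - sp.expand(EX**2))                     # E(X^2) - E(X)^2

print(EX)    # n*pi
print(EX2)   # n**2*pi**2 - n*pi**2 + n*pi (term order may vary)
print(var)   # n*pi*(1 - pi), possibly printed as -n*pi*(pi - 1)
```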

4.5 Maximum Likelihood Estimator

Since \(n\) is fixed in each Binomial experiment, and must therefore be given, it is unnecessary to develop an estimator for \(n\). The mean and variance can both be estimated from the single parameter \(\pi\).

Let \(X\) be a Binomial random variable with parameters \(n\) and \(\pi\), arising from the \(n\) trial outcomes \((x_1,x_2,\ldots,x_n)\), where \(x_i=0\) for a failure and \(x_i=1\) for a success. In other words, \(X\) is the sum of \(n\) Bernoulli trials with equal probability of success, and \(X=\sum\limits_{i=1}^{n}x_i\).

4.5.1 Likelihood Function

\[\begin{aligned} L(\theta) &= L(x_1,x_2,\ldots,x_n|\theta) \\ &= P(x_1|\theta) P(x_2|\theta) \cdots P(x_n|\theta) \\ &= [\theta^{x_1}(1-\theta)^{1-x_1}] [\theta^{x_2}(1-\theta)^{1-x_2}] \cdots [\theta^{x_n}(1-\theta)^{1-x_n}]\\ &= \theta^{\sum\limits_{i=1}^{n}x_i}(1-\theta)^{n-\sum\limits_{i=1}^{n}x_i} \\ &= \theta^X(1-\theta)^{n-X} \end{aligned}\]

4.5.2 Log-likelihood Function

\[\begin{aligned} \ell(\theta) &= \ln L(\theta) \\ &= \ln\big(\theta^X(1-\theta)^{n-X}\big) \\ &= X\ln(\theta)+(n-X)\ln(1-\theta) \end{aligned}\]

4.5.3 MLE for \(\pi\)

\[\begin{aligned} \frac{d\ell(\pi)}{d\pi} &= \frac{X}{\pi}-\frac{n-X}{1-\pi} \\ \\ \\ 0 &= \frac{X}{\pi}-\frac{n-X}{1-\pi} \\ \Rightarrow \frac{X}{\pi} &= \frac{n-X}{1-\pi} \\ \Rightarrow (1-\pi)X &= \pi(n-X) \\ \Rightarrow X-\pi X &= n\pi-\pi X \\ \Rightarrow X &= n\pi \\ \Rightarrow \frac{X}{n} &= \pi \end{aligned}\]

So \(\displaystyle \hat \pi = \frac{X}{n} = \frac{1}{n}\sum\limits_{i=1}^{n}x_i\) is the maximum likelihood estimator for \(\pi\).
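
A brief simulation sketch of the estimator, with hypothetical names and an assumed true value \(\pi=0.4\):

```python
import random

random.seed(1)
pi_true, n = 0.4, 10_000
x = sum(1 for _ in range(n) if random.random() < pi_true)  # X = number of successes
pi_hat = x / n                                             # MLE: X / n
print(pi_hat)                                              # close to 0.4
```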

4.6 Theorems for the Binomial Distribution

4.6.1 Validity of the Distribution

\[\begin{aligned} \sum\limits_{x=0}^n{n\choose x}\pi^x(1-\pi)^{n-x} = 1 \end{aligned}\]

Proof:

\[\begin{aligned} \sum\limits_{x=0}^n {n\choose x} \pi^x (1-\pi)^{n-x} \ ^{[1]} &= \big(\pi + (1-\pi)\big)^n \\ &= (1)^n \\ &= 1 \end{aligned}\]

  1. By the Binomial Theorem (6.1.2), \(\sum\limits_{x=0}^n{n\choose x}a^xb^{n-x}=(a+b)^n\)
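
A one-line numerical check of this identity (illustrative only, with arbitrary parameters):

```python
from math import comb

n, pi_ = 10, 0.4
print(sum(comb(n, x) * pi_**x * (1 - pi_)**(n - x) for x in range(n + 1)))
# 1.0 up to floating-point error
```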

4.6.2 Sum of Binomial Random Variables

Let \(X_1,X_2,\ldots,X_k\) be independent random variables where \(X_i\) comes from a Binomial distribution with parameters \(n_i\) and \(\pi\). That is, \(X_i\sim\)Binomial\((n_i,\pi)\).

Let \(Y = \sum\limits_{i=1}^{k} X_i\). Then \(Y\sim\)Binomial\((\sum\limits_{i=1}^{k}n_i,\pi)\).

Proof:

\[\begin{aligned} M_Y(t) &= E(e^{tY}) \\ &= E(e^{t(X_1 + X_2 + \cdots + X_k)}) \\ &= E(e^{tX_1} e^{tX_2} \cdots e^{tX_k}) \\ &= E(e^{tX_1}) E(e^{tX_2}) \cdots E(e^{tX_k}) \\ &= \prod\limits_{i=1}^{k} [(1-\pi)+\pi e^t]^{n_i} \\ &= [(1-\pi)+\pi e^t]^{\sum\limits_{i=1}^{k}n_i} \end{aligned}\]

This is the mgf of a Binomial random variable with parameters \(\sum\limits_{i=1}^{k}n_i\) and \(\pi\).
Thus \(Y\sim\)Binomial\((\sum\limits_{i=1}^{k}n_i,\pi)\).
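
The theorem can also be checked exactly by convolving pmfs, since the pmf of a sum of independent variables is the convolution of their pmfs; a sketch with assumed parameters \(n_1=3\), \(n_2=5\), \(\pi=0.4\):

```python
from math import comb

def pmf(n, pi_):
    return [comb(n, x) * pi_**x * (1 - pi_)**(n - x) for x in range(n + 1)]

def convolve(p, q):
    """pmf of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

pi_ = 0.4
summed = convolve(pmf(3, pi_), pmf(5, pi_))  # X1 ~ Binomial(3, .4) plus X2 ~ Binomial(5, .4)
direct = pmf(8, pi_)                         # Binomial(3 + 5, .4)
print(max(abs(a - b) for a, b in zip(summed, direct)))  # ~0: the pmfs agree
```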

4.6.3 Sum of Bernoulli Random Variables

Let \(X_1,X_2,\ldots,X_n\) be independent and identically distributed random variables from a Bernoulli distribution with parameter \(\pi\). Let \(Y = \sum\limits_{i=1}^{n}X_i\).
Then \(Y\sim\)Binomial\((n,\pi)\).

Proof: \[\begin{aligned} M_Y(t) &= E(e^{tY}) \\ &= E(e^{tX_1} e^{tX_2} \cdots e^{tX_n}) \\ &= E(e^{tX_1}) E(e^{tX_2}) \cdots E(e^{tX_n}) \\ &= (\pi e^t+(1-\pi))(\pi e^t+(1-\pi))\cdots (\pi e^t+(1-\pi)) \\ &= (\pi e^t+(1-\pi))^n \end{aligned}\]

This is the mgf of a Binomial random variable with parameters \(n\) and \(\pi\). Thus, \(Y\sim\) Binomial\((n,\pi)\).
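
A simulation sketch of this result: summing \(n\) Bernoulli draws repeatedly and comparing the empirical frequencies to the Binomial pmf (the seed and sample sizes are arbitrary choices):

```python
import random
from math import comb

random.seed(2)
n, pi_, reps = 10, 0.4, 100_000
counts = [0] * (n + 1)
for _ in range(reps):
    y = sum(1 for _ in range(n) if random.random() < pi_)  # sum of n Bernoulli(pi) draws
    counts[y] += 1

for y in range(n + 1):
    exact = comb(n, y) * pi_**y * (1 - pi_)**(n - y)
    print(y, round(counts[y] / reps, 4), round(exact, 4))  # empirical vs Binomial pmf
```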