20 Geometric Distribution

A Geometric random variable may be interpreted as the number of independent Bernoulli trials required to achieve the first success. The support of the distribution is the positive integers. Under this interpretation, if $$x$$ is the total number of trials and $$k$$ is the number of failures preceding the first success, then $$x = k+1$$.

20.1 Probability Mass Function

A random variable $$X$$ is said to have a Geometric Distribution with parameter $$p$$ if its probability mass function is

$p(x) = \left\{ \begin{array}{ll} p(1-p)^{x-1}, & x=1,2,3,\ldots\\ 0, & \text{otherwise} \end{array} \right.$

where $$p$$ is the probability of a success on any given trial and $$x$$ is the number of trials until the first success.
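
As a concrete check, here is a minimal sketch in Python evaluating the pmf directly from the formula (the function name `geom_pmf` is ours, not from any library):

```python
def geom_pmf(x: int, p: float) -> float:
    """pmf of the Geometric Distribution: p * (1-p)**(x-1) on x = 1, 2, 3, ..."""
    if x < 1:
        return 0.0
    return p * (1 - p) ** (x - 1)

# Example: with p = 0.5, the first success on trial 3 requires two failures
# followed by a success, so the probability is 0.5**2 * 0.5 = 0.125.
print(geom_pmf(3, 0.5))  # 0.125
```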

20.2 Cumulative Distribution Function

The cumulative probability of $$x$$ is computed as the $$x^{th}$$ partial sum of the Geometric Series (see 21.1).

\begin{aligned} P(x) &= \sum\limits_{i=1}^{x}p(1-p)^{i-1} \\ &= \frac{p-p(1-p)^{x}}{1-(1-p)} \\ &= \frac{p[1-(1-p)^{x}]}{p} \\ &= 1-(1-p)^{x} \end{aligned}

So

$P(x) = \left\{ \begin{array}{ll} 1-(1-p)^{x},& x=1,2,3,\ldots\\ 0, & \text{otherwise} \end{array} \right.$

A recursive form of the pmf can be derived and has some usefulness in computer applications. With it, one need only initialize the first value; successive probabilities, and by accumulation the cdf, follow, as the sketch after the derivation shows. It is derived as follows:

\begin{aligned} P(X=x+1) &= p(1-p)^x \\ &= (1-p)p(1-p)^{x-1} \\ &= (1-p)P(X=x) \end{aligned}
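
The recursion is convenient in code. Here is a sketch (the function name `geom_table` is ours) that seeds $$P(X=1)=p$$, builds successive pmf values, and checks the running sum against the closed-form cdf above:

```python
def geom_table(p: float, x_max: int):
    """Build pmf and cdf values for x = 1..x_max using the recursion
    P(X = x+1) = (1-p) * P(X = x), seeded with P(X = 1) = p."""
    pmf, cdf = [p], [p]
    for _ in range(x_max - 1):
        pmf.append((1 - p) * pmf[-1])   # recursive step
        cdf.append(cdf[-1] + pmf[-1])   # running sum gives the cdf
    return pmf, cdf

pmf, cdf = geom_table(0.3, 5)
# The running sum agrees with the closed form 1 - (1-p)**x.
for x, c in enumerate(cdf, start=1):
    assert abs(c - (1 - (1 - 0.3) ** x)) < 1e-12
```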

20.3 Expected Values

\begin{aligned} E(X) &= \sum\limits_{x=1}^{\infty}x\cdot p(1-p)^{x-1} \\ &= p\sum\limits_{x=1}^{\infty}x\cdot (1-p)^{x-1} \\ ^{[1]} &= p\sum\limits_{x=1}^{\infty}x\cdot q^{x-1} \\ ^{[2]} &= p\frac{d}{dq}\Big(\sum\limits_{x=1}^{\infty}q^x\Big) \\ &= p\frac{d}{dq}\Big(\sum\limits_{x=1}^{\infty}q\cdot q^{x-1}\Big) \\ ^{[3]} &= p\frac{d}{dq}\Big(\frac{q}{1-q}\Big) \\ ^{[4]} &= p\Big(\frac{(1-q)\cdot 1-q(-1)}{(1-q)^2}\Big) \\ &= p\Big(\frac{1-q+q}{(1-q)^2}\Big) \\ &= p\Big(\frac{1}{(1-q)^2}\Big) \\ ^{[5]} &= p\cdot\frac{1}{p^2} \\ &= \frac{1}{p} \end{aligned}

1. Let $$q = 1 - p$$
2. $$a\cdot x^{a-1}=\frac{d}{dx}(x^a)$$
3. $$\sum\limits_{k=1}^{\infty}ar^{k-1}=\frac{a}{1-r}$$ (Geometric Series 21.1).
4. Quotient Rule for Differentiation:
$$\frac{d}{dx}\Big(\frac{f(x)}{g(x)}\Big) = \frac{f'(x)\cdot g(x)-f(x)\cdot g'(x)}{g^2(x)}$$
5. $$1-p=q \ \ \Rightarrow p=1-q$$
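
A quick numerical check of the result, a sketch that truncates the infinite series at a large cutoff:

```python
# Truncate E(X) = sum of x * p * (1-p)**(x-1) at a large cutoff.
p = 0.25
approx = sum(x * p * (1 - p) ** (x - 1) for x in range(1, 2001))
print(approx, 1 / p)  # both approx. 4.0
```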

\begin{aligned} E[X(X-1)] ^{[1]} &= \sum\limits_{x=2}^{\infty}x(x-1)p(1-p)^{x-1} \\ &= p(1-p)\sum\limits_{x=2}^{\infty}x(x-1)(1-p)^{x-2} \\ ^{[2]} &= pq\sum\limits_{x=2}^{\infty}x(x-1)q^{x-2} \\ ^{[3]} &= pq\frac{d^2}{dq^2}\Big(\sum\limits_{x=2}^{\infty}q^x\Big) \\ ^{[4]} &= pq\frac{d^2}{dq^2}\Big(\frac{q^2}{1-q}\Big) \\ ^{[5]} &= pq\frac{d}{dq}\Big(\frac{2q-q^2}{(1-q)^2}\Big) \\ ^{[6]} &= pq\frac{2}{(1-q)^3} \\ &= \frac{2pq}{(1-q)^3} \\ ^{[7]} &= \frac{2p(1-p)}{p^3} \\ & =\frac{2(1-p)}{p^2} \end{aligned}

1. We start the sum at $$x=2$$ because the term for $$x=1$$ is 0.
2. Let $$q = 1-p$$
3. $$a(a-1)x^{a-2}=\frac{d^2}{dx^2}x^a$$
4. $$\sum\limits_{x=2}^{\infty}q^x = \Big(\sum\limits_{x=1}^{\infty}q^x\Big)-q = \frac{q}{1-q}-q=\frac{q-q(1-q)}{1-q}=\frac{q^2}{1-q}$$ (Geometric Series 21.1, with the first term removed).
5. $$\frac{d}{dx}\Big(\frac{x^2}{1-x}\Big)=\frac{2x(1-x)+x^2}{(1-x)^2}=\frac{2x-x^2}{(1-x)^2}$$ (Quotient Rule).
6. $$\frac{d}{dx}\Big(\frac{2x-x^2}{(1-x)^2}\Big)=\frac{(2-2x)(1-x)^2+2(1-x)(2x-x^2)}{(1-x)^4}=\frac{(2-2x)(1-x)+2(2x-x^2)}{(1-x)^3}=\frac{2}{(1-x)^3}$$
7. See note number 5 of the $$E(X)$$ derivation.

\begin{aligned} \mu &= E(X) \\ &= \frac{1}{p} \\ \\ \\ \sigma^2 &= E(X^2) - E(X)^2 \\ &= E(X^2) - E(X) + E(X) - E(X)^2 \\ &= [E(X^2)-E(X)]+E(X)-E(X)^2 \\ &= E(X^2-X)+E(X)-E(X)^2 \\ &= E[X(X-1)]+E(X)-E(X)^2 \\ &= \frac{2(1-p)}{p^2} + \frac{1}{p} - \frac{1}{p^2} \\ &= \frac{2(1-p)}{p^2} + \frac{p}{p^2} - \frac{1}{p^2} \\ &= \frac{2-2p+p-1}{p^2} \\ &= \frac{1-p}{p^2} \end{aligned}
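
Both moments can be sanity-checked by simulation. A sketch using only Python's standard library, counting Bernoulli trials until the first success (the sampler `draw_geometric` is ours):

```python
import random
import statistics

random.seed(0)

def draw_geometric(p: float) -> int:
    """Count Bernoulli(p) trials until the first success."""
    x = 1
    while random.random() >= p:
        x += 1
    return x

p = 0.2
sample = [draw_geometric(p) for _ in range(100_000)]
print(statistics.mean(sample), 1 / p)               # approx. 5.0
print(statistics.variance(sample), (1 - p) / p**2)  # approx. 20.0
```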

20.4 Moment Generating Function

\begin{aligned} M_X(t) &= E(e^{tX}) \\ &= \sum\limits_{x=1}^{\infty}e^{tx}p(1-p)^{x-1} \\ &= p\sum\limits_{x=1}^{\infty}e^{tx}(1-p)^{x-1} \\ &= pe^t\sum\limits_{x=1}^{\infty}e^{t(x-1)}(1-p)^{x-1} \\ &= pe^t\sum\limits_{x=1}^{\infty}[(1-p)e^t]^{x-1} \\ ^{[1]} &= pe^t\frac{1}{1-(1-p)e^t} \\ &= \frac{pe^t}{1-(1-p)e^t} \end{aligned}

1. $$\sum\limits_{k=1}^{\infty}ar^{k-1}=\frac{a}{1-r}$$, (Geometric Series 21.1). The series converges provided $$(1-p)e^t<1$$, i.e. for $$t<-\ln(1-p)$$.

\begin{aligned} M_X^{(1)}(t) &= \frac{[1-(1-p)e^t]pe^t-pe^t[-(1-p)e^t]}{\big(1-(1-p)e^t\big)^2} \\ &= \frac{pe^t[1-(1-p)e^t+(1-p)e^t]}{\big(1-(1-p)e^t\big)^2} \\ &= \frac{pe^t(1)}{\big(1-(1-p)e^t\big)^2} \\ &= \frac{pe^t}{\big(1-(1-p)e^t\big)^2} \\ \\ \\ M_X^{(2)}(t) &= \frac{\big(1-(1-p)e^t\big)^2pe^t-pe^t\big[2\big(1-(1-p)e^t\big)\big(-(1-p)e^t\big)\big]}{\big(1-(1-p)e^t\big)^4} \\ &= \frac{pe^t\big[\big(1-(1-p)e^t\big)^2+2\big(1-(1-p)e^t\big)(1-p)e^t\big]}{\big(1-(1-p)e^t\big)^4} \\ \\ \\ E(X) &= M_X^{(1)}(0) \\ &= \frac{pe^0}{\big(1-(1-p)e^0\big)^2} \\ &= \frac{p}{\big(1-(1-p)\big)^2} \\ &= \frac{p}{(1-1+p)^2} \\ &= \frac{p}{p^2} \\ &= \frac{1}{p}\\ \\ \\ E(X^2) &= M_X^{(2)}(0) \\ &= \frac{pe^0[(1-(1-p)e^0)^2+2(1-(1-p)e^0)(1-p)e^0]}{(1-(1-p)e^0)^4} \\ &= \frac{p[(1-(1-p))^2+2(1-(1-p))(1-p)]}{(1-(1-p))^4} \\ &= \frac{p[(1-1+p)^2+2(1-1+p)(1-p)]}{(1-1+p)^4} \\ &= \frac{p[p^2+2p(1-p)]}{p^4} \\ &= \frac{p(p^2+2p-2p^2)}{p^4} \\ &= \frac{p(2p-p^2)}{p^4} \\ &= \frac{p^2(2-p)}{p^4} \\ &= \frac{2-p}{p^2} \\ \\ \\ \mu &= E(X) \\ &= \frac{1}{p}\\ \\ \\ \sigma^2 &= E(X^2) - E(X)^2 \\ &= \frac{2-p}{p^2} - \frac{1}{p^2} \\ &= \frac{1-p}{p^2} \end{aligned}
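
The same moments can be recovered numerically by differentiating the mgf at $$t=0$$. A sketch using central finite differences (the step size `h` is an arbitrary choice):

```python
from math import exp

p = 0.4

def mgf(t: float) -> float:
    """Geometric mgf, valid for (1-p)*exp(t) < 1."""
    return p * exp(t) / (1 - (1 - p) * exp(t))

h = 1e-5
m1 = (mgf(h) - mgf(-h)) / (2 * h)            # central difference: M'(0) = E(X)
m2 = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h**2  # second difference: M''(0) = E(X^2)
print(m1, 1 / p)                   # approx. 2.5
print(m2 - m1**2, (1 - p) / p**2)  # approx. 3.75
```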

20.5 Maximum Likelihood Estimator

Let $$x_1, x_2 , \ldots , x_n$$ be a random sample from a Geometric distribution with parameter $$p$$.

20.5.1 Likelihood Function

\begin{aligned} L(p) &= P(x_1,x_2,\ldots,x_n|p) \\ &= \prod\limits_{i=1}^{n}p(1-p)^{x_i-1} \\ &= p^n(1-p)^{\sum\limits_{i=1}^{n}x_i-n} \end{aligned}

20.5.2 Log-likelihood Function

\begin{aligned} \ell(p) &= \ln\Big[p^n(1-p)^{\sum x_i-n}\Big] \\ &= \ln p^n+\ln(1-p)^{\sum x_i-n} \\ &= n\ln p+\Big(\sum\limits_{i=1}^{n}x_i-n\Big)\ln(1-p) \end{aligned}

20.5.3 MLE for $$p$$

\begin{aligned} \frac{d\ell}{dp} &= \frac{n}{p} - \frac{\sum\limits_{i=1}^{n}x_i-n}{1-p}\\ \\ \\ 0 &= \frac{n}{p}-\frac{\sum x_i-n}{1-p}\\ \Rightarrow n(1-p) &= p\Big(\sum x_i-n\Big)\\ \Rightarrow n-np &= p\sum x_i-np\\ \Rightarrow n &= p\sum x_i\\ \Rightarrow p &= \frac{n}{\sum x_i} = \frac{1}{\bar x} \end{aligned}

So $\hat p=\frac{1}{\bar x}$, the reciprocal of the sample mean, is the Maximum Likelihood Estimator for $$p$$.
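
A simulation sketch checking that $$\hat p = 1/\bar x$$ recovers $$p$$, reusing the trial-counting sampler idea from above:

```python
import random

random.seed(0)
p_true, n = 0.3, 50_000

def draw_geometric(p: float) -> int:
    """Count Bernoulli(p) trials until the first success."""
    x = 1
    while random.random() >= p:
        x += 1
    return x

sample = [draw_geometric(p_true) for _ in range(n)]
p_hat = n / sum(sample)  # equivalently, 1 / (sample mean)
print(p_hat)             # close to 0.3
```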

20.6 Theorems for the Geometric Distribution

20.6.1 Validity of the Distribution

$\sum\limits_{x=1}^{\infty}p(1-p)^{x-1} = 1$

Proof:

\begin{aligned} \sum\limits_{x=1}^{\infty}p(1-p)^{x-1} ^{[1]} &= \frac{p}{1-(1-p)} \\ &= \frac{p}{p} \\ &= 1 \end{aligned}

1. $$S=\lim\limits_{k\rightarrow\infty}S_k =\lim\limits_{k\rightarrow\infty}\frac{a-ar^k}{1-r}=\frac{a}{1-r}$$ for $$|r|<1$$ (Geometric Series 21.1)
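
Numerically, the truncated series approaches 1 for any $$p$$ in $$(0,1)$$; a quick sketch:

```python
# Truncating the infinite sum at x = 1000; each total should be 1
# up to a truncation error of (1-p)**1000.
for p in (0.1, 0.5, 0.9):
    total = sum(p * (1 - p) ** (x - 1) for x in range(1, 1001))
    print(p, total)
```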

20.6.2 Sum of Geometric Random Variables

Let $$X_1,X_2,\ldots,X_n$$ be independent random variables from a Geometric Distribution with parameter $$p$$, that is, $$X_i\sim$$Geometric$$(p)$$, $$i=1,2,\ldots,n$$. Let $$Y=\sum\limits_{i=1}^{n}X_i$$. Then $$Y\sim$$Negative Binomial$$(n,p)$$.

Proof:

\begin{aligned} M_Y(t) &= E(e^{tY}) \\ &= E(e^{t(X_1+X_2+\cdots+X_n)}) \\ &= E(e^{tX_1}e^{tX_2}\cdots e^{tX_n}) \\ &= E(e^{tX_1})E(e^{tX_2})\cdots E(e^{tX_n}) \\ &=\frac{pe^t}{1-(1-p)e^t}\cdot\frac{pe^t}{1-(1-p)e^t}\cdot\ \cdots \ \cdot\frac{pe^t}{1-(1-p)e^t} \\ &= \Big(\frac{pe^t}{1-(1-p)e^t}\Big)^n \end{aligned}

The factorization of the expectation into a product uses the independence of the $$X_i$$. The result is the mgf of a Negative Binomial random variable with parameters $$n$$ and $$p$$. Thus $$Y\sim$$Negative Binomial$$(n,p)$$.
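
A simulation sketch of the theorem, assuming scipy is available. Note that scipy's `nbinom` counts failures before the $$n^{th}$$ success rather than total trials, so the sum is shifted by $$n$$ before comparing:

```python
import numpy as np
from scipy import stats

n, p = 5, 0.4
rng = np.random.default_rng(0)

# Sum n independent Geometric(p) trial counts, many times over.
sums = stats.geom.rvs(p, size=(100_000, n), random_state=rng).sum(axis=1)

# scipy's nbinom counts failures before the n-th success, so compare Y - n.
for y in range(n, n + 4):
    empirical = np.mean(sums == y)
    print(y, empirical, stats.nbinom.pmf(y - n, n, p))
```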

20.6.3 Lemma

Let $$X$$ be a Geometric random variable with parameter $$p$$. Then $$P(X>a)=(1-p)^a$$ for any nonnegative integer $$a$$.

Proof:

\begin{aligned} P(X>a) &= 1-P(X\leq a)=1-\sum\limits_{i=1}^{a}p(1-p)^{i-1} \\ ^{[1]} &= 1-\frac{p-p(1-p)^a}{1-(1-p)} \\ &= 1-\frac{p\big(1-(1-p)^a\big)}{1-1+p} \\ &= 1-\frac{p\big(1-(1-p)^a\big)}{p} \\ &= 1-\big(1-(1-p)^a\big) \\ &= 1-1+(1-p)^a \\ &= (1-p)^a \end{aligned}

1. $$S_k=\frac{a-ar^k}{1-r}$$, the $$k^{th}$$ partial sum (Geometric Series 21.1)

20.6.4 Memoryless Property

$P(X> k+j|X> k)=P(X> j)$

Proof:

\begin{aligned} P(X > a+b) ^{[1]} &= (1-p)^{a+b}=(1-p)^a(1-p)^b \\ \\ \\ P(X > k+j|X > k) &=\frac{P(\{X > k+j\}\cap \{X > k\})}{P(X > k)} \\ ^{[2]} &= \frac{P(X > k+j)}{P(X > k)} \\ &= \frac{(1-p)^k(1-p)^j}{(1-p)^k} \\ &= (1-p)^j \\ &= P(X > j) \end{aligned}

1. Geometric Distribution 20.6.3
2. Since $$j$$ is a positive integer, $$\{X>k+j\}\subseteq\{X>k\}$$, so $$\{X>k+j\}\cap\{X>k\}=\{X>k+j\}$$.
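
The memoryless property can be seen directly in simulated data; a sketch comparing the conditional and unconditional tail frequencies (the sampler `draw` is ours):

```python
import random

random.seed(1)
p, k, j = 0.25, 3, 2

def draw(p: float) -> int:
    """Count Bernoulli(p) trials until the first success."""
    x = 1
    while random.random() >= p:
        x += 1
    return x

sample = [draw(p) for _ in range(200_000)]
beyond_k = [x for x in sample if x > k]

cond = sum(x > k + j for x in beyond_k) / len(beyond_k)  # P(X > k+j | X > k)
uncond = sum(x > j for x in sample) / len(sample)        # P(X > j)
print(cond, uncond, (1 - p) ** j)  # all approx. 0.5625
```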