Normal approximation for the Binomial distribution

The probability mass function of the Binomial distribution is

P(X=x) = f(x) = {n \choose x}p^{x}(1-p)^{n-x}; x = 0, 1, 2, ... n

Let q = 1 - p, so that

 f(x) = \frac{n!}{(n-x)!x!}p^{x}q^{n-x} \hspace{5} \cdots (1)

Stirling’s approximation for n!:

 n! \approx n^{n}e^{-n}\sqrt{2 \pi n} \hspace{5} \cdots (2)
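
As a quick numerical sanity check of eq (2), the following Python sketch compares Stirling's formula with the exact factorial. The helper names `stirling` and `stirling_ratio` are illustrative and not part of the derivation; the direct power `n ** n` is only safe for moderate n (a log-based version would be needed for much larger values).

```python
import math

def stirling(n: int) -> float:
    """Stirling's approximation: n! ~ n^n * e^(-n) * sqrt(2*pi*n)."""
    return n ** n * math.exp(-n) * math.sqrt(2 * math.pi * n)

def stirling_ratio(n: int) -> float:
    """Ratio of the approximation to the exact factorial (tends to 1)."""
    return stirling(n) / math.factorial(n)

for n in (1, 5, 10, 50, 100):
    print(n, stirling_ratio(n))
```

The ratio approaches 1 from below as n grows, which is why the approximation is applied to n!, x! and (n-x)! only when all three arguments are large.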

Substituting eq (2) for each of the three factorials in eq (1) gives the approximation

 f(x) \approx \frac{n^{n}e^{-n}\sqrt{2 \pi n}}{(n-x)^{(n-x)}e^{-(n-x)}\sqrt{2 \pi (n-x)}x^{x}e^{-x}\sqrt{2 \pi x}}p^{x}q^{n-x} \hspace{5} \cdots (3)

f(x) = (\frac{p}{x})^{x} (\frac{q}{n-x})^{n-x} n^{n} \frac{1}{e^{n}} \sqrt{2 \pi} \sqrt{n} \frac{1}{\sqrt{2 \pi (n-x)}} \frac{e^{n}}{e^{x}} e^{x} \frac{1}{\sqrt{2 \pi x}} \hspace{5} \cdots (4)

f(x) =(\frac{p}{x})^{x} (\frac{q}{n-x})^{n-x} n^{n} \sqrt{\frac{n}{2 \pi x (n-x)}} \hspace{5} \cdots (5)

f(x) =(\frac{p}{x})^{x} (\frac{q}{n-x})^{n-x} n^{n} \frac{n^{x}}{n^{x}} \sqrt{\frac{n}{2 \pi x (n-x)}} \hspace{5} \cdots (6)

f(x) =(\frac{np}{x})^{x} (\frac{nq}{n-x})^{n-x} \sqrt{\frac{n}{2 \pi x (n-x)}} \hspace{5} \cdots (7)
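
Before simplifying further, it can help to confirm numerically that equation (7) tracks the exact Binomial probabilities. The sketch below is a minimal check under illustrative values (n = 100, p = 0.3); the function names `binom_pmf` and `eq7` are hypothetical helpers, and `eq7` is only defined for 0 < x < n because x and n - x appear in denominators.

```python
import math

def binom_pmf(x: int, n: int, p: float) -> float:
    """Exact Binomial probability from equation (1)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def eq7(x: int, n: int, p: float) -> float:
    """Stirling-based form in equation (7); requires 0 < x < n."""
    q = 1 - p
    term1 = (n * p / x) ** x * (n * q / (n - x)) ** (n - x)
    term2 = math.sqrt(n / (2 * math.pi * x * (n - x)))
    return term1 * term2

n, p = 100, 0.3
for x in (20, 30, 40):
    print(x, binom_pmf(x, n, p), eq7(x, n, p))
```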

Equation (7) is a product of two factors, which we will call Term 1 and Term 2.

Term 1:  (\frac{np}{x})^{x} (\frac{nq}{n-x})^{n-x}

Term 2:  \sqrt{\frac{n}{2 \pi x (n-x)}}

Let’s work with Term 1 and reduce it further.

 ln \Big ( (\frac{np}{x})^{x} (\frac{nq}{n-x})^{n-x} \Big )  = x ln (\frac{np}{x}) + (n-x) ln (\frac{nq}{n-x})

We need simplifications and series expansions.

Let x = np + c. Recall that np is the expected value of the Binomial distribution, so c is simply the deviation of x from the mean of the distribution.

 ln(\frac{np}{x}) = ln(\frac{np}{np + c}) = ln(np) - ln(np + c) = - \Big ( ln(np + c) - ln(np) \Big ) = -ln(\frac{np + c}{np})  = - ln(1 + \frac{c}{np})

Now use the series expansion of the logarithm up to the second term, ln(1 + u) \approx u - \frac{u^{2}}{2}, which holds for small |u|. Here u = \frac{c}{np}.

Using this, we can represent  ln(\frac{np}{x}) as  -(\frac{c}{np} - \frac{c^{2}}{2n^{2}p^{2}})

In a similar fashion, we can reduce the second log term  ln (\frac{nq}{n-x}) as follows.

 ln(\frac{nq}{n-x}) = ln(nq) - ln(n-x)

 = -(ln(n - x) - ln(nq))

 = -(ln(n - (np+c)) - ln(nq))

 = -(ln(n(1-p)-c) - ln(nq))

 = -(ln(nq-c) - ln(nq))

 = -(ln(\frac{nq-c}{nq}))

 = -ln(1 + (\frac{-c}{nq}))

Using the same second-order expansion of the logarithm, now with \frac{-c}{nq} in place of \frac{c}{np}, we can write

ln (\frac{nq}{n-x}) = -(\frac{-c}{nq} - \frac{c^{2}}{2n^{2}q^{2}})
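
The two logarithm approximations can be checked numerically. This is a small sketch with illustrative values of n, p and c (assumptions, not taken from the derivation); `log1p_second_order` is a hypothetical helper implementing the two-term expansion, compared against Python's `math.log1p`.

```python
import math

def log1p_second_order(u: float) -> float:
    """Two-term expansion: ln(1 + u) ~ u - u^2 / 2."""
    return u - u * u / 2

n, p, c = 1000, 0.3, 20  # illustrative values only
q = 1 - p

# The two arguments that appear in the derivation: c/(np) and -c/(nq).
for u in (c / (n * p), -c / (n * q)):
    exact = math.log1p(u)
    approx = log1p_second_order(u)
    print(f"u={u:+.5f} exact={exact:+.7f} approx={approx:+.7f} error={exact - approx:+.2e}")
```

The error is of order u^3, so the expansion is only trustworthy when c is small compared with both np and nq.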

We can substitute these two approximations in Term 1.

 ln \Big ( (\frac{np}{x})^{x} (\frac{nq}{n-x})^{n-x} \Big )  = x ln (\frac{np}{x}) + (n-x) ln (\frac{nq}{n-x})

 = (np+c)(-(\frac{c}{np} - \frac{c^{2}}{2n^{2}p^{2}})) + (n-(np+c))(-(\frac{-c}{nq} - \frac{c^{2}}{2n^{2}q^{2}}))

 = (np+c)(\frac{c^{2}}{2n^{2}p^{2}} - \frac{c}{np}) + (nq-c)(\frac{c^{2}}{2n^{2}q^{2}} + \frac{c}{nq})

 = \frac{npc^{2}}{2np np} - \frac{npc}{np} + \frac{c^{3}}{2n^{2}p^{2}} - \frac{c^{2}}{np} + \frac{nqc^{2}}{2nq nq} + \frac{nqc}{nq} - \frac{c^{3}}{2n^{2}q^{2}} - \frac{c^{2}}{nq}

 = -\frac{c^{2}}{2np} - \frac{c^{2}}{2nq} -c + c + \frac{c^{3}}{2n^{2}} (\frac{1}{p^{2}} - \frac{1}{q^{2}})

 = -\frac{c^{2}}{2n}(\frac{1}{p}+\frac{1}{q}) + \frac{c^{3}}{2n^{2}} (\frac{1}{p^{2}} - \frac{1}{q^{2}})

 = -\frac{c^{2}(p+q)}{2npq} + \frac{c^{3}}{2n^{2}} (\frac{1}{p^{2}} - \frac{1}{q^{2}})

 \approx -\frac{c^{2}}{2npq}, because the second term vanishes as n \rightarrow \infty when c is of order \sqrt{n}, that is, when x lies within a few standard deviations of the mean
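
To see why the c^{3} term can be dropped, the sketch below evaluates both terms of the expression above for a deviation of k standard deviations, c = k \sqrt{npq}. The values p = 0.3 and k = 2 are illustrative assumptions, and `quadratic_term` and `cubic_term` are hypothetical helper names for the two terms exactly as they appear in the derivation.

```python
import math

def quadratic_term(c: float, n: int, p: float) -> float:
    """First term: -c^2 / (2npq)."""
    q = 1 - p
    return -c ** 2 / (2 * n * p * q)

def cubic_term(c: float, n: int, p: float) -> float:
    """Second term: c^3 / (2n^2) * (1/p^2 - 1/q^2)."""
    q = 1 - p
    return c ** 3 / (2 * n ** 2) * (1 / p ** 2 - 1 / q ** 2)

p, k = 0.3, 2.0  # illustrative: x is two standard deviations above the mean
for n in (100, 10_000, 1_000_000):
    c = k * math.sqrt(n * p * (1 - p))
    print(n, quadratic_term(c, n, p), cubic_term(c, n, p))
```

With c tied to the standard deviation, the quadratic term stays fixed at -k^{2}/2 while the cubic term shrinks like 1/\sqrt{n}.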

So,

 ln \Big ( (\frac{np}{x})^{x} (\frac{nq}{n-x})^{n-x} \Big )  = -\frac{c^{2}}{2npq}

and,

 (\frac{np}{x})^{x} (\frac{nq}{n-x})^{n-x}  = e^{-\frac{c^{2}}{2npq}}

Because x = np + c,

 (\frac{np}{x})^{x} (\frac{nq}{n-x})^{n-x}  = e^{-\frac{(x-np)^{2}}{2npq}}
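
The Term 1 result can be verified by comparing the exact logarithm of Term 1 with the exponent -\frac{(x-np)^{2}}{2npq}. This is a minimal sketch; the values n = 10000, p = 0.3 and the helper names `log_term1` and `gaussian_exponent` are assumptions made for illustration.

```python
import math

def log_term1(x: int, n: int, p: float) -> float:
    """Exact ln of Term 1: x ln(np/x) + (n-x) ln(nq/(n-x))."""
    q = 1 - p
    return x * math.log(n * p / x) + (n - x) * math.log(n * q / (n - x))

def gaussian_exponent(x: int, n: int, p: float) -> float:
    """Approximate exponent: -(x - np)^2 / (2npq)."""
    q = 1 - p
    return -(x - n * p) ** 2 / (2 * n * p * q)

n, p = 10_000, 0.3
for x in (2950, 3000, 3050, 3100):
    print(x, log_term1(x, n, p), gaussian_exponent(x, n, p))
```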

Now, let’s work with Term 2.

Term 2: \sqrt{\frac{n}{2 \pi x (n-x)}}

 = \sqrt{\frac{n}{2 \pi (np+c) (n-(np+c))}}

 = \sqrt{\frac{n}{2 \pi (np+c) (n(1-p)-c)}}

 = \sqrt{\frac{n}{2 \pi (np+c) (nq-c)}}

 = \sqrt{\frac{n}{2 \pi (n^{2}pq -ncp+ncq - c^{2})}}

 = \sqrt{\frac{1}{2 \pi npq - \frac{2 \pi}{n}( ncp - ncq + c^{2})}}

As n \rightarrow \infty, the second term in the denominator becomes negligible compared with 2 \pi npq (for c of order \sqrt{n}), leaving

\sqrt{\frac{1}{2 \pi npq}}
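
A short numerical check of the Term 2 limit, again with illustrative values (n = 10000, p = 0.3) and hypothetical helper names `term2_exact` and `term2_limit`:

```python
import math

def term2_exact(x: int, n: int) -> float:
    """Term 2 as it appears in equation (7)."""
    return math.sqrt(n / (2 * math.pi * x * (n - x)))

def term2_limit(n: int, p: float) -> float:
    """Large-n form: 1 / sqrt(2 * pi * n * p * q)."""
    return 1 / math.sqrt(2 * math.pi * n * p * (1 - p))

n, p = 10_000, 0.3
for x in (2950, 3000, 3050, 3100):
    print(x, term2_exact(x, n), term2_limit(n, p))
```

The two agree exactly at x = np and differ by well under one percent for the nearby values of x shown here.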

Substituting these two terms (approximation for Term 1 and approximation for Term 2) in equation (7), we get

f(x) =(\frac{np}{x})^{x} (\frac{nq}{n-x})^{n-x} \sqrt{\frac{n}{2 \pi x (n-x)}} \approx \frac{1}{\sqrt{2 \pi npq}} e^{-\frac{(x-np)^{2}}{2npq}}\hspace{5} \cdots (8)

For the binomial distribution, the expected value \mu = np and the variance \sigma^{2} = npq.

Using this notation in equation (8), we get an approximation for the Binomial distribution when n is large.

f(x) = \frac{n!}{(n-x)!x!}p^{x}q^{n-x} \approx \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}} \hspace{5} \cdots (9)

or,

f(x) = \frac{n!}{(n-x)!x!}p^{x}q^{n-x} \approx \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{\frac{-1}{2}(\frac{x-\mu}{\sigma})^{2}}  \hspace{5} \cdots (10)

The right-hand side of equation (10) is the probability density function of the normal distribution (the bell shape) with mean \mu = np and variance \sigma^{2} = npq.
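
The final result can be checked end to end by comparing the exact Binomial probabilities with the normal density over all x. The sketch below is one way to do this in plain Python; the choice of p = 0.3, the sample sizes, and the helper names `binom_pmf`, `normal_pdf` and `max_abs_error` are assumptions for illustration. The Binomial probability is computed through `math.lgamma` to avoid overflow for large n.

```python
import math

def binom_pmf(x: int, n: int, p: float) -> float:
    """Exact Binomial probability, computed in log space for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
               + x * math.log(p) + (n - x) * math.log(1 - p))
    return math.exp(log_pmf)

def normal_pdf(x: float, mu: float, sigma2: float) -> float:
    """Normal density with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def max_abs_error(n: int, p: float) -> float:
    """Largest gap between the Binomial pmf and the normal pdf over x = 0..n."""
    mu, sigma2 = n * p, n * p * (1 - p)
    return max(abs(binom_pmf(x, n, p) - normal_pdf(x, mu, sigma2))
               for x in range(n + 1))

for n in (50, 200, 1000):
    print(n, max_abs_error(n, 0.3))
```

The maximum gap shrinks as n grows, consistent with the approximation being a large-n result.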