# Distribution of the Mean from a Normal Distribution

Here we consider sampling a normal distribution   $n$   mutually independent times, and determine the density function for the mean of the sample.   That is, let   $X$   be a normal random variable with density function   $\displaystyle{ Z_{\mu,\sigma} }\,$ , and let   $\displaystyle{ X_1 }\,$ , $\,\displaystyle{ X_2 }\,$ , … , $\,\displaystyle{ X_n }$   be   $n$   mutually independent samples of   $X\;$ .   Then
$$\bar{X} =\textstyle{\frac{1}{n}}\left( X_1 + \cdots + X_n \right)$$
will be the random variable denoting the mean of the sample.

For the purposes of determining the density function of   $\bar{X}\,$ , we will first determine its moment generating function.

We already know how to compute the moment generating function for a linear combination of independent random variables.   That is, we have seen that if   $\displaystyle{ X_1 }\,$ , … , $\,\displaystyle{ X_n }$   are mutually independent, then
$$M_{a_1\, X_1 +\cdots +a_n\, X_n}(t) =M_{a_1\, X_1}(t) \cdot M_{a_2\, X_2}(t) \cdot \, \cdots\, \cdot M_{a_n\, X_n}(t) =M_{X_1}(a_1\, t) \cdot M_{X_2}(a_2\, t)\cdot \, \cdots \, \cdot M_{X_n}(a_n\, t) \;\text{.}$$
Thus
$$M_{\bar{X}}(t) =M_{\frac{1}{n}X_1}(t)\cdot M_{\frac{1}{n}X_2}(t)\cdot\,\cdots\,\cdot M_{\frac{1}{n}X_n}(t) =M_{X_1}\left(\frac{t}{n}\right)\cdot M_{X_2}\left(\frac{t}{n}\right)\cdot\,\cdots\,\cdot M_{X_n} \left(\frac{t}{n}\right) \;\text{.}$$
Since the random variables   $\displaystyle{ X_1 }\,$ , … , $\,\displaystyle{ X_n }$   all share the same density function, this gives
$$M_{\bar{X}}(t) =M_{X}\left( \frac{t}{n}\right)^n \;\text{.}$$

We have already determined the moment generating function for the normal random variable   $X$   with density   $\displaystyle{ Z_{\mu,\sigma} }$   to be
$$M_X(t) ={\rm e}^{\mu\, t +\frac{1}{2}\,\sigma^2\, t^2} \;\text{.}$$
This allows us to say that
$$M_{\bar{X}}(t) =\left( {\rm e}^{\mu\frac{t}{n} +\frac{1}{2}\,\sigma^2\left(\frac{t}{n}\right)^2 } \right)^n ={\rm e}^{\mu t +\frac{1}{2}\,\left(\frac{\sigma^2}{n}\right)\, t^2 } \;\text{.}$$
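The identity above can be spot-checked numerically. The following is a minimal sketch, not part of the original notes: the helper `normal_mgf` and the values chosen for `mu`, `sigma`, `n` and `t` are illustrative assumptions.

```python
import math

def normal_mgf(t, mu, sigma):
    """MGF of a normal random variable: exp(mu*t + sigma^2 * t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)

# Illustrative values (assumptions, not taken from the text).
mu, sigma, n, t = 1.0, 2.0, 25, 0.7

# Left side: M_X(t/n)^n, the MGF of the sample mean.
lhs = normal_mgf(t / n, mu, sigma) ** n
# Right side: the MGF of a normal with mean mu and standard deviation sigma/sqrt(n).
rhs = normal_mgf(t, mu, sigma / math.sqrt(n))

print(lhs, rhs)
print(math.isclose(lhs, rhs, rel_tol=1e-12))
```

Agreement at a single point is only a sanity check of the algebra; the derivation itself holds for every $t$.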

The special thing about this is that the resulting moment generating function looks a lot like that of   $\displaystyle{ Z_{\mu,\sigma} }\;$ .   In fact, it is precisely the moment generating function of   $\displaystyle{ Z_{\mu,\frac{\sigma}{\sqrt{n}}} }\;$ .   We have already noted that the moment generating function for a random variable completely determines the probability density function of that random variable.   Consequently   $\bar{X}$   has density   $\displaystyle{ Z_{\mu,\frac{\sigma}{\sqrt{n}}} }\;$ .   That is, $\,\bar{X}$   is also a normal random variable, with the same mean as   $X\,$ , but a smaller standard deviation   –   i.e.   $\displaystyle{ \frac{\sigma}{\sqrt{n}} }$   rather than   $\sigma\;$ .
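A small simulation can make this conclusion concrete. The sketch below assumes illustrative values $\mu = 1$, $\sigma = 2$, $n = 25$ and a fixed random seed, none of which come from the text. It draws many samples of size $n$, records each sample mean, and compares the spread of those means with $\sigma/\sqrt{n}$.

```python
import random
import statistics

mu, sigma, n = 1.0, 2.0, 25   # illustrative values (assumptions)
trials = 20_000               # number of simulated samples of size n
rng = random.Random(0)        # fixed seed so the run is repeatable

# One sample mean per trial, each computed from n independent normal draws.
sample_means = [
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(trials)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = sigma / n ** 0.5

print("mean of sample means:   ", mean_of_means)
print("std dev of sample means:", sd_of_means)
print("predicted sigma/sqrt(n):", predicted_sd)
```

The simulated values should be close to $\mu$ and $\sigma/\sqrt{n}$, though not exactly equal, since they are estimates from finitely many trials.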

Here are graphs of the density functions for   $\displaystyle{ Z_{1,2} }\,$ , $\displaystyle{ Z_{0,1} }\,$ , and   $\displaystyle{ Z_{-1,\frac{1}{2}} }\;$ .   Notice how the density function with the smaller standard deviation has a graph much more concentrated about its mean.

MAPLE code: with(plots): display(plot([exp(-1/2*x^2)/sqrt(2*Pi), 1/(2*sqrt(2*Pi))*exp(-1/2*((x - 1)/2)^2), exp(-1/2*((x + 1)/(1/2))^2)/(1/2*sqrt(2*Pi))], x = -4 .. 4, color = [blue, red, green], thickness = 2, view = [-4 .. 4, 0 .. 1]), textplot([[-1.8, 0.7, "Z[-1,0.5]"], [0.8, 0.4, "Z[0,1]"], [2.5, 0.2, "Z[1,2]"]]))
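The concentration visible in the graphs can also be quantified. As a rough sketch using Python's standard `statistics.NormalDist` class (the width of $1$ for the window about the mean is an arbitrary choice for illustration), the following computes the peak height of each density and the probability of landing within $1$ of the mean.

```python
from statistics import NormalDist

# (mean, standard deviation) pairs for the three plotted densities.
params = [(1, 2), (0, 1), (-1, 0.5)]

results = []
for mu, sigma in params:
    dist = NormalDist(mu, sigma)
    peak = dist.pdf(mu)                               # height of the density at its mean
    within_one = dist.cdf(mu + 1) - dist.cdf(mu - 1)  # P(|X - mu| < 1)
    results.append((mu, sigma, peak, within_one))
    print(f"Z_({mu},{sigma}): peak = {peak:.3f}, P(|X - mu| < 1) = {within_one:.3f}")
```

Both quantities grow as the standard deviation shrinks, which is what the taller, narrower graphs show.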

The above observation regarding the average of a sample from a normal random variable can be seen as a generalization of an easier fact.

## Sum of Independent Normal Random Variables

Fact:   Let   $\displaystyle{ X_1 }$   and   $\displaystyle{ X_2 }$   be two independent normal random variables, with means   $\displaystyle{ \mu_1 }$   and   $\displaystyle{ \mu_2 }\,$ , and variances   $\displaystyle{ \sigma_1^2 }$   and   $\displaystyle{ \sigma_2^2 }\,$ , respectively.   Then
$$X_1 + X_2$$
is a normal random variable, with mean   $\displaystyle{ \mu_1 +\mu_2 }$   and variance   $\displaystyle{ \sigma_1^2 +\sigma_2^2 }\;$ .

This can be seen by following the logic above.   The moment generating functions for   $\displaystyle{ X_1 }$   and   $\displaystyle{ X_2 }$   are
$$M_{X_1} ={\rm e}^{\mu_1 t +\frac{1}{2}\sigma_1^2 t^2} \qquad \text{and} \qquad M_{X_2} ={\rm e}^{\mu_2 t +\frac{1}{2}\sigma_2^2 t^2} \,\text{,}$$
respectively.   Thus
$$\begin{array}{rl} M_{X_1 +X_2} & =M_{X_1} M_{X_2} \\ & \\ & ={\rm e}^{\mu_1 t +\frac{1}{2}\sigma_1^2 t^2} {\rm e}^{\mu_2 t +\frac{1}{2}\sigma_2^2 t^2} \\ & \\ & ={\rm e}^{\left(\mu_1 +\mu_2\right) t +\frac{1}{2} \left(\sigma_1^2 +\sigma_2^2\right) t^2} \end{array}$$
is the moment generating function for $\displaystyle{ Z_{\mu_1 +\mu_2 , \sqrt{\sigma_1^2 +\sigma_2^2}} }\;$ .   That is, $\displaystyle{ X_1 +X_2 }$   is a normal random variable with mean   $\displaystyle{ \mu_1 +\mu_2 }$   and variance   $\displaystyle{ \sigma_1^2 +\sigma_2^2 }\;$ .
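Python's standard `statistics.NormalDist` class implements this same rule when two of its instances are added, treating them as independent. A brief sketch, with parameter values chosen only for illustration:

```python
from statistics import NormalDist

# Illustrative parameters (assumptions): sigma_1 = 3, sigma_2 = 4.
X1 = NormalDist(mu=1.0, sigma=3.0)
X2 = NormalDist(mu=2.0, sigma=4.0)

# Adding two NormalDist objects treats them as independent normal variables.
S = X1 + X2

print(S.mean)      # mu_1 + mu_2
print(S.variance)  # sigma_1^2 + sigma_2^2
print(S.stdev)     # sqrt(sigma_1^2 + sigma_2^2)
```

Note that the variances add, not the standard deviations: here the standard deviation of the sum is $\sqrt{3^2 + 4^2} = 5$, not $7$.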

Here are a couple of computational examples.

## Example: A standard normal random variable, $\,X$   (with density   $\displaystyle{ Z_{0,1} }\,$ ), takes values in the interval   $[-1.96, 1.96]\,$ , symmetric about its mean, with   $95\%$   probability.   If   $\bar{X}$   is the sample mean of twenty-five samples of   $X\,$ , find the interval symmetric about its mean within which   $\bar{X}$   takes   $95\%$   of its values.

By the result above, $\,\bar{X}$   is normal with the same mean as   $X\,$ , namely   $\displaystyle{ \mu_X =\mu_{\bar{X}} =0 }\,$ , and with standard deviation   $\frac{1}{\sqrt{25}} =0.2\;$ .   That is, $\,\bar{X}$   has density function   $\,\displaystyle{ Z_{0,\frac{1}{\sqrt{25}}} =Z_{0,0.2} }\;$ .   Scaling the interval for   $X$   by this factor, $\,\bar{X}$   takes   $95\%$   of its values in the symmetric interval   $\left[ -\frac{1.96}{5}, \frac{1.96}{5} \right] =[-.392, .392]\;$ .
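This interval can be reproduced numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 25
z = NormalDist().inv_cdf(0.975)        # two-sided 95% quantile, about 1.96
xbar = NormalDist(0.0, 1.0 / sqrt(n))  # distribution of the sample mean

half_width = z * xbar.stdev
coverage = xbar.cdf(half_width) - xbar.cdf(-half_width)

print(round(half_width, 3))  # half-width of the symmetric interval
print(round(coverage, 4))    # probability that the sample mean lies inside it
```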

## Example: If   $X$   is a standard normal random variable, how large must a sample be to ensure that   $\bar{X}$   takes   $95\%$   of its values in   $[-.15,.15]\;$ ?

For a sample of size   $n\,$ ,   $\bar{X}$   will take   $95\%$   of its values in   $\left[ -\frac{1.96}{\sqrt{n}}, \frac{1.96}{\sqrt{n}} \right]\;$ .   Thus   $[-.15,.15]$   will hold at least   $95\%$   of its values if
$$\frac{1.96}{\sqrt{n}} \le .15 \;\text{.}$$
That is,
$$\frac{1.96}{.15} \le \sqrt{n} \,\text{,}$$
or
$$\left(\frac{1.96}{.15}\right)^2 \approx 170.74 \le n \;\text{.}$$
That is, we need a sample size of at least   $171$   to ensure that our sample mean will be in   $[-.15,.15]$   with probability at least   $.95\;$ .
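The same calculation can be written as a short function. This is a sketch only: the helper name `min_sample_size` and its parameters are hypothetical, introduced here for illustration.

```python
import math

def min_sample_size(z, sigma, half_width):
    """Smallest integer n with z * sigma / sqrt(n) <= half_width."""
    return math.ceil((z * sigma / half_width) ** 2)

# Values from the example: z = 1.96, sigma = 1, target half-width 0.15.
n = min_sample_size(z=1.96, sigma=1.0, half_width=0.15)
print(n)
```

Rounding up is essential, because the sample size must be a whole number that still satisfies the inequality.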

Here are a couple of ways that this might be used.

## Example: Experience has shown that the melting point of a certain plastic is described by a normal distribution with mean   $85C$   and standard deviation   $4C\;$ .   Twenty samples from a new production process are tested and found to have an average melting point of   $82.2C\;$ .   What is the probability that a sample of this size from the original process will have average melting point less than   $82.5C\;$ ?   Does the new process yield a product as stable as that of the original process at temperatures near   $80C\;$ ?

We note that
$$\begin{array}{rl} P\left( X_{85,\frac{4}{\sqrt{20}}} \lt 82.5\right) & =P\left(X_{0,\frac{2}{\sqrt{5}}} \lt -2.5\right) \\ & \\ & =P\left( X_{0,1} \lt -2.5\cdot \frac{\sqrt{5}}{2} \right) \\ & \\ & \approx P\left( X_{0,1} \lt -2.795 \right) \\ & \\ & \approx 0.0026 \;\text{.} \end{array}$$
Here the second step standardizes by dividing by the standard deviation   $\frac{2}{\sqrt{5}}\;$ .   That is, there is less than a   $0.3\%$   chance that a sample of the same size would exhibit such a low average melting point.
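This probability can be evaluated directly from the distribution of the sample mean. The sketch below assumes only the figures given in the example and uses the standard library's `statistics.NormalDist`. The standardized cutoff is the deviation $82.5 - 85 = -2.5$ divided by the standard deviation of the sample mean, $4/\sqrt{20}$.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85.0, 4.0, 20
xbar = NormalDist(mu, sigma / sqrt(n))   # distribution of the sample mean

z = (82.5 - mu) / xbar.stdev             # standardized cutoff
p = xbar.cdf(82.5)                       # P(sample mean < 82.5)

print(round(z, 3), round(p, 4))
```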

A reasonable conclusion is that the new process does not yield a plastic which is as stable as the old, at least when temperatures get above   $80C\;$ .

## Example: In the preceding example, the mean melting point of plastics produced by the new process needs to be determined to within   $.1C$   with   $99\%$   accuracy.   Assuming that the standard deviation for the new process is   $4C\,$ , just as for the old process, how large a sample is needed to obtain such accuracy?

We address the following mathematical question: how large must   $n$   be for   $\bar{X}$   to take   $99\%$   of its values in   $\left[ \mu -.1, \mu +.1 \right]\;$ ?   Since   $\bar{X}$   has density   $\displaystyle{ Z_{\mu,\frac{4}{\sqrt{n}}} }\,$ , this is asking how large to make   $n$   so that
$$P\left( \mu -.1 \lt Z_{\mu,\frac{4}{\sqrt{n}}} \lt \mu +.1 \right) \ge .99 \;\text{.}$$
That is,
$$P\left( -.1 \lt Z_{0,\frac{4}{\sqrt{n}}} \lt .1 \right) \ge .99 \,\text{,}$$
or
$$P\left( -\frac{\sqrt{n}}{40} \lt Z_{0,1} \lt \frac{\sqrt{n}}{40} \right) \ge .99 \;\text{.}$$
Since   $P\left( -2.58 \lt Z_{0,1} \lt 2.58 \right) \ge .99 \,$ , we need
$$\frac{\sqrt{n}}{40} \ge 2.58 \,\text{,}$$
or
$$n\ge ( 40\cdot 2.58)^2 =10650.24 \;\text{.}$$
Thus we need a sample size of at least   $10651$   to ensure that our sample mean will be within   $.1C$   of the actual mean with probability at least   $.99\;$ .
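As a check on the arithmetic, the following sketch computes the required sample size twice: once with the rounded table value $2.58$ used above, and once with the quantile returned by `statistics.NormalDist`. The variable names are illustrative.

```python
import math
from statistics import NormalDist

sigma, half_width = 4.0, 0.1

z_table = 2.58                          # rounded value used in the text
z_exact = NormalDist().inv_cdf(0.995)   # two-sided 99% quantile

# Smallest integer n with z * sigma / sqrt(n) <= half_width.
n_table = math.ceil((z_table * sigma / half_width) ** 2)
n_exact = math.ceil((z_exact * sigma / half_width) ** 2)

print(n_table, n_exact)
```

The computed quantile is slightly smaller than $2.58$, so it gives a slightly smaller sample size; the rounded value errs on the safe side.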

These examples should be compared with those encountered when we looked at Chebyshev’s theorem.   In both cases, more control over the standard deviation of a probability density gives us tighter spread about the mean. 