Distribution of the Mean from a Normal Distribution

Here we will consider sampling a normal distribution $n$ mutually independent times, and consider the density function for the mean of the sample. That is, let $X$ be a normal random variable with density function $Z_{\mu,\sigma}$, and let $X_1, X_2, \ldots, X_n$ be $n$ mutually independent samples of $X$. Then
$$\bar{X} = \frac{1}{n}(X_1 + \cdots + X_n)$$
will be the random variable denoting the mean of the sample.

For the purposes of determining the density function of $\bar{X}$, we will first determine its moment generating function.

We already know how to compute the moment generating function for a linear combination of independent random variables. That is, we have seen that if $X_1, \ldots, X_n$ are mutually independent, then
$$M_{a_1X_1 + \cdots + a_nX_n}(t) = M_{a_1X_1}(t)\,M_{a_2X_2}(t)\cdots M_{a_nX_n}(t) = M_{X_1}(a_1t)\,M_{X_2}(a_2t)\cdots M_{X_n}(a_nt).$$
Thus
$$M_{\bar{X}}(t) = M_{\frac{1}{n}X_1}(t)\,M_{\frac{1}{n}X_2}(t)\cdots M_{\frac{1}{n}X_n}(t) = M_{X_1}\!\left(\frac{t}{n}\right)M_{X_2}\!\left(\frac{t}{n}\right)\cdots M_{X_n}\!\left(\frac{t}{n}\right).$$
Since the random variables $X_1, \ldots, X_n$ all share the same density function, this gives
$$M_{\bar{X}}(t) = \left(M_X\!\left(\frac{t}{n}\right)\right)^{\!n}.$$

We have already determined the moment generating function for the normal random variable $X$ with density $Z_{\mu,\sigma}$ to be
$$M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$$
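
As a sanity check, Maple can recover this formula by integrating $e^{tx}$ against the density of $Z_{\mu,\sigma}$. This is only a sketch; the assumptions on $\sigma$ and $t$ are needed for the Gaussian integral to evaluate in closed form.

MAPLE code:
M := int(exp(t*x)*exp(-(x - mu)^2/(2*sigma^2))/(sigma*sqrt(2*Pi)), x = -infinity .. infinity) assuming sigma > 0, t::real;
simplify(M);   # should return exp(mu*t + 1/2*sigma^2*t^2)
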
This allows us to say that
$$M_{\bar{X}}(t) = \left(e^{\mu\frac{t}{n} + \frac{1}{2}\sigma^2\left(\frac{t}{n}\right)^2}\right)^{\!n} = e^{\mu t + \frac{1}{2}\left(\frac{\sigma^2}{n}\right)t^2}.$$
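
The exponent algebra in this last step can be spot-checked in Maple for a concrete sample size, say $n = 4$ (a sketch; the value of $n$ is an arbitrary choice for illustration).

MAPLE code:
n := 4:
A := exp(mu*t/n + 1/2*sigma^2*(t/n)^2)^n:
B := exp(mu*t + 1/2*(sigma^2/n)*t^2):
simplify(A - B);   # returns 0, confirming the two expressions agree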

The special thing about this is that the resulting moment generating function looks a lot like that of $Z_{\mu,\sigma}$. In fact, it is precisely the moment generating function of $Z_{\mu,\sigma/\sqrt{n}}$. We have already noted that the moment generating function for a random variable completely determines the probability density function of that random variable. Consequently $\bar{X}$ has density $Z_{\mu,\sigma/\sqrt{n}}$. That is, $\bar{X}$ is also a normal random variable, with the same mean as $X$, but a smaller standard deviation: $\sigma/\sqrt{n}$ rather than $\sigma$.
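
This concentration of $\bar{X}$ about $\mu$ can also be observed empirically. The following is a sketch using Maple's Statistics package to draw many sample means and estimate their standard deviation; the sample size and trial count here are arbitrary choices for illustration.

MAPLE code:
with(Statistics):
n := 25:  trials := 1000:
# each entry of the vector is the mean of n independent draws from Z[0,1]
means := Vector(trials, i -> Mean(Sample(Normal(0, 1), n))):
StandardDeviation(means);   # should be close to sigma/sqrt(n) = 1/sqrt(25) = 0.2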

Here are graphs of the density functions for $Z_{1,2}$, $Z_{0,1}$, and $Z_{-1,1/2}$. Notice how the density function with smaller standard deviation has a graph much more concentrated about the mean.

[Figure: graphs of the three densities. As $\sigma$ increases, the graph of a normal density gets wider and shorter; as $\sigma$ decreases, it compresses horizontally and stretches vertically about the mean.]
MAPLE code: with(plots): display(plot([exp(-1/2*x^2)/sqrt(2*Pi), 1/(2*sqrt(2*Pi))*exp(-1/2*((x - 1)/2)^2), exp(-1/2*((x + 1)/(1/2))^2)/(1/2*sqrt(2*Pi))], x = -4 .. 4, color = [blue, red, green], thickness = 2, view = [-4 .. 4, 0 .. 1]), textplot([[-1.8, 0.7, `Z__-1,0.5`], [0.8, 0.4, `Z__0,1`], [2.5, 0.2, `Z__1,2`]]))

The above observation regarding the average of a sample from a normal random variable can be seen as a generalization of an easier fact.

Sum of Independent Normal Random Variables

Fact: Let $X_1$ and $X_2$ be two independent normal random variables, with means $\mu_1$ and $\mu_2$, and variances $\sigma_1^2$ and $\sigma_2^2$, respectively. Then
$$X_1 + X_2$$
is a normal random variable, with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$.

This can be seen by following the logic above. The moment generating functions for $X_1$ and $X_2$ are
$$M_{X_1}(t) = e^{\mu_1 t + \frac{1}{2}\sigma_1^2 t^2} \qquad\text{and}\qquad M_{X_2}(t) = e^{\mu_2 t + \frac{1}{2}\sigma_2^2 t^2},$$
respectively.   Thus
$$M_{X_1+X_2}(t) = M_{X_1}(t)\,M_{X_2}(t) = e^{\mu_1 t + \frac{1}{2}\sigma_1^2 t^2}\, e^{\mu_2 t + \frac{1}{2}\sigma_2^2 t^2} = e^{(\mu_1+\mu_2)t + \frac{1}{2}(\sigma_1^2+\sigma_2^2)t^2}$$
is the moment generating function for $Z_{\mu_1+\mu_2,\,\sqrt{\sigma_1^2+\sigma_2^2}}$. That is, $X_1+X_2$ is a normal random variable with mean $\mu_1+\mu_2$ and variance $\sigma_1^2+\sigma_2^2$.
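
One can let Maple do the bookkeeping of combining the two exponentials (a sketch; mu1, sigma1, and so on are symbolic placeholders):

MAPLE code:
combine(exp(mu1*t + 1/2*sigma1^2*t^2)*exp(mu2*t + 1/2*sigma2^2*t^2), exp);
# returns exp(mu1*t + 1/2*sigma1^2*t^2 + mu2*t + 1/2*sigma2^2*t^2), which is
# exp((mu1 + mu2)*t + 1/2*(sigma1^2 + sigma2^2)*t^2) after collecting powers of t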

Here are a couple of computational examples.

Example: A standard normal random variable $X$ (with density $Z_{0,1}$) takes values in the interval $[-1.96, 1.96]$, symmetric about its mean, with $95\%$ probability. If $\bar{X}$ is the sample mean of twenty-five samples of $X$, find the interval symmetric about its mean within which $\bar{X}$ takes $95\%$ of its values.

Since $X$ and $\bar{X}$ both have the same mean, $\mu_X = \mu_{\bar{X}} = 0$, $\bar{X}$ has density function $Z_{0,1/\sqrt{25}} = Z_{0,0.2}$. This takes $95\%$ of its values in the symmetric interval $\left[-\frac{1.96}{5}, \frac{1.96}{5}\right] = [-0.392, 0.392]$.
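
Both the endpoint and the coverage can be checked numerically, using the identity $P(X_{0,1} < z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$ (a sketch):

MAPLE code:
evalf(1.96/sqrt(25));     # 0.392, the right endpoint of the interval
evalf(erf(1.96/sqrt(2))); # about 0.9500, the two-sided coverage of [-1.96, 1.96]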

Example: If $X$ is a standard normal random variable, how large must a sample be to ensure that $\bar{X}$ takes $95\%$ of its values in $[-0.15, 0.15]$?

For a sample of size $n$, $\bar{X}$ will take $95\%$ of its values in $\left[-\frac{1.96}{\sqrt{n}}, \frac{1.96}{\sqrt{n}}\right]$. Thus $[-0.15, 0.15]$ will hold at least $95\%$ of its values if
$$\frac{1.96}{\sqrt{n}} \leq 0.15.$$
That is,
$$\frac{1.96}{0.15} \leq \sqrt{n},$$
or
$$\left(\frac{1.96}{0.15}\right)^{\!2} = 170.74 \leq n.$$
That is, we need a sample size of greater than $170$ to ensure that our sample mean will be in $[-0.15, 0.15]$ with probability at least $0.95$.
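
The arithmetic is easy to verify directly (a sketch):

MAPLE code:
evalf((1.96/0.15)^2);   # 170.7378..., the lower bound on n
ceil((1.96/0.15)^2);    # 171, the smallest admissible sample size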

Here are a couple of ways that this might be used.

Example: Experience has shown that the melting point of a certain plastic is described by a normal distribution with mean $85^{\circ}$C and standard deviation $4^{\circ}$C. Twenty samples from a new production process are tested and found to have an average melting point of $82.2^{\circ}$C. What is the probability that a sample of this size from the original process will have an average melting point less than $82.5^{\circ}$C? Does the new process yield product as stable as the original process at temperatures near $80^{\circ}$C?

We note that
$$P\left(X_{85,\,4/\sqrt{20}} < 82.5\right) = P\left(X_{0,\,2/\sqrt{5}} < -2.5\right) = P\left(X_{0,1} < -2.5\cdot\frac{\sqrt{5}}{2}\right) \approx 0.0026.$$
That is, there is less than a $0.3\%$ chance that a sample of this size from the original process would exhibit such a low average melting point.
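
This probability can be reproduced using the closed form of the standard normal cumulative distribution in terms of erf (a sketch):

MAPLE code:
z := evalf(-2.5/(4/sqrt(20)));     # the standardized value, about -2.795
evalf(1/2 + 1/2*erf(z/sqrt(2)));   # about 0.0026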

A reasonable conclusion is that the new process does not yield a plastic which is as stable as the old, at least when temperatures get above $80^{\circ}$C.

Example: In the preceding example, the mean melting point of plastics produced by the new process needs to be determined to within $0.1^{\circ}$C with $99\%$ accuracy. Assuming that the standard deviation for the new process is $4^{\circ}$C, just as for the old process, how large of a sample is needed to obtain such accuracy?

We address the following mathematical question: for how large of an $n$ will $\bar{X}$ take $99\%$ of its values in $[\mu - 0.1, \mu + 0.1]$? Since $\bar{X}$ has density $Z_{\mu,\,4/\sqrt{n}}$, this is asking how large to make $n$ so that
$$P\left(\mu - 0.1 < Z_{\mu,\,4/\sqrt{n}} < \mu + 0.1\right) \geq 0.99.$$
That is,
$$P\left(-0.1 < Z_{0,\,4/\sqrt{n}} < 0.1\right) \geq 0.99,$$
or
$$P\left(-\frac{\sqrt{n}}{40} < Z_{0,1} < \frac{\sqrt{n}}{40}\right) \geq 0.99.$$
Since $P(-2.58 < Z_{0,1} < 2.58) \geq 0.99$, we need
$$\frac{\sqrt{n}}{40} \geq 2.58,$$
or
$$n \geq (40 \cdot 2.58)^2 = 10650.24.$$
Thus we need a sample size of greater than $10650$ to ensure that our sample mean will be within $0.1^{\circ}$C of its actual mean.
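
Once again the cutoff and the sample-size bound are easy to check (a sketch; $2.58$ is the usual two-sided $99\%$ cutoff for $Z_{0,1}$):

MAPLE code:
evalf(erf(2.58/sqrt(2)));   # about 0.9901, so P(-2.58 < Z[0,1] < 2.58) exceeds 0.99
evalf((40*2.58)^2);         # 10650.24, the lower bound on n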

These examples should be compared with those encountered when we looked at Chebyshev’s theorem.   In both cases, more control over the standard deviation of a probability density gives us tighter spread about the mean.