$30
Project: Central Limit Theorem
Probability and Statistics with Applications to Computing
0. Introduction and Background Material
0.1. Simulating a R.V. with Uniform Probability Distribution
The Python function "numpy.random.uniform(a,b,n)" will generate n random
numbers with uniform probability distribution in the open interval [a b, ) .
The PDF of a random variable uniformly distributed in [a b, ) is defined as following:
𝑓𝑓(𝑥𝑥) = �
1
(𝑏𝑏−𝑎𝑎)
, 𝑎𝑎 ≤ 𝑥𝑥 ≤ 𝑏𝑏
0, otherwise
; and
0,
( ) ( ) () , ( )
1,
x a
x a PX x Fx a x b
b a
x b
<
− ≤ = = ≤ < −
≥
It is noted that the mean and variance of a uniformly distributed random variable X are
given by:
2
2 ( ) E( ) ; Var( ) 2 12 X X
a b b a X X µ σ
+ − = = = =
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 2
0.2. Simulating a R.V. with Exponential Probability Distribution
The Python function "numpy.random.exponential(beta,n)" will generate n
random numbers with exponential probability distribution.
The PDF of a random variable exponentially distributed is defined as following:
𝑓𝑓𝑇𝑇(𝑡𝑡 ; 𝛽𝛽) = �
1
𝛽𝛽 𝑒𝑒𝑒𝑒𝑒𝑒( − 1
𝛽𝛽 𝑡𝑡), 𝑡𝑡 ≥ 0
0, 𝑡𝑡 < 0
From the above definition, the CDF of T is found as:
0, 0
( ) () 1 1 exp( ), 0
t
PT t Ft
t t
β
< ≤= = −− ≥
It is noted that the mean and standard deviation of the exponentially distributed random
variable T are given by:
𝜇𝜇𝑇𝑇 = 𝛽𝛽 ; 𝜎𝜎𝑇𝑇 = 𝛽𝛽
0.3. Simulating a R.V. with Normal Probability Distribution
The Python function "numpy.random.normal(mu, sigma, n)" will generate n
random numbers from a Gaussian probability distribution (also called normal
probability distribution) with mean 𝜇𝜇𝑋𝑋 = mu and standard deviation 𝜎𝜎𝑋𝑋 = sigma.
The PDF of a normal random variable 𝑋𝑋 with mean 𝜇𝜇𝑋𝑋 and standard deviation 𝜎𝜎𝑋𝑋 is
defined as following:
𝑓𝑓(𝑥𝑥) = 1
𝜎𝜎𝑋𝑋√2𝜋𝜋
exp{ − (𝑥𝑥 − 𝜇𝜇𝑋𝑋)2
2𝜎𝜎𝑋𝑋
2 }
It is noted that the mean and variance of the normally distributed random variable 𝑋𝑋 are
given by:
𝐸𝐸(𝑋𝑋) = 𝜇𝜇𝑋𝑋 ; Var(𝑋𝑋) = 𝜎𝜎𝑋𝑋
2
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 3
0.4. Central Limit Theorem
If 1 2 , , XX X n are independent random variables having the same probability
distribution with mean µ and standard deviation σ , consider the sum
n n 1 2 S XX X =++ .
This sum n S is a random variable with mean n S µ µ = n and standard deviation
n S σ σ= n .
The Central Limit Theorem states that as n → ∞ the probability distribution of the
R.V. n S will approach a normal distribution with mean n µS and standard deviation
n σ S , regardless of the original distribution of the R.V. 1 2 , , XX X n .
It is noted that the PDF of the normally distributed R.V. n S is given by:
2
2
1 ( ) ( ) exp{ } 2 2
n
n n
S
n
S S
x
f s µ
σ π σ
− = −
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 4
PROBLEMS
1. Simulate continuous random variables with selected distributions
1.1 Simulate a Uniform Random Variable.
a) Create a random variable 𝑋𝑋 with a uniform distribution. Use the Python
function "numpy.random.uniform(a,b,n)" to generate n values of the
R.V. 𝑋𝑋 of with uniform probability distribution in the open interval [a b, ) .
b) Use the histogram function to plot a bargraph of the experimental values of
the R.V. 𝑋𝑋. On the same graph plot the probability density function for the
R.V. 𝑋𝑋, given by
𝑓𝑓(𝑥𝑥) = �
1
(𝑏𝑏−𝑎𝑎)
, 𝑎𝑎 ≤ 𝑥𝑥 ≤ 𝑏𝑏
0, otherwise
and compare to the bargraph plot.
c) Calculate the expectation and standard deviation of the R.V. 𝑋𝑋, using the
Python functions numpy.mean and numpy.std . Compare to the theoretical
values given by 𝜇𝜇𝑋𝑋 = 𝑎𝑎+𝑏𝑏
2 ; 𝜎𝜎𝑋𝑋
2 = (𝑏𝑏−𝑎𝑎)2
12
d) Your report should contain the graph require in (b) and the values required
in (c) tabulated in the following table. The graph should be properly labeled.
e) A sample code for (a-c) is given below.
Table 1: Statistics for a Uniform Distribution
Expectation Standard Deviation
Theoretical
Calculation
Experimental
Measurement
Theoretical
Calculation
Experimental
Measurement
Table 1. Calculations for a Uniform distribution
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 5
# The following code provides a way to create the bar graph of a
# uniform probability distribution in the interval [a,b)
# where a=1, b=3.
# The code generates n=10000 values of the random variable.
# This is only a sample code. Your project has different values
# of a and b. You must use the correct values for your project
#
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
# Generate the values of the RV X
a=1; b=3; n=10000;
x=np.random.uniform(a,b,n)
# Create bins and histogram
nbins=30; # Number of bins
edgecolor='w'; # Color separating bars in the bargraph
#
bins=[float(x) for x in np.linspace(a, b,nbins+1)]
h1, bin_edges = np.histogram(x,bins,density=True)
# Define points on the horizontal axis
be1=bin_edges[0:np.size(bin_edges)-1]
be2=bin_edges[1:np.size(bin_edges)]
b1=(be1+be2)/2
barwidth=b1[1]-b1[0] # Width of bars in the bargraph
plt.close('all')
# PLOT THE BAR GRAPH
fig1=plt.figure(1)
plt.bar(b1,h1, width=barwidth, edgecolor=edgecolor)
#PLOT THE UNIFORM PDF
def UnifPDF(a,b,x):
f=(1/abs(b-a))*np.ones(np.size(x))
return f
f=UnifPDF(1,3,b1)
plt.plot(b1,f,'r')
#CALCULATE THE MEAN AND STANDARD DEVIATION
mu_x=np.mean(x)
sig_x=np.std(x)
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 6
1.2 Simulate an Exponentially distributed Random Variable.
a) Create a random variable 𝑇𝑇 with an exponential distribution. Use the Python
function "numpy.random.exponential(beta,n)" to generate n values of
the R.V. 𝑇𝑇 of with exponential probability distribution.
b) Use the histogram function to plot a bargraph of the experimental values of
the R.V. 𝑇𝑇. On the same graph plot the probability density function for the
R.V. 𝑇𝑇, given by
𝑓𝑓𝑇𝑇(𝑡𝑡 ; 𝛽𝛽) = �
1
𝛽𝛽 𝑒𝑒𝑒𝑒𝑒𝑒( − 1
𝛽𝛽 𝑡𝑡), 𝑡𝑡 ≥ 0
0, 𝑡𝑡 < 0
and compare to the bargraph plot.
c) Calculate the expectation and standard deviation of the R.V. 𝑋𝑋, using the
Python functions numpy.mean and numpy.std . Compare to the theoretical
values given by 𝜇𝜇𝑇𝑇 = 𝛽𝛽 ; 𝜎𝜎𝑇𝑇 = 𝛽𝛽
d) Your report should contain the graph require in (b) and the values required
in (c) tabulated in the following table. The graph should be properly labeled.
e) Modify the sample code for (a-c) given previously.
Table 2: Statistics for Exponential Distribution
Expectation Standard Deviation
Theoretical
Calculation
Experimental
Measurement
Theoretical
Calculation
Experimental
Measurement
Table 2. Calculations for Exponential distribution
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 7
1.3 Simulate a Normal Random Variable.
a) Create a random variable 𝑋𝑋 with a normal distribution. Use the Python
function "numpy.random.normal(mu,sigma,n)" to generate n values of
the R.V. 𝑋𝑋 of with normal probability distribution.
b) Use the histogram function to plot a bargraph of the experimental values of
the R.V. 𝑋𝑋. On the same graph plot the probability density function for the
R.V. 𝑋𝑋, given by
𝑓𝑓(𝑥𝑥) = 1
𝜎𝜎√2𝜋𝜋
exp{ − (𝑥𝑥 − 𝜇𝜇)2
2𝜎𝜎2 }
and compare to the bargraph plot.
c) Calculate the expectation and standard deviation of the R.V. 𝑋𝑋, using the
Python functions numpy.mean and numpy.std . Compare to the theoretical
values given by 𝜇𝜇𝑋𝑋 = 𝜇𝜇 ; 𝜎𝜎𝑋𝑋 = 𝜎𝜎
d) Your report should contain the graph required in (b) and the values required
in (c) tabulated in the following table. The graph should be properly labeled.
e) Modify the sample code for (a-c) given previously.
Table 3: Statistics for Normal Distribution
Expectation Standard Deviation
Theoretical
Calculation
Experimental
Measurement
Theoretical
Calculation
Experimental
Measurement
Table 3. Calculations for Normal distribution
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 8
2. The Central Limit Theorem
Central Limit Theorem.
Consider a collection of books, each of which has thickness W . The thickness W is a
RV, uniformly distributed between a minimum of a and a maximum of b cm. Use
the values of a and b that were provide to you, and calculate the mean and
standard deviation of the thickness. Use the following table to report the results.
Points will be taken off if you do not use the table to report .
Mean thickness of a single book (cm) Standard deviation of thickness (cm)
µ w = σ w =
The books are piled in stacks of n =1,5,10, or 15 books. The width n S of a stack of n
books is a RV (the sum of the widths of the n books). This RV has a mean n µ µ S w = n
and a standard deviation of n S w σ σ= n .
Calculate the mean and standard deviation of the stacked books, for the different
values of n = 1,5,10, or 15 . Use the following table to report the results. Points will
be taken off if you do not use the table to report.
Number of
books n
Mean thickness of a stack of n
books (cm)
Standard deviation of the
thickness for n books
n =1 µ w = σ w =
n =5 µ w = σ w =
n =15 µ w = σ w =
Perform the following simulation experiments, and plot the results.
a) Make n =1and run N =10,000 experiments, simulating the RV 1 S W= .
b) After the N experiments are completed, create and plot a probability histogram
of the RV S
c) On the same figure, plot the normal distribution probability function and
compare the probability histogram with the plot of f x( )
2
2
1 ( ) ( ) exp{ } 2 2
S
S S
x f x µ
σ π σ
− = −
d) Make n = 5 and repeat steps (a)-(c)
e) Make n =15 and repeat steps (a)-(c)
SUBMIT a report following the guidelines as described in the syllabus.
The report should include, among the other requirements:
• The above tables
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 9
• The histograms for 𝑛𝑛 = {1 ,5, 15} and the overlapping normal probability
distribution plots.
• The Python code, included in an Appendix.
• Make sure that the graphs are properly labeled.
An example of creating the PDF graph for n = 2 is shown below.
The code below provides a suggestion on how to generate a bar graph for a
continuous random variable X , which represents the total bookwidth for
n = 2 , and a = 1 , b = 3.
Note that the value of ”barwidth” is adjusted as the number of bins
changes, to provide a clear and understandable bar graph.
Also note that the ”density=True” parameter is needed to ensure that the
total area of the bargraph is equal to 1.0.
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 10
# The code provides a way to create the bar graph at the end
# for n=2 and a=1, b=3
# This is only a sample code. Your project has different values
# of a and b. You must use the correct values for your project
#
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
# Generate the values of the RV X
N=100000; nbooks=2; a=1; b=3;
mu_x=(a+b)/2 ; sig_x=np.sqrt((b-a)**2/12)
X=np.zeros((N,1))
for k in range(0,N):
x=np.random.uniform(a,b,nbooks)
w=np.sum(x)
X[k]=w
# Create bins and histogram
nbins=30; # Number of bins
edgecolor='w'; # Color separating bars in the bargraph
#
bins=[float(x) for x in np.linspace(nbooks*a, nbooks*b,nbins+1)]
h1, bin_edges = np.histogram(X,bins,density=True)
# Define points on the horizontal axis
be1=bin_edges[0:np.size(bin_edges)-1]
be2=bin_edges[1:np.size(bin_edges)]
b1=(be1+be2)/2
barwidth=b1[1]-b1[0] # Width of bars in the bargraph
plt.close('all')
# PLOT THE BAR GRAPH
fig1=plt.figure(1)
plt.bar(b1,h1, width=barwidth, edgecolor=edgecolor)
#PLOT THE GAUSSIAN FUNCTION
def gaussian(mu,sig,z):
f=np.exp(-(z-mu)**2/(2*sig**2))/(sig*np.sqrt(2*np.pi))
return f
f=gaussian(mu_x*nbooks,sig_x*np.sqrt(nbooks),b1)
plt.plot(b1,f,'r')
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 11
3. Distribution of the Sum of Exponential RVs
This problem involves a battery-operated critical medical monitor. The lifetime (T )
of the battery is a random variable with an exponentially distributed lifetime. A
battery lasts an average of β days, which has been provided to you. Under these
conditions, the PDF of the battery lifetime is given by:
1 1 exp( ), 0 (; )
0, 0
T
t t
f t
t
β β β
− ≥ =
<
As mentioned before, the mean and variance of the random variable T are:
𝜇𝜇𝑇𝑇 = 𝛽𝛽 ; 𝜎𝜎𝑇𝑇 = 𝛽𝛽
When a battery fails it is replaced immediately by a new one. Batteries are
purchased in a carton of 24. The objective is to simulate the RV representing the
lifetime of a carton of 24 batteries, and create a histogram. To do this, follow the
steps below.
a) Create a vector of 24 elements that represents a carton. Each one of the 24
elements in the vector is an exponentially distributed random variable (T ) as
shown above, with mean lifetime equal to β . Use the same procedure as in the
previous problem to generate the exponentially distributed random variable T .
b) The sum of the elements of this vector is a random variable (C ), representing
the life of the carton, i.e.
CTT T =++ 1 2 24
where each , 1,2, 24 T j j = is an exponentially distributed R.V. Create the R.V.
C , i.e. simulate one carton of batteries. This is considered one experiment.
c) Repeat this experiment for a total of N=10,000 times, i.e. for N cartons. Use the
values from the N=10,000 experiments to create the experimental PDF of the
lifetime of a carton, f c( ).
d) According to the Central Limit Theorem the PDF for one carton of 24 batteries
can be approximated by a normal distribution with mean and standard deviation
given by:
24 24 ; 24 24 µ µ β σσ β C T = = C T = =
Plot the graph of a normal distribution with
mean = µC and (standard deviation) = σ C ,
over plot of the experimental PDF on the same figure, and compare the results.
e) Create and plot the CDF of the lifetime of a carton, F c( ) . To do this use the
Python "numpy.cumsum" function on the values you calculated for the
experimental PDF. Since the CDF is the integral of the PDF, you must multiply
the PDF values by the barwidth to calculate the areas, i.e. the integral of the
PDF.
If your code is correct the CDF should be a nondecreasing graph, starting at 0.0 and
ending at 1.0.
EE 381 Project: Central Limit Theorem Dr. Chassiakos -- Page 12
Answer the following questions:
1. Find the probability that the carton will last longer than three years, i.e.
P S( 3 365) 1 ( 3 365) 1 (1095) × =− ≤× =− P S F . Use the graph of the CDF F t( ) to
estimate this probability.
2. Find the probability that the carton will last between 2.0 and 2.5 years (i.e.
between 730 and 912 days): PS F F (730 912) (912) (730) ≤≤ = − . Use the graph of
the CDF F t( ) to estimate this probability.
3. SUBMIT a report following the guidelines as described in the syllabus.
The report should include, among the other requirements:
• The numerical answers using the table below. Note: You will need to replicate the
table, in order to provide the answer in your report. Points will be taken off if you do not use
the table.
• The PDF plot of the lifetime of one carton and the corresponding normal
distribution on the same figure.
• The CFD plot of the lifetime of one carton
• Make sure that the graphs are properly labeled.
• The code in an Appendix.
QUESTION ANS.
1. Probability that the carton will last longer than three years
2. Probability that the carton will last between 2.0 and 2.5 years