INSE 6220 -- Week 3


Advanced Statistical Approaches to Quality

Descriptive Statistics
Discrete Probability Distributions
Continuous Probability Distributions

Dr. A. Ben Hamza Concordia University


Random Variables and Probability Density Functions
A random variable is a quantity whose value is not known exactly but whose probability distribution is known. The
value of the random variable varies from trial to trial as the experiment is repeated. The variable's
probability density function (PDF) describes how these values are distributed (i.e., it gives the probability that
the variable's value falls within a particular interval).
Continuous PDFs
[Figure: two continuous PDFs, f(x) versus x. Uniform distribution (e.g. soil texture): all values between 0 and 1 are equally likely. Exponential distribution (e.g. event rainfall): the smallest values are most likely.]
A Discrete PDF
[Figure: a discrete distribution (e.g. number of severe storms), f(x) versus x = 0, 1, 2, 3, 4. Only discrete values (integers) are possible; for example, the probability that x = 2 is 0.25.]
A discrete PDF is sometimes called a probability mass function; a continuous PDF is sometimes called a probability density function.

Practice Problem
The number of patients seen in the ER in any given hour is a random
variable represented by X. The probability distribution for X is:

x          10    11    12    13    14
P(X = x)   0.4   0.2   0.2   0.1   0.1

Find the probability that in a given hour:

a. Exactly 14 patients arrive: $P(X = 14) = 0.1$
b. At least 12 patients arrive: $P(X \ge 12) = 0.2 + 0.1 + 0.1 = 0.4$
c. At most 11 patients arrive: $P(X \le 11) = 0.4 + 0.2 = 0.6$
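
The three answers can be checked by summing entries of the probability table. The following is a minimal Python sketch; the dictionary name `pmf` and the variable names are illustrative choices, not part of the original problem.

```python
# Probability table from the practice problem: patients per hour -> probability
pmf = {10: 0.4, 11: 0.2, 12: 0.2, 13: 0.1, 14: 0.1}

# a. exactly 14 patients
p_exactly_14 = pmf[14]

# b. at least 12 patients: add the probabilities of 12, 13 and 14
p_at_least_12 = sum(p for x, p in pmf.items() if x >= 12)

# c. at most 11 patients: add the probabilities of 10 and 11
p_at_most_11 = sum(p for x, p in pmf.items() if x <= 11)

print(f"P(X = 14)  = {p_exactly_14:.1f}")
print(f"P(X >= 12) = {p_at_least_12:.1f}")
print(f"P(X <= 11) = {p_at_most_11:.1f}")
```

Because the events "at least 12" and "at most 11" are complementary, the last two results should add up to 1.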



Mean of a Random Variable


Variance of a Random Variable
The (population) variance of a random variable (RV) gives an idea of how widely spread the values of the RV are
likely to be. It is the second central moment of the distribution, indicating how closely the values are
concentrated around the expected value. The variance is defined by
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \sigma^2$$
The variance is a measure of risk. It examines the differences between each outcome and the expected value.

$\sqrt{\mathrm{Var}(X)} = \sigma$ is referred to as the standard deviation.

Example: for the distribution

x_i   50    200   350
p_i   0.3   0.2   0.5

with $E(X) = 230$,
$$\mathrm{Var}(X) = E\big((X - E(X))^2\big) = \sum_i p_i (x_i - E(X))^2$$
$$= 0.3(50 - 230)^2 + 0.2(200 - 230)^2 + 0.5(350 - 230)^2 = 17{,}100 = \sigma^2$$
$$\sigma = \sqrt{17{,}100} = 130.77$$
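
The numbers in this example can be reproduced directly from the two definitions of the variance. This is a small Python sketch with illustrative variable names; it assumes the mean is the probability-weighted sum of the outcomes, which gives the value 230 used above.

```python
import math

# Outcomes and probabilities from the variance example
outcomes = [50, 200, 350]
probs = [0.3, 0.2, 0.5]

# Expected value: probability-weighted sum of the outcomes
mean = sum(p * x for x, p in zip(outcomes, probs))

# Variance from the definition E((X - E(X))^2)
var_central = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))

# Variance from the shortcut E(X^2) - (E(X))^2
second_moment = sum(p * x ** 2 for x, p in zip(outcomes, probs))
var_shortcut = second_moment - mean ** 2

# Standard deviation is the square root of the variance
std = math.sqrt(var_central)

print(f"E(X)   = {mean:.0f}")
print(f"Var(X) = {var_central:.0f} (definition), {var_shortcut:.0f} (shortcut)")
print(f"SD(X)  = {std:.2f}")
```

Both routes to the variance should agree, which is a useful sanity check when doing the calculation by hand.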

Sample mean and sample variance

Mean

Variance

Standard deviation: $s = \sqrt{s^2}$
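
The formulas for this slide did not survive in the text, so the sketch below assumes the usual definitions: the sample mean is the sum of the observations divided by n, and the sample variance divides the sum of squared deviations by n - 1. The data values are hypothetical, invented purely for illustration. Python's standard `statistics` module follows the same n - 1 convention.

```python
import statistics

# Hypothetical measurements, invented purely for illustration
sample = [9.8, 10.1, 10.0, 10.3, 9.9]

x_bar = statistics.mean(sample)    # sample mean
s2 = statistics.variance(sample)   # sample variance, divides by n - 1
s = statistics.stdev(sample)       # sample standard deviation, sqrt of s2

print(f"mean = {x_bar:.3f}, variance = {s2:.4f}, std dev = {s:.4f}")
```

If the population variance (dividing by n) is wanted instead, `statistics.pvariance` is the corresponding function.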

Binomial Distribution

Bernoulli Trial:
A Bernoulli trial is an experiment with only two possible outcomes.
The two possible outcomes are labeled:
success (s) and failure (f)

The probability of success is $P(s) = p$ and the probability of failure is $P(f) = q = 1 - p$.

Examples:
1. Tossing a coin (success = H, failure = T, and p = P(H))
2. Inspecting an item (success = defective, failure = non-defective,
and p = P(defective))

Binomial Probability Distribution
Our interest is in the number of successes occurring in the n trials.
We let x denote the number of successes occurring in the n trials.

Number of Experimental Outcomes Providing Exactly x Successes in n Trials

$$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$

where: $n! = n(n - 1)(n - 2) \cdots (2)(1)$ and $0! = 1$

The probability distribution of X is given by:
$$f(x) = P(X = x) = b(x; n, p) = \begin{cases} \binom{n}{x} p^x (1 - p)^{n - x}, & x = 0, 1, 2, \ldots, n \\ 0, & \text{otherwise} \end{cases}$$

f(x) = the probability of x successes in n trials
n = the number of trials
p = the probability of success on any one trial

Example
Bart is a student taking a statistics course. Bart's exam strategy is to rely on luck for
the next quiz. The quiz consists of 10 multiple-choice questions. Each question has
five possible answers, only one of which is correct. Bart plans to guess the answer to
each question.
1) What is the probability that Bart gets no answers correct?
2) What is the probability that Bart gets two answers correct?

Algebraically we have: n = 10 and P(success) = 1/5 = 0.2

1) Number of successes x = 0; hence we want to know P(X = 0):
$$P(X = 0) = \binom{10}{0} (0.2)^0 (0.8)^{10} \approx 0.1074$$

Bart has about an 11% chance of getting no answers correct using the guessing strategy.
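
Both questions in this example can be answered with the binomial formula. The following Python sketch uses only the standard library; the helper name `binomial_pmf` is an illustrative choice.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 0.2  # 10 questions, 1-in-5 chance of guessing each one correctly

p_none = binomial_pmf(0, n, p)  # question 1: no answers correct
p_two = binomial_pmf(2, n, p)   # question 2: exactly two answers correct

print(f"P(X = 0) = {p_none:.4f}")
print(f"P(X = 2) = {p_two:.4f}")
```

The first value is the roughly 11% chance quoted above; the second comes out at about 30%.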

Binomial density function

The discrete binomial pdf
$$f(x) = \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$$
assigns probability to the event of x successes in n trials of a Bernoulli process
(such as coin flipping) with probability p of success at each trial.

p = 0.2;                     % Probability of success for each trial
n = 10;                      % Number of trials
x = 0:n;                     % Outcomes
fx = pdf('bino',x,n,p);      % pmf
stem(x,fx,'LineWidth',2);    % Visualize the pmf

Binomial Probability Distribution

Expected Value
$$E(X) = \mu = np$$
Variance
$$\mathrm{Var}(X) = \sigma^2 = np(1 - p)$$
Standard Deviation
$$\mathrm{SD}(X) = \sigma = \sqrt{np(1 - p)}$$

Example: Statistical Quality Control
A manufacturing process produces thousands of semiconductor chips per day. On
average, 1% of these chips do not conform to specifications. Every hour, an inspector
selects a random sample of 25 chips and classifies each chip in the sample as conforming
or nonconforming. If we let X be the random variable representing the number of
nonconforming chips in the sample, then the probability distribution of X is
$$f(x) = \binom{25}{x} (0.01)^x (0.99)^{25 - x}, \quad x = 0, 1, 2, \ldots, 25$$
We may calculate the probability of finding one or fewer nonconforming parts in the
sample as:
$$P(X \le 1) = P(X = 0) + P(X = 1) = f(0) + f(1) = \sum_{x=0}^{1} \binom{25}{x} (0.01)^x (0.99)^{25 - x}$$
$$= \frac{25!}{0!\,25!} (0.01)^0 (0.99)^{25} + \frac{25!}{1!\,24!} (0.01)^1 (0.99)^{24}$$
$$= 0.7778 + 0.1964 = 0.9742$$
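
The same calculation can be sketched in Python by tabulating the whole distribution and adding the first two terms. The list name `f` mirrors the notation f(x) above; everything else is an illustrative choice.

```python
from math import comb

n, p = 25, 0.01  # sample size and fraction nonconforming

# f[x] = probability of exactly x nonconforming chips in the sample
f = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

p_at_most_1 = f[0] + f[1]          # P(X <= 1)
p_two_or_more = 1 - p_at_most_1    # complement: P(X >= 2)

print(f"f(0) = {f[0]:.4f}, f(1) = {f[1]:.4f}")
print(f"P(X <= 1) = {p_at_most_1:.4f}")
print(f"P(X >= 2) = {p_two_or_more:.4f}")
```

The complement is often the quantity of practical interest in inspection, since it is the chance that a sample shows two or more nonconforming chips.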

Poisson Probability Distribution
The Poisson distribution is
$$f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$
where the parameter $\lambda > 0$ is the mean number of successes in the interval.
The mean and variance of the Poisson distribution are
$$\mu = \lambda \quad \text{and} \quad \sigma^2 = \lambda$$

Example: Mercy Hospital

Using the Poisson Probability Function
Patients arrive at the emergency room of Mercy Hospital at the
average rate of 6 per hour on weekend evenings. What is the
probability of 4 arrivals in 30 minutes on a weekend evening?

$\lambda$ = 6/hour = 3/half-hour, x = 4

$$f(4) = \frac{3^4 (2.71828)^{-3}}{4!} = 0.1680$$

Example
The number of typographical errors in new editions of textbooks varies
considerably from book to book. After some analysis an instructor concludes
that the number of errors is Poisson distributed with a mean of 1.5 per 100
pages. The instructor randomly selects 100 pages of a new book. What is the
probability that there are no typos?

That is, what is P(X = 0) when $\lambda = 1.5$?

$$f(0) = P(X = 0) = \frac{e^{-\lambda} \lambda^x}{x!} = \frac{e^{-1.5}\, 1.5^0}{0!} = 0.2231$$

There is about a 22% chance of finding zero errors.
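
Both Poisson examples (hospital arrivals and textbook typos) use the same formula with different values of the mean. A minimal Python sketch, with `poisson_pmf` as an illustrative helper name:

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)


# Mercy Hospital: 6 arrivals per hour means 3 per half-hour
p_four_arrivals = poisson_pmf(4, 3.0)

# Textbook typos: mean of 1.5 errors per 100 pages
p_no_typos = poisson_pmf(0, 1.5)

print(f"P(4 arrivals in 30 minutes) = {p_four_arrivals:.4f}")
print(f"P(no typos in 100 pages)    = {p_no_typos:.4f}")
```

Note that the rate must be rescaled to the length of the interval in question before the formula is applied, as in the half-hour case.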



Uniform Distribution
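Example:
If X is a uniform random variable on the interval [a, b], find the
expectation E[X] and the variance Var(X).

Answer:
$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_a^b \frac{x}{b - a}\,dx = \frac{b^2 - a^2}{2(b - a)} = \frac{b + a}{2}$$
$$E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_a^b \frac{x^2}{b - a}\,dx = \frac{b^3 - a^3}{3(b - a)} = \frac{b^2 + ab + a^2}{3}$$
$$\mathrm{Var}(X) = E[X^2] - E[X]^2 = \frac{b^2 + ab + a^2}{3} - \frac{(a + b)^2}{4} = \frac{(b - a)^2}{12}$$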
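
The closed-form results can be checked numerically. The sketch below approximates the two integrals with a midpoint Riemann sum and compares them with the formulas; the endpoints a = 2 and b = 5 are arbitrary illustrative values, and the function name is hypothetical.

```python
def uniform_moments(a, b, steps=100_000):
    """Approximate E[X] and Var(X) for X uniform on [a, b] with a midpoint Riemann sum."""
    width = (b - a) / steps
    density = 1.0 / (b - a)  # f_X(x) is constant on [a, b]
    first = 0.0   # accumulates the integral of x * f_X(x)
    second = 0.0  # accumulates the integral of x^2 * f_X(x)
    for i in range(steps):
        x = a + (i + 0.5) * width
        first += x * density * width
        second += x * x * density * width
    return first, second - first ** 2


a, b = 2.0, 5.0  # arbitrary illustrative endpoints
mean_num, var_num = uniform_moments(a, b)
mean_formula = (a + b) / 2
var_formula = (b - a) ** 2 / 12

print(f"mean: numeric {mean_num:.6f}, formula {mean_formula:.6f}")
print(f"var:  numeric {var_num:.6f}, formula {var_formula:.6f}")
```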

Normal Probability Distribution
The normal probability distribution is the most important distribution for describing
a continuous random variable.
It has been used in a wide variety of applications:
Heights and weights of people
Test scores
Scientific measurements
Amounts of rainfall
It is widely used in statistical inference.

The normal distribution is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
with mean $\mu$ and variance $\sigma^2$. The normal distribution is denoted $X \sim N(\mu, \sigma^2)$.
The visual appearance of the normal distribution is a symmetric, unimodal, bell-shaped curve.

Normal Probability Distribution

Characteristics of the Normal Probability Distribution
The total area under the curve is 1 (0.5 to the left of the mean and 0.5 to the right).
Probabilities for the normal random variable are given by areas under the curve.
68.27% of values of a normal random variable are within +/- 1 standard deviation of its mean.
95.45% of values of a normal random variable are within +/- 2 standard deviations of its mean.
99.73% of values of a normal random variable are within +/- 3 standard deviations of its mean.

>> cdf('normal',1,0,1)-cdf('normal',-1,0,1) = 0.6827
>> cdf('normal',2,0,1)-cdf('normal',-2,0,1) = 0.9545
>> cdf('normal',3,0,1)-cdf('normal',-3,0,1) = 0.9973

Calculating Normal Probabilities

We can use the following function to convert any normal random variable X to a
standard normal random variable Z:
$$Z = \frac{X - \mu}{\sigma}$$

Some advice: always draw a picture!

Calculating Normal Probabilities
Example: X is normally distributed with a mean of 50 minutes and a
standard deviation of 10 minutes. What is P(45 < X < 60)?
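
The answer to this example is not shown in the text. One way to obtain it is sketched below in Python, using `statistics.NormalDist` from the standard library. The calculation is done twice: directly on the original scale, and after standardizing the two endpoints by subtracting the mean and dividing by the standard deviation.

```python
from statistics import NormalDist

mu, sigma = 50, 10  # mean and standard deviation in minutes

# Direct calculation on the original scale
X = NormalDist(mu=mu, sigma=sigma)
p_direct = X.cdf(60) - X.cdf(45)

# Same calculation after standardizing the endpoints
Z = NormalDist()  # standard normal: mean 0, standard deviation 1
z_low = (45 - mu) / sigma
z_high = (60 - mu) / sigma
p_standardized = Z.cdf(z_high) - Z.cdf(z_low)

print(f"z-scores: {z_low} and {z_high}")
print(f"P(45 < X < 60) = {p_direct:.4f} (direct), {p_standardized:.4f} (standardized)")
```

The two routes give the same probability, which is the point of standardizing: any normal probability can be read from a single standard normal table.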

Standard Normal Distributions

The normal distribution with parameter values $\mu = 0$ and $\sigma = 1$
is called a standard normal distribution. The
random variable is denoted by Z. The pdf is
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \quad -\infty < z < \infty$$
The cdf is
$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} f(u)\,du$$

Examples:

$\Phi(0.76) = 0.776373$
$\Phi(1.3) = ?$
$\Phi(-3) = 1 - \Phi(3) = ?$
$\Phi(3.86) = ?$
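
The values left as question marks can be looked up in a standard normal table or computed. A short Python sketch using the standard library, where the name `phi` is an illustrative alias for the standard normal cdf:

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal cdf (mean 0, standard deviation 1)

z_values = (0.76, 1.3, -3, 3.86)
results = {z: phi(z) for z in z_values}
for z, value in results.items():
    print(f"Phi({z}) = {value:.6f}")

# Symmetry used in the third example: Phi(-z) = 1 - Phi(z)
symmetry_gap = abs(phi(-3) - (1 - phi(3)))
print(f"|Phi(-3) - (1 - Phi(3))| = {symmetry_gap:.2e}")
```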
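
Example: If X is a normal random variable with $\mu = 3$ and $\sigma^2 = 9$,
find (a) P(2 < X < 5) and (b) P(X > 0).

(a)
$$P(2 < X < 5) = P\left(\frac{2 - 3}{3} < \frac{X - 3}{3} < \frac{5 - 3}{3}\right) = P\left(-\frac{1}{3} < Z < \frac{2}{3}\right)$$
$$= P\left(Z < \frac{2}{3}\right) - P\left(Z < -\frac{1}{3}\right) = \Phi\left(\frac{2}{3}\right) - \Phi\left(-\frac{1}{3}\right)$$
$$= \Phi\left(\frac{2}{3}\right) - \left[1 - \Phi\left(\frac{1}{3}\right)\right] \approx 0.3779$$

(b)
$$P(X > 0) = P\left(\frac{X - 3}{3} > \frac{0 - 3}{3}\right) = P(Z > -1) = 1 - \Phi(-1) = \Phi(1) = 0.8413$$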
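
Both parts can be checked in Python, assuming (as the division by 3 in the working implies) that the variance is 9 and the standard deviation is therefore 3. The result for part (a) differs slightly in the fourth decimal from the table-based 0.3779, because table lookups round z to two decimals.

```python
from statistics import NormalDist

# Mean 3 and standard deviation 3 (variance 9)
X = NormalDist(mu=3, sigma=3)

p_a = X.cdf(5) - X.cdf(2)  # (a) P(2 < X < 5)
p_b = 1 - X.cdf(0)         # (b) P(X > 0)

print(f"(a) P(2 < X < 5) = {p_a:.4f}")
print(f"(b) P(X > 0)     = {p_b:.4f}")
```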

[Figure: the original normal distribution and the corresponding standard normal distribution]

Normal Approximation to the Binomial Distribution

Let X be a binomial rv based on n trials, each
with probability of success p. If the binomial
probability histogram is not too skewed, X may be
approximated by a normal distribution with
$$\mu = np \quad \text{and} \quad \sigma = \sqrt{np(1 - p)}$$
$$P(X \le x) \approx \Phi\left(\frac{x + 0.5 - np}{\sqrt{np(1 - p)}}\right)$$

Example: At a particular small college the pass rate
of Intermediate Algebra is 72%. If 500 students
enroll in a semester, determine the probability that at
most 375 students pass.
$$\mu = np = 500(0.72) = 360$$
$$\sigma = \sqrt{np(1 - p)} = \sqrt{500(0.72)(0.28)} \approx 10$$
$$P(X \le 375) \approx \Phi\left(\frac{375.5 - 360}{10}\right) = \Phi(1.55) = 0.9394$$
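
To see how good the approximation is, the continuity-corrected normal value can be compared with the exact binomial sum. The Python sketch below does both. It keeps the unrounded standard deviation, so the approximate value comes out slightly below the 0.9394 obtained above with the standard deviation rounded to 10.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, x = 500, 0.72, 375  # students enrolled, pass rate, threshold

mu = n * p
sigma = sqrt(n * p * (1 - p))

# Normal approximation with the 0.5 continuity correction
z = (x + 0.5 - mu) / sigma
p_approx = NormalDist().cdf(z)

# Exact binomial probability: sum the pmf from 0 to x
p_exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

print(f"mu = {mu:.0f}, sigma = {sigma:.3f}, z = {z:.3f}")
print(f"normal approximation: {p_approx:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
```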
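
The Exponential Random Variable
A random variable X is said to be an exponential random variable (or
X is exponentially distributed) with positive parameter $\lambda$ if its
probability density function is given by:
$$f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0,\ \lambda > 0 \\ 0 & \text{if } x < 0 \end{cases}$$

Note:
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_0^{\infty} \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^{\infty} = 1$$
Thus, f(x) is a probability density function.

The cumulative distribution function:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
For $x < 0$: $F(x) = \int_{-\infty}^{x} 0\,dt = 0$
For $x \ge 0$: $F(x) = \int_0^x \lambda e^{-\lambda t}\,dt = \left[-e^{-\lambda t}\right]_0^x = 1 - e^{-\lambda x}$
$$F(x) = \begin{cases} 1 - e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$$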
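
The Exponential Random Variable
Expectation:
$$E[X] = \int_0^{\infty} x f(x)\,dx = \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = -\int_0^{\infty} x\,d\left(e^{-\lambda x}\right)$$
Integration by parts:
$$E[X] = \left[-x e^{-\lambda x}\right]_0^{\infty} - \int_0^{\infty} \left(-e^{-\lambda x}\right) dx = \int_0^{\infty} e^{-\lambda x}\,dx = \left[-\frac{1}{\lambda} e^{-\lambda x}\right]_0^{\infty} = \frac{1}{\lambda}$$
Variance:
$$E[X^2] = \int_0^{\infty} x^2 f(x)\,dx = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\,dx = -\int_0^{\infty} x^2\,d\left(e^{-\lambda x}\right)$$
Integration by parts:
$$E[X^2] = \left[-x^2 e^{-\lambda x}\right]_0^{\infty} - \int_0^{\infty} \left(-e^{-\lambda x}\right) 2x\,dx = 2\int_0^{\infty} x e^{-\lambda x}\,dx = \frac{2}{\lambda}\int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{2}{\lambda} \cdot \frac{1}{\lambda} = \frac{2}{\lambda^2}$$
$$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$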


Example
The lifetime of an alkaline battery (measured in hours) is exponentially
distributed with $\lambda = 0.05$.

Find the probability that a battery will last between 10 and 15 hours.

$$P(10 \le X \le 15) = F(15) - F(10) = e^{-(0.05)(10)} - e^{-(0.05)(15)} = e^{-0.5} - e^{-0.75} \approx 0.1342$$

There is about a 13% chance that a battery will last only 10 to 15 hours.
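
The battery calculation follows directly from the exponential cdf. A minimal Python sketch, with `exponential_cdf` as an illustrative helper name:

```python
from math import exp


def exponential_cdf(x, lam):
    """F(x) = P(X <= x) for an exponential random variable with rate lam."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0


lam = 0.05  # rate per hour

# Probability that the lifetime falls between 10 and 15 hours
p_between = exponential_cdf(15, lam) - exponential_cdf(10, lam)

# Mean lifetime, using E[X] = 1 / lambda
mean_lifetime = 1 / lam

print(f"P(10 <= X <= 15) = {p_between:.4f}")
print(f"mean lifetime    = {mean_lifetime:.0f} hours")
```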

Central Limit Theorem


Example
Suppose a population has mean $\mu = 8$ and standard deviation $\sigma = 3$.
Suppose a random sample of size n = 36 is selected.
What is the probability that the sample mean is between 7.8 and 8.2?
Solution:
Even if the population is not normally distributed, the central limit
theorem can be used (n > 30),
so the sampling distribution of $\bar{X}$ is approximately normal
with mean $\mu_{\bar{X}} = \mu = 8$
and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$
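
Example

Solution (continued):

$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$$

[Figure: the population distribution (shape unknown, mean 8) is sampled to give the sampling distribution of the sample mean (7.8 to 8.2, centered at 8), which is standardized to the standard normal distribution (-0.4 to 0.4, centered at 0); the shaded area is 0.1554 + 0.1554.]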
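
The sample-mean calculation can be sketched in Python as follows. It relies on the central limit theorem approximation described in the solution, treating the sample mean as normal with the population mean and with standard deviation equal to the population standard deviation divided by the square root of n.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36  # population mean, population std dev, sample size

# Standard deviation of the sample mean (standard error)
std_error = sigma / sqrt(n)

# Approximate sampling distribution of the sample mean
x_bar = NormalDist(mu=mu, sigma=std_error)
p_between = x_bar.cdf(8.2) - x_bar.cdf(7.8)

print(f"standard error = {std_error}")
print(f"P(7.8 < sample mean < 8.2) = {p_between:.4f}")
```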
Chi-squared distribution
Student's t-distribution

Student's t-distribution: X ~ t(n-1)


F distribution
