
Sampling Distribution

Chapter 4

Basic Concepts
The probability distribution of a statistic is called its sampling distribution.

A statistic (e.g. the sample mean or the sample standard deviation) is a random variable whose value depends only on the observed sample and may vary from sample to sample.

The sampling distribution of a statistic depends on the size of the population, the size of the sample, and the method of choosing the sample.

The standard deviation of the sampling distribution is called the standard error of the statistic. It tells us how much we expect the values of the statistic to vary across different possible samples.

The probability distribution of the sample mean is called the sampling distribution of the mean.

Consider a population of 4 values of a random variable X having the probability distribution

$$f(x) = \frac{1}{4}, \quad x = 0, 1, 2, 3.$$

Note that $\mu = E(X) = 3/2$ and $\sigma^2 = \mathrm{Var}(X) = 5/4$.

Suppose we list all possible samples of size 2, drawn with replacement, and for each sample compute the value of the sample mean, $\bar{X}$.

No.   Sample   $\bar{X}$      No.   Sample   $\bar{X}$
1     0, 0     0              9     2, 0     1.0
2     0, 1     0.5            10    2, 1     1.5
3     0, 2     1.0            11    2, 2     2.0
4     0, 3     1.5            12    2, 3     2.5
5     1, 0     0.5            13    3, 0     1.5
6     1, 1     1.0            14    3, 1     2.0
7     1, 2     1.5            15    3, 2     2.5
8     1, 3     2.0            16    3, 3     3.0

Sampling Distribution of the Sample Means

$\bar{X}$        0      0.5    1.0    1.5    2.0    2.5    3.0
$P(\bar{X})$     1/16   2/16   3/16   4/16   3/16   2/16   1/16
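The enumeration above can be reproduced programmatically. The following is a minimal sketch, assuming the population {0, 1, 2, 3} with equal probabilities from the example; the function name `sampling_distribution_of_mean` is illustrative and not part of the original notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sampling_distribution_of_mean(population, n):
    """Return {sample mean: probability} for samples of size n drawn
    with replacement, assuming every ordered sample is equally likely."""
    samples = list(product(population, repeat=n))
    counts = Counter(Fraction(sum(s), n) for s in samples)
    total = len(samples)
    return {mean: Fraction(c, total) for mean, c in sorted(counts.items())}


dist = sampling_distribution_of_mean([0, 1, 2, 3], 2)
for mean, prob in dist.items():
    print(float(mean), prob)
```

Probabilities are printed in lowest terms, so 2/16 appears as 1/8 and 4/16 as 1/4.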

Theorems
1. If all possible random samples of size n are drawn with replacement from a finite population of size N with mean $\mu$ and standard deviation $\sigma$, then the sample mean $\bar{X}$ will have mean and variance given by:

$$E(\bar{X}) = \mu, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}.$$
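As a check, the result for sampling with replacement (mean $\mu$, variance $\sigma^2/n$) can be applied to the population $f(x) = 1/4$, $x = 0, 1, 2, 3$, with $n = 2$. The mean and variance computed directly from the sampling distribution of the sample means agree with $\mu = 3/2$ and $\sigma^2/n = (5/4)/2$:

```latex
\begin{aligned}
E(\bar{X}) &= \sum \bar{x}\,P(\bar{x})
  = \frac{0(1) + 0.5(2) + 1.0(3) + 1.5(4) + 2.0(3) + 2.5(2) + 3.0(1)}{16}
  = \frac{24}{16} = \frac{3}{2} = \mu \\
E(\bar{X}^2) &= \sum \bar{x}^2\,P(\bar{x})
  = \frac{0(1) + 0.25(2) + 1(3) + 2.25(4) + 4(3) + 6.25(2) + 9(1)}{16}
  = \frac{46}{16} = \frac{23}{8} \\
\mathrm{Var}(\bar{X}) &= E(\bar{X}^2) - \left[E(\bar{X})\right]^2
  = \frac{23}{8} - \frac{9}{4} = \frac{5}{8} = \frac{5/4}{2} = \frac{\sigma^2}{n}
\end{aligned}
```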

2. If all possible random samples of size n are drawn without replacement from a finite population of size N with mean $\mu$ and standard deviation $\sigma$, then the sample mean $\bar{X}$ will have mean and variance given by:

$$E(\bar{X}) = \mu, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}.$$
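The finite population factor $(N - n)/(N - 1)$ can be checked by brute force. The sketch below assumes the same population {0, 1, 2, 3} used earlier and samples of size 2 drawn without replacement; the helper `mean_and_variance` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def mean_and_variance(values):
    """Mean and variance of a list of equally likely values."""
    k = len(values)
    mu = sum(values, Fraction(0)) / k
    var = sum((v - mu) ** 2 for v in values) / k
    return mu, var


population = [0, 1, 2, 3]
N, n = len(population), 2
mu, sigma2 = mean_and_variance([Fraction(x) for x in population])

# Every unordered sample of size n without replacement is equally likely.
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]
mean_xbar, var_xbar = mean_and_variance(sample_means)

# Variance predicted by the theorem, including the (N - n)/(N - 1) factor.
predicted = sigma2 / n * Fraction(N - n, N - 1)
print(mean_xbar, var_xbar, predicted)
```

Exact fractions are used so that the enumerated variance and the predicted variance can be compared without rounding error.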

4.2 Central Limit Theorem


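If $\bar{X}$ is the mean of a random sample of size n taken from a (large or infinite) population with mean $\mu$ and variance $\sigma^2$, then the sampling distribution of $\bar{X}$ is approximately normal with mean $E(\bar{X}) = \mu$ and variance

$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$$

when n is sufficiently large. Hence, the limiting form of the distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

as $n \to \infty$ is the standard normal distribution.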
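A small simulation can make the Central Limit Theorem concrete. This is only a sketch under stated assumptions: the population is taken to be exponential with rate 1 (so $\mu = 1$ and $\sigma^2 = 1$), which is a choice made here for illustration and does not come from the notes, and `simulate_sample_means` is a hypothetical helper.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from an exponential population
    (rate 1, so mu = 1 and sigma^2 = 1) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


n = 30
means = simulate_sample_means(n, reps=5000)

# Theory: the sample means should centre on mu = 1 with variance 1/n.
print("mean of sample means:", statistics.fmean(means))
print("variance of sample means:", statistics.pvariance(means))
print("sigma^2 / n:", 1 / n)
```

Even though the exponential population is strongly skewed, the simulated means should cluster around 1 with a variance close to 1/30.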

Note
The normal approximation in the theorem will be good if $n \geq 30$, regardless of the shape of the population.
If $n < 30$, the approximation is good only if the population is not too different from normal.
If the population is normally distributed, then the sampling distribution of $\bar{X}$ is exactly normal, no matter how small the sample size.

Example
Suppose a team of biologists has been studying a particular fishing pond. Let X represent the length of a single trout taken at random from the pond. Assume X has a normal distribution with mean $\mu = 10.2$ in. and standard deviation $\sigma = 1.4$ in.
(a) What is the probability that a single trout taken at random from the pond is between 8 and 12 inches long?
(b) What is the probability that the mean length of 5 trout taken at random is between 8 and 12 inches?

Solution:
(a) What is the probability that a single trout taken at random from the pond is between 8 and 12 inches long?

Given: $X \sim N(10.2, 1.4^2)$

$$\begin{aligned}
P(8 < X < 12) &= P\left(\frac{8 - 10.2}{1.4} < Z < \frac{12 - 10.2}{1.4}\right) \\
&= P(-1.57 < Z < 1.29) \\
&= P(Z < 1.29) - P(Z < -1.57) \\
&= 0.9015 - 0.0582 \\
&= 0.8433
\end{aligned}$$

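Solution:
(b) What is the probability that the mean length of 5 trout taken at random is between 8 and 12 inches?

Given: $X \sim N(10.2, 1.4^2)$, so $\bar{X} \sim N(10.2, 1.4^2/5)$

$$\begin{aligned}
P(8 < \bar{X} < 12) &= P\left(\frac{8 - 10.2}{1.4/\sqrt{5}} < Z < \frac{12 - 10.2}{1.4/\sqrt{5}}\right) \\
&= P(-3.51 < Z < 2.87) \\
&= P(Z < 2.87) - P(Z < -3.51) \\
&= 0.9979 - 0.0001 \\
&= 0.9978
\end{aligned}$$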
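Both parts of the trout example follow the same pattern: standardize with the standard deviation of a single observation, or with the standard error $\sigma/\sqrt{n}$ for a mean of n observations. A minimal sketch using Python's standard-library `statistics.NormalDist` is given below; `prob_between` is a hypothetical helper name. Because z is not rounded to two decimals here, the results may differ from the table-based answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def prob_between(lo, hi, mu, sigma, n=1):
    """P(lo < mean of n observations < hi) for a normal population.
    With n = 1 this is the probability for a single observation."""
    dist = NormalDist(mu, sigma / sqrt(n))  # standard error sigma/sqrt(n)
    return dist.cdf(hi) - dist.cdf(lo)


p_single = prob_between(8, 12, mu=10.2, sigma=1.4)      # part (a)
p_mean = prob_between(8, 12, mu=10.2, sigma=1.4, n=5)   # part (b)
print(round(p_single, 4), round(p_mean, 4))
```

The probability for the mean of 5 trout is much closer to 1 because the sample mean varies less than a single observation.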

Example
Assume that the mean systolic blood pressure of normal adults is 120 millimeters of mercury (mm Hg) and the standard deviation is 5.6 mm Hg. Assume the variable is normally distributed.
a. If an individual is selected, find the probability that the individual's pressure will be between 120 and 121.8 mm Hg.
b. If a sample of 30 adults is randomly selected, find the probability that the sample mean will be between 120 and 121.8 mm Hg.

Solution:
a. If an individual is selected, find the probability that the individual's pressure will be between 120 and 121.8 mm Hg.

Given: $X \sim N(120, 5.6^2)$

$$\begin{aligned}
P(120 < X < 121.8) &= P\left(\frac{120 - 120}{5.6} < Z < \frac{121.8 - 120}{5.6}\right) \\
&= P(0 < Z < 0.32) \\
&= P(Z < 0.32) - P(Z < 0) \\
&= 0.6255 - 0.5000 \\
&= 0.1255
\end{aligned}$$

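Solution:
b. If a sample of 30 adults is randomly selected, find the probability that the sample mean will be between 120 and 121.8 mm Hg.

Given: $X \sim N(120, 5.6^2)$, so $\bar{X} \sim N(120, 5.6^2/30)$

$$\begin{aligned}
P(120 < \bar{X} < 121.8) &= P\left(\frac{120 - 120}{5.6/\sqrt{30}} < Z < \frac{121.8 - 120}{5.6/\sqrt{30}}\right) \\
&= P(0 < Z < 1.76) \\
&= P(Z < 1.76) - P(Z < 0) \\
&= 0.9608 - 0.5000 \\
&= 0.4608
\end{aligned}$$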
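The only difference between the individual and the sample of 30 is the denominator used to standardize: the sample mean is standardized with the standard error $\sigma/\sqrt{n}$ rather than $\sigma$. The intermediate arithmetic for the sample of 30, rounded as it would be for a z-table lookup, is:

```latex
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{5.6}{\sqrt{30}} \approx 1.022,
\qquad
z = \frac{121.8 - 120}{1.022} \approx 1.76
```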

4.3 Sampling from the Normal Distribution


The t-distribution
If $\bar{X}$ and $S^2$ are the mean and variance, respectively, of a random sample of size n taken from a population which is normally distributed with mean $\mu$ and variance $\sigma^2$, then

$$T = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

is a random variable having the t-distribution with $v = n - 1$ degrees of freedom.

Notation: $T \sim t(v = n - 1)$

Comparison between the t-distribution and the standard normal distribution
1. Both are symmetric about zero.
2. Both are bell-shaped, but the t-distribution is more variable:
i) t-values depend on the fluctuation of two quantities, $\bar{X}$ and $S^2$;
ii) z-values depend only on the change of $\bar{X}$ from sample to sample.
3. When the sample size is large, i.e. $n \geq 30$, the t-distribution can be well approximated by the standard normal distribution.

Area under the curve
As with any continuous probability distribution, the probability that a random sample produces a t-value falling between any two specified values is equal to the area under the curve of the t-distribution between the two ordinates corresponding to those values.

Notation: $t_\alpha$ is the value leaving an area of $\alpha$ in the right tail of the t-distribution. That is, if $T \sim t(v)$, then $t_\alpha$ is such that $P(T > t_\alpha) = \alpha$.

Since the t-distribution is symmetric about zero, $t_{1-\alpha} = -t_\alpha$.

Example
1. Find the following values in the t-table:
a) $t_{0.025}$ when $v = 14$
b) $t_{0.99}$ when $v = 10$
2. Find k such that $P(k < T < 2.807) = 0.945$ when $T \sim t(23)$.
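One way to work these lookups is sketched below, assuming a standard t-table of right-tail critical values $t_\alpha$ (the numerical entries are quoted from such a table, not from these notes) and the symmetry relation $t_{1-\alpha} = -t_\alpha$:

```latex
\begin{aligned}
&\text{1a)}\quad t_{0.025} = 2.145 \quad (v = 14) \\
&\text{1b)}\quad t_{0.99} = -t_{0.01} = -2.764 \quad (v = 10) \\
&\text{2)}\quad 2.807 = t_{0.005} \ (v = 23)
  \;\Rightarrow\; P(T > 2.807) = 0.005 \\
&\phantom{\text{2)}}\quad P(T < k) = 1 - 0.945 - 0.005 = 0.05
  \;\Rightarrow\; k = -t_{0.05} = -1.714
\end{aligned}
```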

3. A manufacturing firm claims that the batteries used in their electronic games will last an average of 30 hours. To maintain this average, 16 batteries are tested each month. If the computed t-value falls between $-t_{0.025}$ and $t_{0.025}$, the firm is satisfied with its claim. What conclusion should the firm draw from a sample that has a mean of 27.5 hours and a standard deviation of 5 hours? Assume the distribution of battery lives to be approximately normal.
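The battery problem reduces to computing $T = (\bar{X} - \mu)/(S/\sqrt{n})$ and comparing it with the critical values for $v = n - 1 = 15$ degrees of freedom. The sketch below is one possible way to do this; `t_statistic` is a hypothetical helper, and the critical value 2.131 is an assumption read from a standard t-table rather than a value stated in the notes.

```python
from math import sqrt


def t_statistic(sample_mean, claimed_mean, sample_sd, n):
    """T = (x_bar - mu) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (sample_mean - claimed_mean) / (sample_sd / sqrt(n))


# Assumption: t_0.025 for v = 15, read from a standard t-table.
T_CRIT = 2.131

t = t_statistic(sample_mean=27.5, claimed_mean=30, sample_sd=5, n=16)
within_limits = -T_CRIT < t < T_CRIT
print(t, within_limits)
```

Under these assumptions the computed value is t = -2.0, which lies between -2.131 and 2.131, so this sample would not contradict the firm's claim.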
