Inferences On The Mean And Variance of A Distribution
Tieming Ji
Fall 2012
1 / 63
What's in this chapter? Estimation under more realistic situations and some hypothesis testing.
How to get a confidence interval (C.I.) for $\sigma^2$.
How to get a confidence interval for $\mu$ when $\sigma^2$ is unknown. (We have already learned C.I. construction for $\mu$ when $\sigma^2$ is known.)
Fundamental concepts of hypothesis testing.
Hypothesis test on the mean $\mu$ ($\sigma^2$ known or unknown).
Hypothesis test on the variance $\sigma^2$ ($\mu$ unknown).
Other often-used tests.
2 / 63
Interval Estimation of Variance
3 / 63
A point estimator of $\sigma^2$ is $S^2$.
To get an interval estimate of $\sigma^2$, we look at the distribution of the estimator $S^2$.
Theorem: Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with mean $\mu$ and variance $\sigma^2$. Then
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}.$$
4 / 63
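Density of a $\chi^2_5$ distribution
The red vertical lines are the $\chi^2_{5,0.975}$ and $\chi^2_{5,0.025}$ cutoffs.
$$P\left(X \ge \chi^2_{5,0.975}\right) = 0.975, \qquad P\left(X \ge \chi^2_{5,0.025}\right) = 0.025$$
[Figure: density $f(X)$ of the $\chi^2_5$ distribution for $X$ from 0 to 15, with the two cutoffs marked by red vertical lines.]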
5 / 63
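By looking at a $\chi^2$-table, we can find a C.I. for $\sigma^2$ as follows.
$$P\left(\chi^2_{n-1,1-\alpha/2} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{n-1,\alpha/2}\right) = 1-\alpha$$
$$P\left(\frac{1}{\chi^2_{n-1,\alpha/2}} \le \frac{\sigma^2}{(n-1)S^2} \le \frac{1}{\chi^2_{n-1,1-\alpha/2}}\right) = 1-\alpha$$
$$P\left(\frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}\right) = 1-\alpha$$
Thus, for a C.I. for $\sigma^2$ with confidence level $100(1-\alpha)\%$,
$$L_1 = \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}} \quad \text{and} \quad L_2 = \frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}.$$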
6 / 63
Based on the previous derivations, we have the following
theorem.
Theorem: Let $X_1, X_2, \ldots, X_n$ be a random sample of size n from a normal distribution with mean $\mu$ and variance $\sigma^2$. The confidence interval with confidence level $100(1-\alpha)\%$ for $\sigma^2$ is
$$\left[L_1 = \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}},\; L_2 = \frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}\right].$$
7 / 63
Example 1: A random sample with n = 10 and S = 0.7 is taken from a normal population. Find a 90% C.I. for $\sigma$.
Solution:
$\alpha = 0.1 \Rightarrow \alpha/2 = 0.05$.
The degrees of freedom are $n - 1 = 9$.
From the $\chi^2$-table, we have $\chi^2_{9,0.05} = 16.9$ and $\chi^2_{9,0.95} = 3.33$.
For $\sigma^2$, the 90% C.I. is
$$\left[L_1 = \frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}},\; L_2 = \frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}\right] = \left[\frac{9(0.7)^2}{16.9},\; \frac{9(0.7)^2}{3.33}\right] = [0.261, 1.324].$$
Thus, for $\sigma$, the 90% C.I. is
$[\sqrt{0.261}, \sqrt{1.324}] = [0.511, 1.151]$.
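As a quick numerical check of Example 1, the following Python sketch (not part of the original slides) recomputes the interval. The function name `variance_ci` is hypothetical, and the two chi-square cutoffs are the table values quoted in the solution, not computed quantiles.

```python
import math

def variance_ci(n, s2, chi2_upper, chi2_lower):
    """C.I. for sigma^2: [(n-1)S^2 / chi2_{n-1,a/2}, (n-1)S^2 / chi2_{n-1,1-a/2}]."""
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Example 1: n = 10, S = 0.7; table values chi2_{9,0.05} = 16.9, chi2_{9,0.95} = 3.33
var_lo, var_hi = variance_ci(10, 0.7 ** 2, 16.9, 3.33)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)  # take square roots for sigma

print(f"C.I. for variance: [{var_lo:.3f}, {var_hi:.3f}]")
print(f"C.I. for std dev:  [{sd_lo:.3f}, {sd_hi:.3f}]")
```

Note that the larger cutoff goes into the lower limit, because it appears in the denominator.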


8 / 63
Example 2: Highway engineers have found that the ability to
see and read a sign at night depends in part on its surround
luminance. That is, it depends on the light intensity near the
sign. These data are obtained on the surround luminance (in
candela per square meter) of 30 randomly selected highway
signs in a large metropolitan area.
10.9 1.7 9.5 2.9 9.1 3.2
9.1 7.4 13.3 13.1 6.6 13.7
1.5 6.3 7.4 9.9 13.6 17.3
3.6 4.9 13.1 7.8 10.3 10.3
9.6 5.7 2.6 15.1 2.9 16.2
We have $S^2 = 20.43$, $\chi^2_{29,0.05} = 42.6$, and $\chi^2_{29,0.95} = 17.7$.
Find a 90% confidence interval for the standard deviation in surround luminance.
9 / 63
Solution:
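The sample size is n = 30.
$$S^2 = \frac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)} = \frac{30(2821.56) - (258.6)^2}{30(30-1)} \approx 20.43.$$
$\alpha = 0.1$. We have $\chi^2_{29,0.05} = 42.6$ and $\chi^2_{29,0.95} = 17.7$.
The 90% confidence interval for $\sigma^2$ is
$$\left[\frac{(n-1)S^2}{\chi^2_{n-1,\alpha/2}},\; \frac{(n-1)S^2}{\chi^2_{n-1,1-\alpha/2}}\right] = \left[\frac{29(20.43)}{42.6},\; \frac{29(20.43)}{17.7}\right] = [13.91, 33.47].$$
Thus, the 90% confidence interval for $\sigma$ is
$[\sqrt{13.91}, \sqrt{33.47}] = [3.73, 5.79]$.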
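The computation in Example 2 can be reproduced from the raw data. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the chi-square cutoffs are again the table values given in the example.

```python
import math
import statistics

luminance = [
    10.9, 1.7, 9.5, 2.9, 9.1, 3.2,
    9.1, 7.4, 13.3, 13.1, 6.6, 13.7,
    1.5, 6.3, 7.4, 9.9, 13.6, 17.3,
    3.6, 4.9, 13.1, 7.8, 10.3, 10.3,
    9.6, 5.7, 2.6, 15.1, 2.9, 16.2,
]
n = len(luminance)
s2 = statistics.variance(luminance)  # sample variance (divisor n - 1)

chi2_upper, chi2_lower = 42.6, 17.7  # chi2_{29,0.05} and chi2_{29,0.95} from the table
var_ci = ((n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower)
sd_ci = tuple(math.sqrt(v) for v in var_ci)

print(f"S^2 = {s2:.2f}")
print(f"90% C.I. for variance: [{var_ci[0]:.2f}, {var_ci[1]:.2f}]")
print(f"90% C.I. for std dev:  [{sd_ci[0]:.2f}, {sd_ci[1]:.2f}]")
```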


10 / 63
Interval Estimation of Mean When Variance is
Unknown.
11 / 63
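When X follows a normal distribution and $\sigma^2$ is known, we have
$$\bar{X} \sim N(\mu, \sigma^2/n) \quad \text{or} \quad \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
When X follows a normal distribution and $\sigma^2$ is unknown, we have
$$\frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_{n-1}.$$
$X \sim t_n$: X follows a t distribution with n degrees of freedom.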
12 / 63
Gosset discovered the t-distribution in 1908 when he was working at a brewery. The name "Student's t-distribution" comes from his pen name, "Student".
[Figure: densities $f(x)$ of the standard normal, t(2), t(5), and t(10) distributions for $x$ from -4 to 4.]
13 / 63
Properties of the $t_n$ distribution:
$t_n$ denotes the t distribution with n degrees of freedom; n can be any positive integer.
The density of the $t_n$ distribution is a bell-shaped curve, symmetric around 0.
Compared to the normal distribution, $t_n$ has heavier tails.
As n becomes bigger, the t distribution converges to the standard normal distribution.
14 / 63
The t distribution is a derived distribution.
Theorem: Let Z be a standard normal random variable and let $\chi^2_n$ be a chi-square random variable with n degrees of freedom, independent of Z. Then the random variable
$$t = \frac{Z}{\sqrt{\chi^2_n / n}}$$
follows a t distribution with n degrees of freedom.
15 / 63

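What is the distribution of $\dfrac{\bar{X}-\mu}{S/\sqrt{n}}$?
$$\frac{\bar{X}-\mu}{S/\sqrt{n}} = \frac{(\bar{X}-\mu)/(\sigma/\sqrt{n})}{\sqrt{\left[(n-1)S^2/\sigma^2\right]/(n-1)}}$$
Numerator: $\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} = Z \sim N(0, 1)$.
Denominator: $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$.
Thus, $\dfrac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_{n-1}$.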
16 / 63
Confidence interval on the mean when the variance is unknown.
Theorem: Let $X_1, X_2, \ldots, X_n$ be a random sample for X from a normal distribution, or a sample with size n large enough. Suppose the mean $\mu$ and variance $\sigma^2$ are both unknown. A $100(1-\alpha)\%$ confidence interval for $\mu$ is given by
$$\bar{X} \pm t_{n-1,\alpha/2}\, S/\sqrt{n}.$$
17 / 63
Example: Sulfur dioxide and nitrogen oxide are both products
of fossil fuel consumption. These compounds can be carried
long distances and converted to acid before being deposited in
the form of acid rain. These data were obtained on the
sulfur dioxide concentration (in micrograms per cubic meter) in
a Bavarian forest thought to have been damaged by acid rain:
52.7 43.9 41.7 71.5 47.6 55.1
62.2 56.5 33.4 61.8 54.3 50.0
45.3 63.4 53.9 65.5 66.6 70.0
52.4 38.6 46.1 44.4 60.7 56.4
Give a 95% C.I. on the mean sulfur dioxide concentration in
this forest.
18 / 63
Solution: Based on the 24 observations, we have
$\bar{x} \approx 53.92$, $s^2 \approx 101.48$, $s \approx 10.07$.
The bounds for the confidence interval are
$$\bar{x} \pm t_{n-1,\alpha/2}\, s/\sqrt{n} = 53.92 \pm 2.069(10.07)/\sqrt{24}.$$
Thus, we are 95% confident that the mean sulfur dioxide
concentration in this forest lies in the interval [49.67, 58.17].
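The summary statistics and interval for the sulfur dioxide data can be checked with a short Python sketch. It is an illustration only: the critical value 2.069 is $t_{23,0.025}$ as quoted in the solution (a table value, not computed here), and the variable names are made up for this example.

```python
import math
import statistics

so2 = [
    52.7, 43.9, 41.7, 71.5, 47.6, 55.1,
    62.2, 56.5, 33.4, 61.8, 54.3, 50.0,
    45.3, 63.4, 53.9, 65.5, 66.6, 70.0,
    52.4, 38.6, 46.1, 44.4, 60.7, 56.4,
]
n = len(so2)
x_bar = statistics.mean(so2)
s = statistics.stdev(so2)   # sample standard deviation (divisor n - 1)

t_crit = 2.069              # t_{23,0.025} from the t-table
half_width = t_crit * s / math.sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}")
print(f"95% C.I. for the mean: [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Because the script keeps full precision for the mean and standard deviation, its endpoints can differ from the slide's in the last digit, since the slide plugs in the rounded values 53.92 and 10.07.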
19 / 63
Exercise: The super gopher is a device invented to drill
through arctic pack ice. It is a cone-shaped apparatus 5 feet
high, 4 feet wide, and wound with a copper coil. Water heated
to 180°F is pumped through the coil. This allows the gopher
to melt a vertical round shaft through the ice. Let X denote
the distance or depth that the gopher can drill per hour, and
assume it follows a normal distribution. These data are
obtained on 10 test holes (depth is in feet):
2.0 1.7 2.6 1.5 1.4
2.1 3.0 2.5 1.8 1.4
Find a 90% confidence interval on the average distance that
can be drilled in an hour.
20 / 63
Solution: Let X denote the random variable associated with the distance that can be drilled in an hour. The random sample has n = 10 observations. Based on the sample, we have
$$\bar{x} = 2, \quad s^2 = \frac{10(42.72) - 20^2}{10(9)} = 0.302, \quad s = 0.55.$$
The bounds for the 90% confidence interval are
$$\bar{x} \pm t_{n-1,\alpha/2}\, s/\sqrt{n} = 2 \pm 1.833(0.55)/\sqrt{10} = 2 \pm 0.32.$$
We are 90% confident that the average distance that can be drilled in an hour is between 1.68 and 2.32 feet.
21 / 63
Concepts of Hypothesis Testing
22 / 63
Examples:
1. A coin has probability p of landing heads. From past experience, we guess p is 0.3. We want to test if this is true.
Test if $p = 0.3$ or $p \ne 0.3$.
$H_0: p = 0.3$ vs. $H_1: p \ne 0.3$.
2. A recent survey of 20 families in England found that the birth rate for boys is more than 0.5. We want to test if this is true.
Test if $p > 0.5$ or $p \le 0.5$.
$H_0: p \le 0.5$ vs. $H_1: p > 0.5$.
23 / 63
In a hypothesis test, we have
$H_0$: null hypothesis; vs. $H_1$ (or $H_a$): alternative hypothesis.
Null space $\Theta_0$: parameter space defined in $H_0$;
Alternative space $\Theta_1$: parameter space defined in $H_1$.
$\Theta_0 \cap \Theta_1 = \emptyset$.
Question: How to form $H_0$ and $H_1$?
The statement of equality will always be included in $H_0$.
The opposite side we want to test is $H_1$.
What if there is no equality or two equalities? Not in this class.
(Understand this if you can, but it is not required.) $H_0$ often states something conventional; $H_1$ often states something surprising, new, or more interesting.
24 / 63
Based on observations (a random sample of size n), we make a decision to reject $H_0$ (accept $H_1$) or not to reject $H_0$.
                        Actual situation
Decision                $H_0$ is true    $H_1$ is true
Reject $H_0$            Wrong            Correct
Fail to reject $H_0$    Correct          Wrong
25 / 63
Wrong decisions in hypothesis testing:
Type I Error: Reject $H_0$ when $H_0$ is true. The probability of committing a type I error is called the significance level of the test, and is denoted by $\alpha$.
Type II Error: Fail to reject $H_0$ when $H_0$ is not true. The probability of committing a type II error is denoted by $\beta$.
Power: The power of a test is the probability of rejecting $H_0$ when $H_0$ is not true.
$$P(\text{Reject } H_0 \mid H_0 \text{ is not true}) + P(\text{Fail to reject } H_0 \mid H_0 \text{ is not true}) = \text{Power} + \beta = 1$$
$$\text{Power} = 1 - \beta.$$
26 / 63
Two types of errors and one power.
                        Actual situation
Decision                $H_0$ is true             $H_1$ is true
Reject $H_0$            Wrong: Type I Error       Correct: Power
                        (probability $\alpha$)    (probability $1-\beta$)
Fail to reject $H_0$    Correct                   Wrong: Type II Error
                                                  (probability $\beta$)
Requirement: Given the setting, calculate $\alpha$, $\beta$, and the power of the test. See the following examples.
27 / 63
Example: From a survey in England based on a report from 20
families (each with one child, and assume independence
among families), it is thought that the birth rate for a boy
child is more than 50%. Let X denote the number of boys among the 20 children, and let p denote the probability of giving birth to a boy. Then X follows a Binomial distribution with n = 20 and p.
28 / 63
Question 1: Form the null and the alternative hypotheses.
The null and alternative hypotheses are:
$H_0: p \le 0.5$ vs. $H_1: p > 0.5$.
Null space: $\Theta_0 = \{p \le 0.5\}$.
Alternative space: $\Theta_1 = \{p > 0.5\}$.
The true value of p is either in $\Theta_0$ or in $\Theta_1$ (either accept $H_0$ or reject $H_0$).
29 / 63
Question 2: We decide to reject $H_0$ if we observe $X \ge 14$ (14 or more boys among 20 randomly selected children). Compute the significance level $\alpha$ of the test if the true $p = 0.5 \in \Theta_0$.
$\alpha = P(\text{Type I Error})$
$= P(\text{Reject } H_0 \mid p = 0.5)$
$= P(X \ge 14 \mid p = 0.5)$
$= 1 - P(X \le 13 \mid p = 0.5)$
$= 1 - \sum_{i=0}^{13} \binom{20}{i} p^i (1-p)^{20-i}$
$= 0.0577$
30 / 63
Question 3: Suppose the true $p = 0.7 \in \Theta_1$ ($H_1$ is true). Compute the probability of committing a type II error.
$\beta = P(\text{Type II Error})$
$= P(\text{Fail to reject } H_0 \mid p = 0.7)$
$= P(X \le 13 \mid p = 0.7)$
$= 0.3920$
Question 4: Suppose the true p = 0.7. Compute the power of the test.
$\text{Power} = P(\text{Reject } H_0 \mid H_1 \text{ is true})$
$= P(\text{Reject } H_0 \mid p = 0.7)$
$= 1 - P(\text{Fail to reject } H_0 \mid p = 0.7)$
$= 1 - \beta$
$= 0.608$
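The three quantities in Questions 2 to 4 are all binomial tail probabilities, so they can be checked with a few lines of Python. This is a sketch using only `math.comb`; the helper `binom_cdf` is a name introduced here for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, cutoff = 20, 14                         # rule: reject H0 when X >= 14

alpha = 1 - binom_cdf(cutoff - 1, n, 0.5)  # P(X >= 14 | p = 0.5), type I error rate
beta = binom_cdf(cutoff - 1, n, 0.7)       # P(X <= 13 | p = 0.7), type II error rate
power = 1 - beta                           # P(X >= 14 | p = 0.7)

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Changing `cutoff` shows the trade-off: a larger cutoff lowers $\alpha$ but raises $\beta$.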
31 / 63
Question 5: The critical region (or rejection region) is the set of values of the test statistic that lead to rejection of $H_0$; it is often denoted by C. In this application, we use X as the test statistic and reject $H_0$ if $X \ge 14$. Write out the rejection (critical) region of the test.
The critical region is C = {14, 15, 16, 17, 18, 19, 20}.
32 / 63
Question 6: If $p = 0.5 \in \Theta_0$, and we want to limit $\alpha \le 0.05$, what would be the critical region?
We know that $H_0: p \le 0.5$ vs. $H_1: p > 0.5$. Thus, a large value of X will lead to rejection of $H_0$.
If we use X = 14 as the cutoff, then
$\alpha = P(\text{Type I Error}) = P(X \ge 14 \mid p = 0.5) = 0.0577$.
If we use X = 15 as the cutoff, then
$\alpha = P(\text{Type I Error}) = P(X \ge 15 \mid p = 0.5) = 0.0207$.
Thus, to restrict $\alpha$ to be less than 0.05 when the true p = 0.5, the critical region is C = {15, 16, 17, 18, 19, 20}.
33 / 63
Exercise: From past experience, about 70% of passengers check in luggage when flying. An airline company starts to run a few more short trips. We conjecture that passengers may be less likely to check in luggage if their journey is short. We collect observations from 10 passengers. Suppose each passenger checks in luggage with probability p; then the total number of check-ins X follows a Binomial distribution with n = 10 and p.
Question 1: Form the null and the alternative hypotheses.
Question 2: We reject $H_0$ and accept $H_1$ if we observe 5 or fewer passengers checking in. If the true $p = 0.8 \in \Theta_0$, find the probability of making a type I error. Can we make a type II error?
Question 3: Find the critical region for an $\alpha = 0.05$ level test. If x = 5, will $H_0$ be rejected?
34 / 63
Question 1: Form the null and the alternative hypotheses.
$H_0: p \ge 0.7$, $H_1: p < 0.7$.
Question 2: We reject $H_0$ and accept $H_1$ if we observe 5 or fewer passengers checking in. If the true $p = 0.8 \in \Theta_0$, find the probability of making a type I error. Can we make a type II error?
$\alpha = P(\text{Reject } H_0 \mid p \in \Theta_0)$
$= P(X \le 5 \mid p = 0.8)$
$= \sum_{i=0}^{5} \binom{10}{i} p^i (1-p)^{10-i}$
$= 0.0328$
We make a type II error if we fail to reject $H_0$ when $H_1$ is true. However, p = 0.8 is the case where $H_0$ is true ($p \in \Theta_0$) and $H_1$ is false ($p \notin \Theta_1$). Thus, no, we cannot make a type II error.
35 / 63
Question 3: Find the critical region for an $\alpha = 0.05$ level test when the true p = 0.8. If x = 5, will $H_0$ be rejected?
The hypotheses are: $H_0: p \ge 0.7$, $H_1: p < 0.7$. Thus, a small observation will favor $H_1$ and lead to rejection of $H_0$.
From Question 2, if we choose X = 5 as the cutoff, we have
$\alpha = P(X \le 5 \mid p = 0.8) = 0.0328 < 0.05$.
If we choose X = 6 as the cutoff, we have
$\alpha = P(X \le 6 \mid p = 0.8) = 0.1209 > 0.05$.
Thus, if we want to restrict the type I error rate (level of significance) to be less than or equal to 0.05, the critical region is
C = {0, 1, 2, 3, 4, 5}. Since x = 5 is in C, $H_0$ is rejected.
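The cutoff search in Question 3 can be automated. The sketch below is illustrative only: `left_tail_region` is a hypothetical helper, and it follows the exercise in evaluating the type I error rate at p = 0.8. It returns the largest left-tail region, starting from 0 check-ins, whose probability does not exceed $\alpha$.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def left_tail_region(n, p, alpha):
    """Largest set {0, ..., c} with P(X <= c | p) <= alpha (empty list if none)."""
    c = -1
    while c + 1 <= n and binom_cdf(c + 1, n, p) <= alpha:
        c += 1
    return list(range(c + 1))

region = left_tail_region(n=10, p=0.8, alpha=0.05)
print("critical region:", region)
print("x = 5 rejected?", 5 in region)
```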
36 / 63
Hypothesis Tests On The Mean
37 / 63
All forms of hypotheses on the mean of a distribution:
$H_0: \mu \le \mu_0$      $H_0: \mu \ge \mu_0$      $H_0: \mu = \mu_0$
$H_1: \mu > \mu_0$        $H_1: \mu < \mu_0$        $H_1: \mu \ne \mu_0$
Right-tailed test         Left-tailed test          Two-tailed test
38 / 63
Example 1: The maximum acceptable level for exposure to
microwave radiation in the United States is an average of 10
microwatts per square centimeter. It is feared that a large
television transmitter may be polluting the air nearby by
pushing the level of microwave radiation above the safe limit.
$H_0: \mu \le 10$; $H_1: \mu > 10$ (unsafe).
39 / 63
Example 2: Design engineers are working on a low-effort steering system that can be used in vans modified to fit the needs of disabled drivers. The old-type steering system required a force of 54 ounces to turn the van's 15-inch-diameter steering wheel. It is hoped that the new design will reduce the average force required to turn the wheel. In this case we are testing
$H_0: \mu \ge 54$; $H_1: \mu < 54$ (new system requires less force).
40 / 63
Example 3: A computer system currently has 10 terminals and
uses a single printer. The average turnaround time for the
system is 15 minutes. Ten new terminals and a second printer
are added to the system. We want to determine whether or
not the mean turnaround time is affected. To decide, we want
to test
$H_0: \mu = 15$; $H_1: \mu \ne 15$ (new equipment has an impact on turnaround time).
41 / 63
All forms of hypotheses on the mean of a distribution:
$H_0: \mu \le \mu_0$      $H_0: \mu \ge \mu_0$      $H_0: \mu = \mu_0$
$H_1: \mu > \mu_0$        $H_1: \mu < \mu_0$        $H_1: \mu \ne \mu_0$
Right-tailed test         Left-tailed test          Two-tailed test
Logic: (Very Important)
We want to control the type I error rate (significance level) of a test, so that $\alpha \le$ a pre-defined threshold, for example,
$\alpha = P(\text{type I error}) = 0.05$.
$\alpha = P(\text{type I error}) = P(\text{Reject } H_0 \mid \mu \in \Theta_0)$
42 / 63
We want to control
$\alpha = P(\text{type I error}) = P(\text{Reject } H_0 \mid \mu \in \Theta_0) \le$ pre-defined value.
We cannot compute $\alpha$ (type I error rate) if we do not know the true parameter value in $\Theta_0$. What to do?
When $\mu = \mu_0$ (the equality value or boundary value), $\alpha$ is maximized over all values $\mu \in \Theta_0$. Thus, we assume $\mu = \mu_0$ when controlling $\alpha$. Why? Because, in this way, $\alpha \le$ the pre-defined threshold for any $\mu \in \Theta_0$.
43 / 63
Based on the logic stated in the previous slides, when we control the type I error rate to decide whether to reject $H_0$, testing
$H_0: \mu \le \mu_0$      $H_0: \mu \ge \mu_0$      $H_0: \mu = \mu_0$
$H_1: \mu > \mu_0$        $H_1: \mu < \mu_0$        $H_1: \mu \ne \mu_0$
Right-tailed test         Left-tailed test          Two-tailed test
will have the same type I error rate (rejection region, decision) as testing
$H_0: \mu = \mu_0$        $H_0: \mu = \mu_0$        $H_0: \mu = \mu_0$
$H_1: \mu > \mu_0$        $H_1: \mu < \mu_0$        $H_1: \mu \ne \mu_0$
Right-tailed test         Left-tailed test          Two-tailed test
44 / 63
Example 1: The maximum acceptable level for exposure to
microwave radiation in the United States is an average of 10
microwatts per square centimeter. It is feared that a large
television transmitter may be polluting the air nearby by
pushing the level of microwave radiation above the safe limit.
The hypotheses are $H_0: \mu \le 10$ and $H_1: \mu > 10$. Suppose we took a random sample of size n = 25 and observed $\bar{x} = 10.3$ and s = 2. Decide whether to reject $H_0$, allowing no more than a 0.05 probability of wrongly rejecting $H_0$.
45 / 63
$H_0: \mu \le 10$ and $H_1: \mu > 10$
Solution:
First of all, we assume $H_0$ is true. Under this assumption:
If $\bar{X}$ is large, we will favor $H_1$ and reject $H_0$.
If $\bar{X} - \mu$ ($\mu = 10$) is large, we will favor $H_1$ and reject $H_0$.
If $\dfrac{\bar{X}-\mu}{S/\sqrt{n}}$ ($\mu = 10$) is large, we will favor $H_1$ and reject $H_0$.
We know the distribution of $\dfrac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_{n-1}$.
So we want to find the critical value c such that
$$\alpha = P\left(\frac{\bar{X}-\mu}{S/\sqrt{n}} > c \,\Big|\, \mu = 10\right) \le 0.05.$$
$c = t_{n-1,0.05}$. That is, we will reject $H_0$ if and only if $\dfrac{\bar{x}-10}{s/\sqrt{n}} > t_{n-1,0.05}$. In this way, $\alpha \le 0.05$.
(Reminder: (1) $P(t > t_{n-1,0.05}) = 0.05$; $t_{n-1,0.05}$ is the t-score for the upper 0.05 tail. (2) $P(t < t_{n-1,0.05}) = 0.95$.)
46 / 63
We will reject $H_0$ if and only if $\dfrac{\bar{x}-10}{s/\sqrt{n}} > t_{n-1,0.05}$. Since $\bar{x} = 10.3$, s = 2 and n = 25, we have
$$\frac{\bar{x}-10}{s/\sqrt{n}} = \frac{10.3-10}{2/\sqrt{25}} = 0.75.$$
$t_{25-1,0.05} = 1.711$.
Since 0.75 < 1.711, with the significance level controlled at the 0.05 level, we cannot reject $H_0$.
This example tells us: when testing $\mu$, we use $\dfrac{\bar{x}-\mu}{s/\sqrt{n}}$ as the test statistic ($\sigma^2$ unknown), because we know its distribution, which helps to find the rejection region when controlling the
type I error rate.
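The decision rule for the microwave radiation example amounts to one comparison. A minimal Python sketch follows; the function name `t_statistic` is made up for illustration, and the critical value 1.711 is the table value $t_{24,0.05}$ quoted above.

```python
import math

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / math.sqrt(n))

t_obs = t_statistic(x_bar=10.3, mu0=10, s=2, n=25)
t_crit = 1.711               # t_{24,0.05} from the t-table
reject_h0 = t_obs > t_crit   # right-tailed test: reject only for large t

print(f"t = {t_obs:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```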
47 / 63
Example 2: A computer system currently has 10 terminals and
uses a single printer. The average turnaround time for the
system is 15 minutes. Ten new terminals and a second printer
are added to the system. We want to determine whether or
not the mean turnaround time is affected. To decide, we want
to test
$H_0: \mu = 15$, $H_1: \mu \ne 15$.
To test, we take a random sample of size n = 30 and observe $\bar{x} = 14$ and s = 3. Make a decision while keeping the probability of wrongly rejecting $H_0$ at no more than 0.1 (that is to say, if we reject $H_0$, we have at least 90% confidence that $H_0$ is indeed not true).
48 / 63
$H_0: \mu = 15$, $H_1: \mu \ne 15$.
Solution:
First of all, we assume $H_0$ is true. Under this assumption:
If $\bar{X}$ is either too small or too big compared to 15, reject $H_0$.
If $|\bar{X} - \mu|$ ($\mu = 15$) is much bigger than 0, reject $H_0$.
If $\left|\dfrac{\bar{X}-\mu}{S/\sqrt{n}}\right|$ ($\mu = 15$) is much bigger than 0, reject $H_0$.
To control the type I error rate at the 0.1 level, we find the critical value c such that
$$\alpha = P\left(\left|\frac{\bar{X}-\mu}{S/\sqrt{n}}\right| > c \,\Big|\, \mu = 15\right) \le 0.1.$$
Since $\dfrac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_{n-1}$, $c = t_{n-1,0.05} = 1.699$. Plugging in $\bar{x} = 14$, s = 3, n = 30, we have $\left|\dfrac{\bar{x}-\mu}{s/\sqrt{n}}\right| = 1.83 > c$. Thus, we reject $H_0$ and accept $H_1$ with 90% confidence.
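For the two-tailed case the only change is that the absolute value of the statistic is compared with the upper $\alpha/2$ cutoff. A brief Python sketch under the same assumptions as the slide (the critical value 1.699 is the table value $t_{29,0.05}$; variable names are illustrative):

```python
import math

x_bar, mu0, s, n = 14, 15, 3, 30

t_obs = abs(x_bar - mu0) / (s / math.sqrt(n))  # |t| for a two-tailed test
t_crit = 1.699                                 # t_{29,0.05}: alpha/2 = 0.05 in each tail
reject_h0 = t_obs > t_crit

print(f"|t| = {t_obs:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```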
49 / 63
Example 3: The Elbe River is important in the ecology of
central Europe, as it drains much of this region. Due to
increased industrialization, it is feared that the mineral content
in the soil is being depleted. This will be reflected in an
increase in the level of certain minerals in the water of the
Elbe. A study of the river conducted in 1982 indicated that
the mean silicon level was 4.6 mg/l. The hypotheses we want
to test are:
$H_0: \mu = 4.6$ mg/l, $H_1: \mu > 4.6$ mg/l.
Suppose a random sample of size 28 yields $\bar{x} = 5.2$, and the standard deviation is known to be $\sigma = 1.6$. What is the decision of the test if we want to control the significance level at 0.05?
50 / 63
$H_0: \mu = 4.6$ mg/l, $H_1: \mu > 4.6$ mg/l.
Solution:
Under the assumption that $H_0$ is true:
If $\bar{X}$ is much bigger than 4.6, reject $H_0$.
If $\bar{X} - 4.6$ is much bigger than 0, reject $H_0$.
If $\dfrac{\bar{X}-4.6}{\sigma/\sqrt{n}}$ is much bigger than 0, reject $H_0$.
To control the type I error rate at the 0.05 level, we find the critical value c such that
$$P\left(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} > c \,\Big|\, \mu = 4.6\right) \le 0.05.$$
Since $\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0, 1)$, $c = z_{0.05} = 1.65$. Plugging in the observed values, we have
$$\frac{\bar{x}-\mu}{\sigma/\sqrt{n}} = \frac{5.2-4.6}{1.6/\sqrt{28}} = 1.98 > c.$$
Thus, we reject $H_0$.
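When $\sigma$ is known, the critical value can be computed directly instead of being read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; it yields about 1.645 where the slide rounds to 1.65. Variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, mu0, sigma, n = 5.2, 4.6, 1.6, 28
alpha = 0.05

z_obs = (x_bar - mu0) / (sigma / math.sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-alpha point of N(0, 1)
reject_h0 = z_obs > z_crit                # right-tailed z-test

print(f"z = {z_obs:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```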
51 / 63
Hypothesis test on $\mu$:
When $\sigma^2$ is known, use the test statistic $\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}$ and the z-test;
When $\sigma^2$ is unknown, use the test statistic $\dfrac{\bar{X}-\mu}{S/\sqrt{n}}$ and the t-test.
Assume $H_0$ is true:
In a right-tailed test, a test statistic that is too big leads to rejection;
In a left-tailed test, a test statistic that is too small leads to rejection;
In a two-tailed test, a test statistic that is either too big or too small leads to rejection.
Critical value (c): the threshold for deciding whether to reject $H_0$. c is determined by the $\alpha$ level.
Why control $\alpha$ (type I error) instead of $\beta$ (type II error)?
52 / 63
Hypothesis Tests On The Variance
53 / 63
Different forms of hypothesis tests on the variance of a distribution.
$H_0: \sigma^2 \le \sigma_0^2$      $H_0: \sigma^2 \ge \sigma_0^2$      $H_0: \sigma^2 = \sigma_0^2$
$H_1: \sigma^2 > \sigma_0^2$        $H_1: \sigma^2 < \sigma_0^2$        $H_1: \sigma^2 \ne \sigma_0^2$
Right-tailed test                   Left-tailed test                    Two-tailed test
The testing procedures are the same as for the following three forms, respectively.
$H_0: \sigma^2 = \sigma_0^2$        $H_0: \sigma^2 = \sigma_0^2$        $H_0: \sigma^2 = \sigma_0^2$
$H_1: \sigma^2 > \sigma_0^2$        $H_1: \sigma^2 < \sigma_0^2$        $H_1: \sigma^2 \ne \sigma_0^2$
Right-tailed test                   Left-tailed test                    Two-tailed test
54 / 63
Example: Engineers designing the front-wheel-drive half-shaft claim that $\sigma < 1.5$ for the displacement of the shaft. Based on the given n = 20 observations, s = 1.41 millimeters. Do these data support the contention of the engineers?
This is to test:
$H_0: \sigma = 1.5$, $H_1: \sigma < 1.5$.
This is equivalent to testing
$H_0: \sigma^2 = (1.5)^2$, $H_1: \sigma^2 < (1.5)^2$.
Think: can we accept $H_1$ because s = 1.41 < 1.5? No. Why?
55 / 63
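$H_0: \sigma^2 = (1.5)^2$, $H_1: \sigma^2 < (1.5)^2$.
Solution:
Assume the null hypothesis is true ($\sigma^2 = 1.5^2$):
If $S^2$ is much smaller than $(1.5)^2$, reject $H_0$.
If $S^2/\sigma^2$ is much smaller than 1, reject $H_0$.
If $\dfrac{(n-1)S^2}{\sigma^2}$ is much smaller than $n-1$, reject $H_0$.
We know: $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$. Find the critical value c such that
$$P\left(\frac{(n-1)S^2}{\sigma^2} < c \,\Big|\, \sigma = 1.5\right) \le 0.05.$$
Since c must cut off the lower 0.05 tail, $c = \chi^2_{20-1,0.95} = 10.1$. We observed $\dfrac{(n-1)s^2}{\sigma^2} = \dfrac{19(1.41)^2}{(1.5)^2} = 16.79 > 10.1$.
Thus, we are unable to reject $H_0$.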
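The half-shaft test can be verified numerically. In this Python sketch the critical value 10.1 is the table value from the solution (the lower 5% point of the chi-square distribution with 19 degrees of freedom); the variable names are illustrative.

```python
n, s, sigma0 = 20, 1.41, 1.5

chi2_obs = (n - 1) * s**2 / sigma0**2  # test statistic (n-1)S^2 / sigma0^2
chi2_crit = 10.1                       # lower 5% point of chi-square with 19 d.f. (table)
reject_h0 = chi2_obs < chi2_crit       # left-tailed test: reject only for small values

print(f"chi2 = {chi2_obs:.2f}, critical value = {chi2_crit}, reject H0: {reject_h0}")
```

This also answers the "Think" question: s = 1.41 is below 1.5, but not by enough to rule out sampling variability at the 0.05 level.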
56 / 63
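Hypothesis test $\Leftrightarrow$ Confidence interval:
Relationship: A two-tailed test reverses the procedure for constructing a confidence interval.
For example: Test $H_0: \mu = \mu_0$, $H_1: \mu \ne \mu_0$ when $\sigma^2$ is known.
Hypothesis test: Fail to reject $H_0$ at level $\alpha$ if $\left|\dfrac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}\right| \le z_{\alpha/2}$.
That is, fail to reject $H_0$ if
$$-z_{\alpha/2} \le \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \le z_{\alpha/2}.$$
That is, fail to reject $H_0$ if
$$\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu_0 \le \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}.$$
This is the confidence interval for $\mu$. This is to say, if the $100(1-\alpha)\%$ C.I. for $\mu$ does not cover the boundary value $\mu_0$ in the hypothesis test, we will reject $H_0$ at level $\alpha$.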
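The equivalence can be checked numerically. The sketch below is an illustration, not part of the original slides: the function `two_sided_z` is hypothetical, and the numbers ($\bar{x} = 5.2$, $\sigma = 1.6$, n = 28) are borrowed from the Elbe River example and reused here in a two-tailed setting with two candidate values of $\mu_0$.

```python
import math
from statistics import NormalDist

def two_sided_z(x_bar, mu0, sigma, n, alpha):
    """Return (reject H0?, C.I. for mu, does the C.I. cover mu0?)."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    se = sigma / math.sqrt(n)
    reject = abs(x_bar - mu0) / se > z_half
    ci = (x_bar - z_half * se, x_bar + z_half * se)
    covers = ci[0] <= mu0 <= ci[1]
    return reject, ci, covers

for mu0 in (4.6, 5.0):
    reject, ci, covers = two_sided_z(5.2, mu0, 1.6, 28, 0.05)
    print(f"mu0 = {mu0}: reject = {reject}, C.I. = [{ci[0]:.2f}, {ci[1]:.2f}], covers = {covers}")
```

In both cases the test rejects exactly when the interval fails to cover $\mu_0$.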
57 / 63
Other Testing Methods:
Sign Test and Wilcoxon Signed-Rank Test
The idea is the same as with parametric methods:
If the null hypothesis is true, how likely is it that the observed test statistic was sampled from the null distribution?
If this probability is too small ($\le \alpha$ level), then we reject $H_0$ and accept $H_1$. Otherwise, we fail to reject $H_0$.
58 / 63
Sign Test for Median
Example: A standard method for completing a task on an
assembly line yields a median completion time of 55 seconds.
A new procedure is developed that should reduce the median
time required. Use M to denote the median. We want to test
H
0
: M = 55, H
1
: M < 55.
To do so, 15 subjects are asked to complete the task, and
these observations are obtained on the random variable X, the
time required: 35, 65, 48, 40, 70, 50, 58, 36, 47, 41, 49, 39,
34, 33, 31.
59 / 63
Solution: Q is the test statistic for the sign test, where Q = (the number of positive differences $X_i - M$ among all observations). Under the null hypothesis, X has an equal chance of falling below 55 or above 55. Thus, the differences $X_i - M$ should be about half positive and half negative if the null hypothesis is true, so that $Q \sim \text{Binomial}(n, 1/2)$. Among the 15 observations, Q = 3; that is, only 3 differences are positive. We have
$$P(Q \le 3 \mid M = 55) = \sum_{i=0}^{3} \binom{15}{i} (1/2)^i (1/2)^{15-i} = 0.0176.$$
If $H_0$ is true, there is only a 0.0176 probability of observing a value of Q no more than 3. Thus, at the level $\alpha = 0.05$, we reject $H_0$, and conclude that the median is less than 55.
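The sign test is simple enough to compute by hand, but a short Python sketch makes the two steps explicit: count the positive differences, then evaluate the Binomial(n, 1/2) lower tail. Variable names are illustrative, and no observation here equals 55, so tie handling is not addressed.

```python
from math import comb

times = [35, 65, 48, 40, 70, 50, 58, 36, 47, 41, 49, 39, 34, 33, 31]
m0 = 55                                # median under H0
n = len(times)

q = sum(1 for x in times if x > m0)    # number of positive differences x - m0

# P(Q <= q) under H0, where Q ~ Binomial(n, 1/2)
p_lower = sum(comb(n, i) for i in range(q + 1)) / 2**n
reject_h0 = p_lower <= 0.05

print(f"Q = {q}, P(Q <= {q}) = {p_lower:.4f}, reject H0: {reject_h0}")
```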
60 / 63
Wilcoxon Signed-Rank Test
Example: The melting point for a new lightweight material
designed for use in automobile interiors is being investigated.
It is known that due to impurities in the material, the melting
point is a random variable uniformly distributed over a small
temperature interval. It is thought that the median melting
point is less than 120°C. Do these data support this contention?
Observations: 115.1, 117.8, 116.5, 121.0, 120.3, 119.0, 119.8,
118.5.
61 / 63
Solution: We want to test $H_0: M = 120$ vs. $H_1: M < 120$.
$x_i$            115.1  120.3  117.8  119.0  116.5  119.8  121.0  118.5
$x_i - 120$      -4.9   0.3    -2.2   -1.0   -3.5   -0.2   1.0    -1.5
$|x_i - 120|$    4.9    0.3    2.2    1.0    3.5    0.2    1.0    1.5
Rank             8      2      6      3.5    7      1      3.5    5
We use the test statistic $W_+ = \sum_{i:\, X_i - M > 0} R_i$, the sum of the ranks $R_i$ of the positive differences $X_i - M$. If $W_+$ is too small, we favor $H_1$ and reject $H_0$. We have
$W_+ = 2 + 3.5 = 5.5$.
(1) When n is small, use the Wilcoxon table to find the critical value.
(2) When n is large, we have approximately
$$W_+ \sim N\left(\frac{n(n+1)}{4},\; \frac{n(n+1)(2n+1)}{24}\right).$$
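The ranking step, including the average rank 3.5 for the tied absolute differences, can be reproduced with the following Python sketch. The helper `average_ranks` is a hypothetical name; the observations are listed in the order that matches the differences row of the table, and differences are rounded to one decimal to avoid floating-point noise.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1       # rank of the first occurrence
        count = ordered.count(v)           # number of tied values
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

melting = [115.1, 120.3, 117.8, 119.0, 116.5, 119.8, 121.0, 118.5]
m0 = 120                                    # median under H0

diffs = [round(x - m0, 1) for x in melting]
ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)  # ranks of positive differences

n = len(melting)
mean_w = n * (n + 1) / 4                    # mean of W+ under H0
var_w = n * (n + 1) * (2 * n + 1) / 24      # variance of W+ under H0

print("ranks:", ranks)
print(f"W+ = {w_plus}, E[W+] = {mean_w}, Var[W+] = {var_w}")
```

With n = 8 the normal approximation is rough, so the Wilcoxon table would normally be used for the decision.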
62 / 63
Chapter Summary
Confidence intervals for $\mu$ and $\sigma^2$;
Concepts of hypothesis testing: two types of errors and one power; rejection region; critical value.
Hypothesis tests on $\mu$ and $\sigma^2$ given a fixed significance level.
63 / 63
