4.hypothesis Testing

Hypothesis Testing
Central Limit Theorem
• As more and more samples are taken from a

population, the distribution of the averages of
the samples follows the normal distribution.
The average of the samples comes arbitrarily
close to the average of the entire population.
• Normal distribution is described by the mean
(average count) and the standard deviation
(clustering around the mean)
P- Values and Q-Values
• The null hypothesis can be quantified

• The p-value is the probability that the null
hypothesis is true
• When the null hypothesis is true, nothing is really
happening; differences are due to chance
• Confidence, the reverse of a p-value, is called the q-
value. p-value = 5% then the q-value (confidence) is
95%.
• Example: Bush/Kerry…p-value 60% or 5%
Nonstatistical Hypothesis Testing…
Critical concepts:
1. There are two hypotheses, the null and the alternative hypotheses.
2. The procedure begins with the assumption that the null hypothesis
is true.
3. The goal is to determine whether there is enough evidence to infer
that the alternative hypothesis is true, or the null is not likely to
be true.
4. There are two possible decisions:
Conclude that there is enough evidence to support the alternative
hypothesis. Reject the null.
Conclude that there is not enough evidence to support the
alternative hypothesis. Fail to reject the null.
11.4
Concept of hypothesis testing
• The two hypotheses are called the null hypothesis and

the other the alternative or research hypothesis. The
usual notation is:
Pronounced H “nought”
• H0: — the ‘null’ hypothesis
• H1: — the ‘alternative’ or ‘research’ hypothesis
• The null hypothesis (H0) will always state that the

parameter equals the value specified in the alternative
hypothesis (H1)
Types of Errors
• A Type I error occurs when we reject a true null

hypothesis (i.e. Reject H0 when it is TRUE)
H0 T F
Reject I
Reject II
• A Type II error occurs when we don’t reject a false

null hypothesis (i.e. Do NOT reject H0 when it is
FALSE)
11.6
Hypothesis Testing - Example
Consider mean demand for computers in a sales

store.
Rather than estimate the mean demand, operations

manager wants to know whether the mean is
different from 350 units.
In other words, someone is claiming that the mean

time is 350 units and manager wants to check this
claim out to see if it appears reasonable.
We can rephrase this request into a test of the

hypothesis:
H0: = 350
Thus, our research hypothesis becomes:
H1: ≠ 350
Let the standard deviation [σ] can be assumed

to be 75, the sample size [n] as 25, and the
sample mean [ ] was calculated to be 370.16
For example, if we’re trying to decide whether the

mean is not equal to 350, a large value of (say, 600)
would provide enough evidence.
If is close to 350 (say, 355) we could not say that this

provides a great deal of evidence to infer that the
population mean is different than 350.
Two possible decisions that can be made:
Conclude that there is enough evidence to support the alternative

hypothesis
(also stated as: reject the null hypothesis in favor of the
alternative)
Conclude that there is not enough evidence to support the

alternative hypothesis
(also stated as: failing to reject the null hypothesis in favor of the
alternative)
• The testing procedure begins with the assumption

that the null hypothesis is true.
• Thus, until we have further statistical evidence, we
will assume:
• H0: = 350 (assumed to be TRUE)
• The next step will be to determine the sampling
distribution of the sample mean assuming the
true mean is 350.
• is normal with 350
• 75/SQRT(25) = 15
Is the Sample Mean in the Guts of the Sampling
Distribution??
First way
1. Unstandardized test statistic: Is in the guts of
the sampling distribution? Depends on what you
define as the “guts” of the sampling distribution.
• If we define the guts as the center 95% of the
distribution [this means  = 0.05], then the
critical values that define the guts will be 1.96
standard deviations of X-Bar on either side of the
mean of the sampling distribution [350], or
• UCV = 350 + 1.96*15 = 350 + 29.4 = 379.4
• LCV = 350 – 1.96*15 = 350 – 29.4 = 320.6
First way
Second way
• Standardized test statistic: Since we defined the “guts” of
the sampling distribution to be the center 95% [ = 0.05],
• If the Z-Score for the sample mean is greater than 1.96,
we know that will be in the reject region on the right
side or
• If the Z-Score for the sample mean is less than -1.97, we
know that will be in the reject region on the left side.
• Z=( - )/ ) = (370.16 – 350)/15 = 1.344
• Is this Z-Score in the guts of the sampling distribution???
11.15
Second way
Third way
• 3. The p-value approach (which is generally used with a computer

and statistical software): Increase the “Rejection Region” until it
“captures” the sample mean.
• For this example, since is to the right of the mean, calculate

P( > 370.16) = P(Z > 1.344) = 0.0901
• Since this is a two tailed test, this area is doubled for the p-value.
p-value = 2*(0.0901) = 0.1802
• Since we defined the guts as the center 95% [ = 0.05], the reject
region is the other 5%. Since our sample mean, , is in the 18.02%
region, it cannot be in our 5% rejection region [ = 0.05].
11.17
Third way
Conclusions of hypothesis tests
• Unstandardized Test Statistic:

Since LCV (320.6) < (370.16) < UCV (379.4), we fail
reject the null hypothesis at a 5% level of significance.
• Standardized Test Statistic:
Since -Z/2(-1.96) < Z(1.344) < Z/2 (1.96), we fail to reject
the null hypothesis at a 5% level of significance.
• P-value:
Since p-value (0.1802) > 0.05 [], we fail to reject the hull
hypothesis at a 5% level of significance.
Second Example
• The system will be cost effective if the mean account balance for
each customer is greater than INR 170.
• We express this belief as a our research hypothesis, that is:

H1: > 170 (this is what we want to determine)
• Thus, our null hypothesis becomes:

H0: <= 170
Second Example
• What we want to show:

H1: μ > 170
H0: μ < 170 (we’ll assume this is true)
• Normally we put Ho first.
• Given:
n = 400,
= 178, and
= 65
= 65/SQRT(400) = 3.25
 = 0.05
Second Example
• The rejection region is a range of values such that

if the test statistic falls into that range, we decide
to reject the null hypothesis in favor of the
alternative hypothesis.
Second Example
• At a 5% significance level (i.e. =0.05), we get [all  in one tail]

Z = Z0.05 = 1.645
Therefore, UCV = 170 + 1.645*3.25 = 175.35
• Since our sample mean (178) is greater than the critical value we
calculated (175.35), we reject the null hypothesis in favor of H1
• OR
(>1.645) Reject null
• OR
p-value = P( > 178) = P(Z > 2.46) = 0.0069 < 0.05 Reject null
11.23
Second Example
H1: > 170 =175.34

H0: = 170 =178
Reject H0 in favor of
Interpreting p-value
Overwhelming Evidence
(Highly Significant)
Strong Evidence
(Significant)
Weak Evidence
(Not Significant)
No Evidence
(Not Significant)
0 .01 .05 .10
p=.0069
Summary of hypothesis testing
One-Tail Test Two-Tail Test One-Tail Test

(left tail) (right tail)
t- Test
The Student’s t-test compares the averages and standard deviations of two samples to
see if there is a significant difference between them.
We start by calculating a number, t
t can be calculated using the equation:
( x1 – x2 ) Where:
t= x1 is the mean of sample 1
(s1)2 (s2)2 s1 is the standard deviation of sample 1
+ n1 is the number of individuals in sample 1
n1 n2
x2 is the mean of sample 2
s2 is the standard deviation of sample 2
n2 is the number of individuals in sample 2
t- Test
Example: Random samples were taken of pupils in C1 and D8
Their recorded heights are shown below…
Students in C1 Students in D8
Student 145 149 152 153 154 148 153 157 161 162
Height
154 158 160 166 166 162 163 167 172 172
(cm)
166 167 175 177 182 175 177 183 185 187
Step 1: Work out the mean height for each sample
C1: x1 = 161.60 D8: x2 = 168.27
Step 2: Work out the difference in means
x2 – x1 = 168.27 – 161.60 = 6.67

t- Test
Step 3: Work out the standard deviation for each sample
C1: s1 = 10.86 D8: s2 = 11.74
Step 4: Calculate s2/n for each sample
C1: (s1)2
= 10.862 ÷ 15 = 7.86
n1
D8: (s2)2
= 11.742 ÷ 15 = 9.19
n2
t- Test
Step 5: Calculate (s1)2 (s2)2
+
n1 n2
(s1)2 (s2)2
+ = (7.86 + 9.19) = 4.13
n1 n2
Step 6: Calculate t (Step 2 divided by Step 5)
x2 – x1
t=
6.67
(s1)2 (s2)2 = 1.62
+ =
4.13
n1 n2
t- Test
Step 7: Work out the number of degrees of freedom
d.f. = n1 + n2 – 2 = 15 + 15 – 2 = 28
Step 8: Find the critical value of t for the relevant number of degrees of freedom
Use the 95% (p=0.05) confidence limit
Critical value = 2.048
Our calculated value of t is below the critical value for 28d.f., therefore, there is no
significant difference between the height of students in samples from C1 and D8

4.hypothesis Testing

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

4.hypothesis Testing

Uploaded by

Copyright:

Available Formats

Hypothesis Testing

Central Limit Theorem

• As more and more samples are taken from a

• The null hypothesis can be quantified

• The two hypotheses are called the null hypothesis and

• H0: — the ‘null’ hypothesis

• H1: — the ‘alternative’ or ‘research’ hypothesis

• The null hypothesis (H0) will always state that the

• A Type I error occurs when we reject a true null

• A Type II error occurs when we don’t reject a false

Consider mean demand for computers in a sales

Rather than estimate the mean demand, operations

In other words, someone is claiming that the mean

We can rephrase this request into a test of the

Thus, our research hypothesis becomes:

Let the standard deviation [σ] can be assumed

For example, if we’re trying to decide whether the

If is close to 350 (say, 355) we could not say that this

Conclude that there is enough evidence to support the alternative

Conclude that there is not enough evidence to support the

• The testing procedure begins with the assumption

• 3. The p-value approach (which is generally used with a computer

• For this example, since is to the right of the mean, calculate

• Unstandardized Test Statistic:

• We express this belief as a our research hypothesis, that is:

• Thus, our null hypothesis becomes:

• What we want to show:

• The rejection region is a range of values such that

• At a 5% significance level (i.e. =0.05), we get [all  in one tail]

(>1.645) Reject null

H1: > 170 =175.34

0 .01 .05 .10

One-Tail Test Two-Tail Test One-Tail Test

We start by calculating a number, t

t can be calculated using the equation:

Their recorded heights are shown below…

Step 1: Work out the mean height for each sample

C1: x1 = 161.60 D8: x2 = 168.27

Step 2: Work out the difference in means

x2 – x1 = 168.27 – 161.60 = 6.67

C1: s1 = 10.86 D8: s2 = 11.74

Step 4: Calculate s2/n for each sample

Step 6: Calculate t (Step 2 divided by Step 5)

Use the 95% (p=0.05) confidence limit

Critical value = 2.048

You might also like