
Power of a Test:

Figure 12.1: Power of the Test, $\alpha = 0.05$
The power of a test is a way to assess the quality of a
hypothesis test. The sample size, $n$, is a critical factor in
how good a test is.
Types of Errors:
As discussed earlier, rejecting the Null Hypothesis (Ho)
when it is true is a Type I Error, with probability $\alpha$. On the
other hand, failing to reject Ho when it is false is a Type II
Error, with probability denoted $\beta$.
Type I errors are very easy to spot graphically (yellow
region in Fig 12.1). Type II errors are a little trickier.
Let's begin with those.
Suppose Ha is true (Fig 12.1), with a mean whose value you, the
researcher, don't know, but which is close enough to Ho that
distributions A and B have significant areas in common
(yellow and red regions). The portion of distribution B
that overlaps distribution A on the "fail to reject" side of the
cutoff is $\beta$ (red region). Why?
Because any sample mean $\bar{x}$ falling in that region would be interpreted
as belonging to distribution A, namely a Type II error
(failing to reject Ho when Ha is really true).

Figure 12.2: Probability of Type II Error
The rest of distribution B is what we call the Power of the
Test, $1 - \beta$ (black dotted region). Note that (other things
constant) to increase the Power of the Test I need to
make $\beta$ smaller, which implies making $\alpha$ numerically
bigger (smaller LOC, C).
CASE STUDY: The Brinell scale is used in Materials Science to quantify the hardness of metals. The hardness of a certain type of material used as reinforcement in concrete and masonry is assumed to be normally distributed, with a mean hardness of 170 kg/mm² and $\sigma = 10$ kg/mm². The rejection level is set at 172 kg/mm² (the solution uses a sample of $n = 25$, as implied by its $10/\sqrt{25}$ term).
a) What is the level of significance?
b) If, unknown to the tester, the true mean equals 173, calculate the probability of a Type II error.
c) What is the Power of the Test?

Part A: Level of Significance
It is a right-tail test. The level of significance is the area beyond 172. So you calculate its z-value and use NORM.S.DIST(-z, True) to find the area to the right.

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{172 - 170}{10/\sqrt{25}} = 1$$

$\alpha$ = NORM.S.DIST(-1, True) $\approx 0.1587$

Part B: Type II Error
Presuming that $\mu = 173$ is the true mean, the error is $\beta = P(\bar{x} < 172)$. See the RED region in Fig 12.2.

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{172 - 173}{10/\sqrt{25}} = -0.5$$

$$\beta = \Phi(-0.5) \approx 0.3085$$

So, there is about a 30% chance that you make the wrong decision if the true mean is 173. The challenge is how to decrease that error ($\beta$) while keeping constant, or even lowering, the probability of a Type I error ($\alpha$).

Part C: The Power of the Test
The power of the test is the probability that I make the right decision should Ha be true. Look at Fig 12.1: it is the black dotted area. Since the total area under the curve equals 1, the Power of the Test is $1 - \beta$, which is about $1 - 0.3085 \approx 0.6915$.

In real life we want to design our tests such that:
(a) We have enough confidence to fail to reject the null hypothesis should it be true, but also
(b) We have enough power to reject it in favor of the alternate hypothesis should Ho be false.

That sounds GREAT!!! But how do we accomplish it when we don't know what the true value of the mean is? Here is where the Power Function comes into play. Let's introduce it through a case study.
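The three parts of the Brinell case study can be checked numerically. The snippet below is a minimal sketch that uses Python's standard-library `statistics.NormalDist` in place of Excel's NORM.S.DIST; the variable names are illustrative, and `n = 25` is taken from the $10/\sqrt{25}$ term in the worked solution.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

mu0 = 170.0      # mean under Ho (kg/mm2)
sigma = 10.0     # population standard deviation (kg/mm2)
n = 25           # sample size implied by the 10/sqrt(25) term
cutoff = 172.0   # rejection level for the sample mean
mu_true = 173.0  # true mean, unknown to the tester

se = sigma / sqrt(n)  # standard error of the mean

alpha = 1 - Z.cdf((cutoff - mu0) / se)  # Part A: area right of 172 under Ho
beta = Z.cdf((cutoff - mu_true) / se)   # Part B: area left of 172 under Ha
power = 1 - beta                        # Part C

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Changing `cutoff` shows the trade-off described in the text: moving it to the right lowers `alpha` but raises `beta`.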
----
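CASE STUDY: Power Functions

Figure 12.3

Let X denote the IQ of a randomly selected adult American. Presume that X is normally distributed with unknown mean $\mu$ and standard deviation 16. Take a random sample of 16 students. With a Type I error of 0.05, test
$H_0$: $\mu = 100$
$H_a$: $\mu > 100$
Calculate the Power of the Test for $\mu = 108, 112, 116$.

Presuming that $\mu = 108$ is the true mean, the error is $\beta = P(\bar{x} < 106.58)$. See the RED region in Fig 12.3.

Where does 106.58 come from? The z-value for $\alpha = 0.05$ is 1.645. The $\bar{x}$ corresponding to that point can be obtained by isolating $\bar{x}$ in $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, which after some algebra results in $\bar{x} = \mu_0 + z\left(\frac{\sigma}{\sqrt{n}}\right)$:

$$\bar{x} = 100 + 1.645\left(\frac{16}{\sqrt{16}}\right) \approx 106.58$$

From the perspective of Ha, the z-value corresponding to 106.58 is:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{106.58 - 108}{16/\sqrt{16}} \approx -0.36$$

$\text{Power} = 1 - \Phi(-0.36)$, where $\Phi(z)$ is the standard normal cumulative distribution function, $P(Z \le z)$. Hence, $\text{Power} = 1 - \Phi(-0.36) \approx 1 - 0.36$, or 0.64.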
If we repeat the process for $\mu = 112, 116$ we get the
following results:

Table 12.1 Power as a Function of $\mu$
$\mu$    Power    Figure
108      0.64     12.3 A
112      0.91     12.3 B
116      0.99     12.3 C

Note that the Power of the Test is the black dotted area in the figures.
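The entries of Table 12.1 can be reproduced with a few lines of code. This is a minimal sketch using the standard-library `statistics.NormalDist`; the helper name `power_right_tail` and its default arguments are illustrative choices, with the IQ case-study values ($\mu_0 = 100$, $\sigma = 16$, $n = 16$, $\alpha = 0.05$) as defaults.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power_right_tail(mu_true, mu0=100.0, sigma=16.0, n=16, alpha=0.05):
    """Power of a right-tailed z-test when the true mean is mu_true."""
    se = sigma / sqrt(n)                       # standard error of the mean
    x_crit = mu0 + Z.inv_cdf(1 - alpha) * se   # rejection point, about 106.58 here
    beta = Z.cdf((x_crit - mu_true) / se)      # P(fail to reject | mu_true)
    return 1 - beta

# Same three alternatives as Table 12.1, rounded to two decimals
table_12_1 = {mu: round(power_right_tail(mu), 2) for mu in (108, 112, 116)}
print(table_12_1)
```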
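By looking at Table 12.1, what you can conclude is that as the difference between $\mu_0$ and $\mu$ grows, the Power of the Test increases. However, there is a catch. This was a purely academic exercise because we never know the true mean. If we did, it wouldn't be necessary to run a test in the first place. Nevertheless, it is not an exercise in futility. It establishes the basis to develop the math we need for more practical scenarios.

We defined the Power of the Test as:

$$\text{Power} = 1 - \Phi(z), \quad \text{where} \quad z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \tag{12.1}$$

$\mu$: the unknown true average
$\sigma$: standard deviation of the population
$\bar{x}$: $\mu_0 + z_\alpha\left(\frac{\sigma}{\sqrt{n}}\right)$, where $z_\alpha$ is the $z$ associated with the $\alpha$ level.

Let's say that we let $K(\mu)$ be the Power of the Test, so we can express it in function notation. Formula 12.1 would become:

$$K(\mu) = 1 - \Phi\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right) \tag{12.2}$$

If you chart $K(\mu)$ vs $\mu$ it will look like Fig 12.4. Note that it provides the following information: $\alpha$, $\beta(\mu)$, $K(\mu)$.

Note also that $K(\mu_0) = \alpha$. Why? Let's begin with formula 12.2. Since $\bar{x} = \mu_0 + z_\alpha\left(\frac{\sigma}{\sqrt{n}}\right)$,

$$K(\mu) = 1 - \Phi\left(\frac{\mu_0 + z_\alpha\frac{\sigma}{\sqrt{n}} - \mu}{\sigma/\sqrt{n}}\right)$$

Letting $\mu = \mu_0$, you get

$$K(\mu_0) = 1 - \Phi\left(\frac{\mu_0 + z_\alpha\frac{\sigma}{\sqrt{n}} - \mu_0}{\sigma/\sqrt{n}}\right),$$

which results in

$$K(\mu_0) = 1 - \Phi\left(\frac{z_\alpha\frac{\sigma}{\sqrt{n}}}{\sigma/\sqrt{n}}\right) = 1 - \Phi(z_\alpha).$$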
----
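Figure 12.4: The Power Function $K(\mu)$

Since $z_\alpha$ is positive and areas under the standard normal
distribution accumulate from left to right, $1 - \Phi(z_\alpha)$
equals $\alpha$. How is that? Suppose $\alpha = 0.05$. Then $z_\alpha$ would be
1.645, and $\Phi(z_\alpha) = 0.95$. Hence, $K(\mu_0) = 1 - \Phi(z_\alpha)$
results in $K(\mu_0) = 1 - 0.95$, or $K(\mu_0) = 0.05$.
Note that $K(\mu) = 1 - \Phi(z)$ is equivalent to saying
$K(\mu) = 1 - \beta(\mu)$, or $\beta(\mu) = 1 - K(\mu)$. Both relations
can be seen in Fig 12.4.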
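To see the shape of the power function without the figure, it can be evaluated on a grid of candidate means. The sketch below assumes the IQ case-study values ($\mu_0 = 100$, $\sigma = 16$, $n = 16$, $\alpha = 0.05$); the function names `K` and `beta` are illustrative. It also checks the two relations discussed above: the power at $\mu_0$ equals $\alpha$, and $\beta(\mu)$ is the complement of $K(\mu)$.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def K(mu, mu0=100.0, sigma=16.0, n=16, alpha=0.05):
    """Power function of a right-tailed z-test, evaluated at a true mean mu."""
    se = sigma / sqrt(n)
    z_alpha = Z.inv_cdf(1 - alpha)       # about 1.645 for alpha = 0.05
    x_crit = mu0 + z_alpha * se          # rejection point for the sample mean
    return 1 - Z.cdf((x_crit - mu) / se)

def beta(mu, **kwargs):
    """Type II error probability as the complement of the power."""
    return 1 - K(mu, **kwargs)

# Tabulate the curve: K rises from alpha at mu0 toward 1 as mu moves away
for mu in range(100, 121, 4):
    print(f"mu = {mu:3d}   K(mu) = {K(mu):.4f}   beta(mu) = {beta(mu):.4f}")
```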
So far we know the relationship between $\alpha$, $\beta$ and the
Power of the Test. We would like a high LOC ($1 - \alpha$) and a
high power. However, for a fixed sample size, $\alpha$ and $\beta$ are inversely related.
High confidence (small $\alpha$) implies high $\beta$ risk, and high $\beta$ risk
implies lower Power of the Test, as can be seen in Fig
12.1, reproduced in Fig 12.5. The challenge is to increase
the LOC and the Power of the Test at the same time,
which implies decreasing both $\alpha$ and $\beta$ at the same
time.

Figure 12.5: Power of the Test, $\alpha = 0.05$
Table 12.1 showed that we can reach that when the
separation between $\mu_0$ and $\mu$ increases. However, that is
not practical. We need a variable that we can control.
Note that it happens when the overlap between the A
and B distributions (Fig 12.5) diminishes.
Ohhh, there is another way to accomplish that: by
reducing the standard error $\sigma/\sqrt{n}$. How do I reduce it? Well, I would have
either to reduce $\sigma$ or to increase $n$. For a given
population under study, I have no control over $\sigma$, so my
only real choice is to increase the sample size.
So, what do I really want? I would like to know the sample size needed to achieve a certain Level of Confidence at the same time I achieve some specific Power of the Test.
In Table 12.1 we explored the effect of $\mu$ on the power of the test (other things constant). This time we'll explore the effect of the sample size (Table 12.2). Let's consider the same IQ case:

CASE STUDY: Sample Size
Let X denote the IQ of a randomly selected adult American. Presume that X is normally distributed (a bit unrealistic) with unknown mean and strangely known std deviation of 16. Take a random sample of 64 students. With a Type I error of 0.05 ($z_\alpha = 1.645$), test
$H_0$: $\mu = 100$
$H_a$: $\mu > 100$
Calculate the Power of the Test for $\mu = 108, 112, 116$.

$$\bar{x} = \mu_0 + z_\alpha\left(\frac{\sigma}{\sqrt{n}}\right) = 100 + 1.645\left(\frac{16}{\sqrt{64}}\right) = 103.29$$

$$K(\mu) = 1 - \Phi\left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right)$$

$$K(108) = 1 - \Phi\left(\frac{103.29 - 108}{16/\sqrt{64}}\right) = 1 - \Phi(-2.35) \approx 0.991$$

$$K(112) = 1 - \Phi\left(\frac{103.29 - 112}{16/\sqrt{64}}\right) = 1 - \Phi(-4.36) \approx 0.99999$$

$$K(116) = 1 - \Phi\left(\frac{103.29 - 116}{16/\sqrt{64}}\right) = 1 - \Phi(-6.36) \approx 1.0$$

Table 12.2 Power as a Function of $\mu$ and Sample Size
$\mu$    Power ($n = 16$)    Power ($n = 64$)
108      0.64                0.991
112      0.91                0.99999
116      0.99                1.0

OK, I just showed that the sample size affects the Power of the Test. Other things constant, the larger the sample size, the higher the Power of the Test, as can be seen in Fig 12.6. Now, how can I calculate the sample size?
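Before moving on, the two power columns of Table 12.2 can be recomputed side by side. This is a minimal sketch under the case-study assumptions ($\mu_0 = 100$, $\sigma = 16$, $\alpha = 0.05$, right-tailed test); the helper name `power` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power(mu_true, n, mu0=100.0, sigma=16.0, alpha=0.05):
    """Power of a right-tailed z-test for a given true mean and sample size."""
    se = sigma / sqrt(n)
    x_crit = mu0 + Z.inv_cdf(1 - alpha) * se   # 106.58 for n=16, 103.29 for n=64
    return 1 - Z.cdf((x_crit - mu_true) / se)

print(" mu    n=16      n=64")
for mu in (108, 112, 116):
    print(f"{mu:4d}  {power(mu, 16):.4f}  {power(mu, 64):.6f}")
```

Quadrupling the sample size halves the standard error, which is why every row gains power when moving from the first column to the second.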
----
Calculating the Sample Size:

Figure 12.6: Power of the Test vs Sample Size

CASE STUDY: Let X denote the crop yield of corn measured in the number of bushels per acre. Assume, somewhat unrealistically, that X is normally distributed with unknown mean $\mu$ and std deviation $\sigma = 6$. A researcher is working to increase the current average yield from 40 bushels per acre. Therefore, at a 95% LOC he is interested in testing the null hypothesis $H_0$: $\mu = 40$ against the alternate hypothesis $H_a$: $\mu > 40$. Find the sample size necessary to achieve 0.90 power at the alternative $\mu = 45$.

Let's analyze it. Somewhere between 40 and 45 there is a point $\bar{x}$ (for some sample size $n$) where I can achieve 95% LOC and 0.90 Power of the Test.

Figure 12.7: Calculating the Sample Size

From the perspective of $H_0$ that point is $\bar{x} = \mu_0 + z_\alpha\left(\frac{\sigma}{\sqrt{n}}\right) = 40 + 1.645\left(\frac{6}{\sqrt{n}}\right)$. From the perspective of $H_a$, the same point would be $\bar{x} = 45 - 1.281\left(\frac{6}{\sqrt{n}}\right)$. So, I'm looking at the same $\bar{x}$ from two different perspectives:

$$40 + 1.645\left(\frac{6}{\sqrt{n}}\right) = 45 - 1.281\left(\frac{6}{\sqrt{n}}\right)$$

Grouping similar terms, I'm left with:

$$1.645\left(\frac{6}{\sqrt{n}}\right) + 1.281\left(\frac{6}{\sqrt{n}}\right) = 45 - 40$$

Or, $2.926\left(\frac{6}{\sqrt{n}}\right) = 5$. Solving for $n$, you get 12.33, which is rounded up to $n = 13$. Having $n$, you can calculate $\bar{x}$:

$$\bar{x} = 40 + 1.645\left(\frac{6}{\sqrt{13}}\right) \approx 42.74$$
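As a cross-check on the algebra, the same answer can be found by brute force: increase $n$ one unit at a time until the power at $\mu = 45$ reaches 0.90. This is a sketch under the corn case-study assumptions ($\mu_0 = 40$, $\sigma = 6$, $\alpha = 0.05$, right-tailed test); the function name `power_at` is illustrative, and the search is a swapped-in alternative to solving the equation, not the method used in the text.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power_at(n, mu_a=45.0, mu0=40.0, sigma=6.0, alpha=0.05):
    """Power of the right-tailed z-test at the alternative mu_a for sample size n."""
    se = sigma / sqrt(n)
    x_crit = mu0 + Z.inv_cdf(1 - alpha) * se
    return 1 - Z.cdf((x_crit - mu_a) / se)

# Smallest sample size whose power reaches the 0.90 target
n = 1
while power_at(n) < 0.90:
    n += 1

# Rejection point for the sample mean at that n
x_crit = 40.0 + Z.inv_cdf(0.95) * 6.0 / sqrt(n)

print(f"n = {n}, power = {power_at(n):.4f}, reject Ho when x-bar > {x_crit:.2f}")
```

Because $n$ must be a whole number, the achieved power at the chosen $n$ is slightly above the 0.90 target.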
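So, what does it mean?
What it means is that if the researcher sets the sample size at 13, and rejects the null hypothesis when $\bar{x} > 42.74$, he will have a 5% chance of committing a Type I error and a 10% chance of committing a Type II error should the alternate hypothesis be true.

***BIG QUESTIONS***
(a) Is there a formula to do this calculation?
(b) Where does $z_\beta$ come from?

The Formula:
From the perspective of $H_0$, let $\bar{x}$ be $\bar{x}_0$, and from the perspective of $H_a$, let $\bar{x}$ be $\bar{x}_a$.

$$\bar{x}_0 = \mu_0 + z_\alpha\left(\frac{\sigma}{\sqrt{n}}\right), \qquad \bar{x}_a = \mu_a - z_\beta\left(\frac{\sigma}{\sqrt{n}}\right)$$

$$\bar{x}_0 = \bar{x}_a$$

$$\mu_0 + z_\alpha\left(\frac{\sigma}{\sqrt{n}}\right) = \mu_a - z_\beta\left(\frac{\sigma}{\sqrt{n}}\right)$$

$$z_\alpha\left(\frac{\sigma}{\sqrt{n}}\right) + z_\beta\left(\frac{\sigma}{\sqrt{n}}\right) = \mu_a - \mu_0$$

$$(z_\alpha + z_\beta)\left(\frac{\sigma}{\sqrt{n}}\right) = \mu_a - \mu_0$$

$$(z_\alpha + z_\beta)\,\sigma = \sqrt{n}\,(\mu_a - \mu_0)$$

$$\sqrt{n} = \frac{(z_\alpha + z_\beta)\,\sigma}{\mu_a - \mu_0}$$

$$n = \left[\frac{(z_\alpha + z_\beta)\,\sigma}{\mu_a - \mu_0}\right]^2$$

Remember this formula was developed for a one-tail test and under the assumption that you knew $\sigma$. For unknown $\sigma$, $\sigma$ must be replaced by $s$. For a two-tailed test the formula is:

$$n = \left[\frac{(z_{\alpha/2} + z_\beta)\,\sigma}{\mu_a - \mu_0}\right]^2$$

Where does $z_\beta$ come from?
To be continued

https://onlinecourses.science.psu.edu/stat414/node/306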
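The closed-form result can be wrapped in a small helper and checked against the corn case study. This is a sketch, not a definitive implementation: the function name `sample_size` and its parameters are illustrative, and $z_\beta$ is taken, as in the corn example, to be the z-value that leaves $\beta$ in the tail (1.281 for 0.90 power).

```python
from math import ceil
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def sample_size(mu0, mu_a, sigma, alpha=0.05, power=0.90, two_tailed=False):
    """Return (exact n, n rounded up) for a z-test with known sigma."""
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = Z.inv_cdf(1 - tail_alpha)   # 1.645 one-tailed, 1.960 two-tailed
    z_beta = Z.inv_cdf(power)             # 1.281 for power = 0.90
    n_exact = ((z_alpha + z_beta) * sigma / (mu_a - mu0)) ** 2
    return n_exact, ceil(n_exact)         # round up to a whole observation

# Corn case study: Ho mu = 40, alternative mu = 45, sigma = 6
n_exact, n_needed = sample_size(40, 45, 6)
print(f"one-tailed: n = {n_exact:.2f} -> {n_needed}")

n_exact_2, n_needed_2 = sample_size(40, 45, 6, two_tailed=True)
print(f"two-tailed: n = {n_exact_2:.2f} -> {n_needed_2}")
```

The two-tailed version needs a larger sample for the same power because $\alpha$ is split between both tails, which pushes the rejection point further from $\mu_0$.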

----