Post hoc analysis


From Wikipedia, the free encyclopedia

Not to be confused with Post hoc theorizing.


In the design and analysis of experiments, post hoc analysis (from Latin post hoc, "after this") consists of looking at the data, after the experiment has concluded, for patterns that were not specified a priori. It is sometimes called data dredging by critics, to evoke the sense that the more one looks the more likely something will be found. More subtly, each time a pattern in the data is considered, a statistical test is effectively performed. This greatly inflates the total number of statistical tests and necessitates the use of multiple testing procedures to compensate. However, this is difficult to do precisely, and in fact most results of post hoc analyses are reported as they are, with unadjusted p-values. These p-values must be interpreted in light of the fact that they are a small and selected subset of a potentially large group of p-values. Results of post hoc analyses should be explicitly labeled as such in reports and publications to avoid misleading readers.

In practice, post hoc analyses are usually concerned with finding patterns and/or relationships between subgroups of sampled populations that would otherwise remain undetected and undiscovered were a scientific community to rely strictly upon a priori statistical methods.[citation needed] Post hoc tests, also known as a posteriori tests, greatly expand the range and capability of methods that can be applied in exploratory research. Post hoc examination strengthens induction by limiting the probability that significant effects will seem to have been discovered between subgroups of a population when none actually exist. As it is, many scientific papers are published without adequate, preventative post hoc control of the type I error rate.[1] Post hoc analysis is an important procedure without which multivariate hypothesis testing would greatly suffer, rendering the chances of discovering false positives unacceptably high. Ultimately, post hoc testing creates better informed scientists who can therefore formulate better, more efficient a priori hypotheses and research designs.

Contents

1 Relationship with the multiple comparisons problem


2 Tests
2.1 Fisher's least significant difference (LSD)
2.2 The Bonferroni procedure
2.3 Holm–Bonferroni method
2.4 Newman–Keuls method
2.5 Duncan's new multiple range test (MRT)
2.6 Rodger's method
2.7 Scheffé's method
2.8 Tukey's procedure
2.9 Dunnett's correction
2.10 Šidák's inequality
2.11 Benjamini–Hochberg (BH) procedure
3 See also
4 References

Relationship with the multiple comparisons problem



Further information: Multiple comparisons


In its most literal and narrow sense, post hoc analysis simply refers to unplanned data analysis performed after the data is collected in order to reach further conclusions. In this sense, even a test that does not provide Type I error rate[1] protection, using multiple comparisons methods, is considered a post hoc analysis. A good example is performing initially unplanned multiple t-tests at level α, following an α-level ANOVA test. Such post hoc analysis does not include multiple testing procedures, which are sometimes difficult to perform precisely. Unfortunately, analyses such as the above are still commonly conducted and their results reported with unadjusted p-values. Results of post hoc analyses which do not address the multiple comparisons problem should be explicitly labeled as such to avoid misleading readers.
In the wider and more useful sense, post hoc analysis tests enable protection from the multiple comparisons problem, whether the inferences made are selective or simultaneous. The type of inference is related directly to the family of hypotheses of interest. Simultaneous inference indicates that all inferences, in the family of all hypotheses, are jointly corrected up to a specified type I error rate. In practice, post hoc analyses are usually concerned with finding patterns and/or relationships between subgroups of sampled populations that would otherwise remain undetected and undiscovered were a scientific community to rely strictly upon a priori statistical methods.[citation needed] Therefore, simultaneous inference may be too conservative for certain large-scale problems that are currently being addressed by science. For such problems, a selective inference approach might be more suitable, since it assumes that subgroups of hypotheses from the large-scale group can be viewed as a family. Selective post hoc examination strengthens induction by limiting the probability that significant differences will seem to have been discovered between subgroups of a population when none actually exist. Accordingly, p-values of such subgroups must be interpreted in light of the fact that they are a small and selected subset of a potentially large group of p-values.
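
To make the narrow-sense example above concrete, the following minimal sketch (in Python, with invented data and group labels) runs initially unplanned pairwise t-tests after an ANOVA and then applies a multiplicity correction; the unadjusted p-values are the kind that, per the text, should at least be labeled as post hoc.

    # A minimal sketch of unplanned pairwise t-tests following an ANOVA,
    # with and without a multiple-testing adjustment. The three groups
    # are hypothetical; a real analysis would use the study's own data.
    from itertools import combinations

    import numpy as np
    from scipy import stats
    from statsmodels.stats.multitest import multipletests

    rng = np.random.default_rng(0)
    groups = {
        "A": rng.normal(10.0, 2.0, size=30),
        "B": rng.normal(10.5, 2.0, size=30),
        "C": rng.normal(12.0, 2.0, size=30),
    }

    # Omnibus ANOVA at level alpha.
    f_stat, anova_p = stats.f_oneway(*groups.values())
    print(f"ANOVA: F = {f_stat:.2f}, p = {anova_p:.4f}")

    # Unplanned pairwise t-tests -- the "post hoc" step.
    pairs = list(combinations(groups, 2))
    raw_p = [stats.ttest_ind(groups[a], groups[b]).pvalue for a, b in pairs]

    # Without adjustment these p-values overstate the evidence; a
    # correction (here Holm's step-down) compensates for multiplicity.
    reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm")
    for (a, b), p, pa, r in zip(pairs, raw_p, adj_p, reject):
        print(f"{a} vs {b}: raw p = {p:.4f}, adjusted p = {pa:.4f}, reject = {r}")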

Tests


Further information: Family-wise error rate § Controlling procedures


The following are referred to as "post hoc tests". However, on some occasions a researcher may have initially planned on using them, in which case referring to them as "post hoc tests" is not entirely accurate. For instance, the Newman–Keuls and Tukey methods are often referred to as post hoc, yet it is not uncommon to plan on testing all pairwise comparisons before seeing the data. Therefore, in such cases, these tests are better categorized as a priori.

Fisher's least significant difference (LSD)


This technique was developed by Ronald Fisher in 1935 and is used most commonly after a null hypothesis in an analysis of variance (ANOVA) test is rejected (assuming normality and homogeneity of variances). A significant ANOVA result only reveals that not all the means compared in the test are equal. Fisher's LSD is basically a set of individual t-tests, differentiated only in the calculation of the standard deviation: whereas an ordinary t-test computes the pooled standard deviation from only the two groups being compared, Fisher's LSD computes it from all groups, thus increasing power. Fisher's LSD does not correct for multiple comparisons.[2]
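
The pooled-variance idea can be sketched as follows; this is an illustrative implementation in Python, not a canonical one, and the group data are invented.

    # A sketch of Fisher's LSD: ordinary pairwise t statistics, but with
    # the pooled variance (MSE) estimated from *all* groups rather than
    # just the two being compared. Data below are invented.
    from itertools import combinations

    import numpy as np
    from scipy import stats

    groups = {
        "A": np.array([9.1, 10.2, 11.0, 9.8, 10.5]),
        "B": np.array([10.9, 11.4, 12.1, 11.8, 10.7]),
        "C": np.array([12.5, 13.1, 12.0, 13.4, 12.8]),
    }

    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    df_error = n_total - k

    # Mean squared error pooled over all k groups (the ANOVA error term).
    mse = sum(((v - v.mean()) ** 2).sum() for v in groups.values()) / df_error

    for a, b in combinations(groups, 2):
        ya, yb = groups[a], groups[b]
        se = np.sqrt(mse * (1.0 / len(ya) + 1.0 / len(yb)))
        t = (ya.mean() - yb.mean()) / se
        p = 2.0 * stats.t.sf(abs(t), df_error)
        print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f}  (no multiplicity correction)")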

The Bonferroni procedure


Main article: Bonferroni correction


Denote by p_i the p-value for testing hypothesis H_i, with m being the number of hypotheses: reject H_i if p_i ≤ α/m.

Although mainly used with planned contrasts, it can be used as a post hoc test for comparisons between data groups of interest in the experiment after the fact. It is flexible and very simple to compute, but naive in how it retains the familywise error rate, namely by dividing α by m. This division results in a large reduction in the power of the test: because the cut-off value is reduced, it becomes substantially more difficult for any result to be declared statistically significant, irrespective of whether the underlying effect is real.
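
As a minimal illustration of the rule above (with invented p-values), the Bonferroni decision reduces to one comparison per hypothesis:

    # Bonferroni: reject H_i when p_i <= alpha / m. The p-values here are
    # invented; statsmodels' multipletests(method="bonferroni") gives the
    # same decisions on real data.
    alpha = 0.05
    p_values = [0.001, 0.012, 0.018, 0.30, 0.55]  # hypothetical
    m = len(p_values)

    for i, p in enumerate(p_values, start=1):
        print(f"H_{i}: p = {p:.3f}, reject = {p <= alpha / m}")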

Holm–Bonferroni method

Main article: Holm–Bonferroni method


Start by ordering the p-values p_(1) ≤ p_(2) ≤ … ≤ p_(m) and let the associated hypotheses be H_(1), …, H_(m). Let k be the smallest index such that p_(k) > α/(m + 1 − k). Reject the null hypotheses H_(1), …, H_(k−1); if k = 1, then none of the hypotheses are rejected.
This procedure is uniformly better than Bonferroni's.
It is worth noticing here that the reason this procedure controls the family-wise error rate for all m hypotheses at level α in the strong sense is that it is essentially a closed testing procedure: each intersection hypothesis is tested using the simple Bonferroni test.
The Holm–Bonferroni method thus introduces a correction to Bonferroni's method that allows more rejections, and is therefore less conservative and more powerful than the Bonferroni method.
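
The step-down rule above translates directly into a short loop; the p-values below are invented, and statsmodels' multipletests(method="holm") performs the same procedure on real data.

    # Holm-Bonferroni step-down: sort the p-values, compare p_(i) against
    # alpha / (m + 1 - i), and stop at the first failure. Invented p-values.
    alpha = 0.05
    p_values = [0.30, 0.001, 0.018, 0.012, 0.55]  # hypothetical
    m = len(p_values)

    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if p_values[i] > alpha / (m + 1 - rank):
            break  # this and all larger p-values are not rejected
        rejected[i] = True

    for i, (p, r) in enumerate(zip(p_values, rejected), start=1):
        print(f"H_{i}: p = {p:.3f}, reject = {r}")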

Newman–Keuls method

Main article: Newman–Keuls method



A stepwise multiple comparisons procedure used to identify sample means that are significantly different from each other. It is often used as a post hoc test whenever a significant difference between three or more sample means has been revealed by an analysis of variance (ANOVA).

Duncan's new multiple range test (MRT)


Main article: Duncan's new multiple range test


Duncan developed this test as a modification of the Newman–Keuls method that would have
greater power. Duncan's MRT is especially protective against false negative (Type II) error at the
expense of having a greater risk of making false positive (Type I) errors.

Rodger's method


Main article: Rodger's method


Rodger's method is a procedure for examining research data post hoc following an 'omnibus' analysis, that is, after carrying out an analysis of variance (ANOVA). Rodger's method utilizes a decision-based error rate, arguing that it is not the probability (α) of rejecting H_0 in error that should be controlled; rather, it is the average rate of rejecting true null contrasts that should be controlled, meaning we should control the expected rate (Eα) of true null contrast rejection.

Scheffé's method

Main article: Scheffé's method

Scheffé's method applies to the set of estimates of all possible contrasts among the factor level means, not just the pairwise differences. Having the advantage of flexibility, it can be used to test any number of post hoc simple and/or complex comparisons that appear interesting. However, the drawback of this flexibility is that the procedure is conservative, with a type I error rate below the nominal level and correspondingly low power.
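
As an illustrative sketch (not a canonical implementation), a single contrast can be tested with Scheffé protection by comparing its squared t statistic against (k − 1) times the F critical value; the data and contrast weights below are invented.

    # Scheffé's method for one arbitrary contrast: the squared t statistic
    # is compared against (k - 1) * F_crit, which protects *all* possible
    # contrasts simultaneously. Groups and contrast weights are invented.
    import numpy as np
    from scipy import stats

    groups = [
        np.array([9.1, 10.2, 11.0, 9.8, 10.5]),
        np.array([10.9, 11.4, 12.1, 11.8, 10.7]),
        np.array([12.5, 13.1, 12.0, 13.4, 12.8]),
    ]
    c = np.array([1.0, 1.0, -2.0])  # hypothetical complex contrast

    k = len(groups)
    n = np.array([len(g) for g in groups])
    df_error = n.sum() - k
    mse = sum(((g - g.mean()) ** 2).sum() for g in groups) / df_error

    estimate = sum(ci * g.mean() for ci, g in zip(c, groups))
    se = np.sqrt(mse * (c ** 2 / n).sum())

    alpha = 0.05
    f_crit = stats.f.ppf(1.0 - alpha, k - 1, df_error)
    significant = (estimate / se) ** 2 > (k - 1) * f_crit
    print(f"contrast = {estimate:.2f}, significant under Scheffé: {significant}")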

Tukey's procedure


Main article: Tukey's range test


Tukey's procedure is only applicable for pairwise comparisons.
It assumes independence of the observations being tested, as well as equal variation across observations (homoscedasticity).
The procedure calculates for each pair the studentized range statistic q_s = (Y_A − Y_B)/SE, where Y_A is the larger of the two means being compared, Y_B is the smaller, and SE is the standard error of the data in question.
Tukey's test is essentially a Student's t-test, except that it corrects for the family-wise error rate.
A correction with a similar framework is Fisher's LSD (least significant difference).
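
For practical use, recent SciPy releases (1.8 and later) include an implementation of Tukey's range test; a minimal sketch with invented groups:

    # Tukey's range test over all pairwise comparisons, via SciPy's
    # tukey_hsd (available in SciPy >= 1.8). Group data are invented.
    import numpy as np
    from scipy.stats import tukey_hsd

    a = np.array([9.1, 10.2, 11.0, 9.8, 10.5])
    b = np.array([10.9, 11.4, 12.1, 11.8, 10.7])
    c = np.array([12.5, 13.1, 12.0, 13.4, 12.8])

    result = tukey_hsd(a, b, c)
    print(result)         # pairwise differences with adjusted p-values
    print(result.pvalue)  # matrix of family-wise adjusted p-values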

Dunnett's correction


Main article: Dunnett's test


Charles Dunnett (1955, 1966; not to be confused with Dunn) described an alternative alpha error
adjustment when k groups are compared to the same control group. Now known as Dunnett's test,
this method is less conservative than the Bonferroni adjustment.
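
SciPy also ships an implementation of Dunnett's test (scipy.stats.dunnett, available from SciPy 1.11); a minimal sketch with invented samples:

    # Dunnett's test: k treatment groups compared against one control,
    # via scipy.stats.dunnett (SciPy >= 1.11). Data below are invented.
    import numpy as np
    from scipy.stats import dunnett

    control = np.array([9.8, 10.1, 10.4, 9.9, 10.2])
    treat_1 = np.array([10.9, 11.4, 12.1, 11.8, 10.7])
    treat_2 = np.array([12.5, 13.1, 12.0, 13.4, 12.8])

    result = dunnett(treat_1, treat_2, control=control)
    print(result.pvalue)  # one adjusted p-value per treatment-vs-control test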

Šidák's inequality

Main article: Šidák correction for t-test

Benjamini–Hochberg (BH) procedure

Main article: Benjamini–Hochberg procedure


The BH procedure is a step-up procedure over the m null hypotheses tested and their p-values p_(1) ≤ p_(2) ≤ … ≤ p_(m), ordered increasingly. The method then proceeds to identify the rejected null hypotheses from the above set, whilst controlling the false discovery rate (at level α) under the premise that the total m hypotheses are independent or positively correlated.
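
The step-up rule is available in statsmodels as method="fdr_bh"; a minimal sketch with invented p-values:

    # Benjamini-Hochberg step-up FDR control via statsmodels. The p-values
    # are invented; method="fdr_bh" assumes independent or positively
    # correlated hypotheses, as the text notes.
    from statsmodels.stats.multitest import multipletests

    p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]  # hypothetical
    reject, adj_p, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")

    for p, pa, r in zip(p_values, adj_p, reject):
        print(f"raw p = {p:.3f} -> BH-adjusted p = {pa:.3f}, reject = {r}")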

See also


ANOVA
The significance level (alpha) in statistical hypothesis testing
Subgroup analysis
Statistical power
Testing hypotheses suggested by the data

References

1. ^ Jaccard, J.; Becker, M. A.; Wood, G. (1984). "Pairwise multiple comparison procedures: A review". Psychological Bulletin 96 (3): 589. doi:10.1037/0033-2909.96.3.589.
2. ^ Hayter, A. J. (1986). "The Maximum Familywise Error Rate of Fisher's Least Significant Difference Test". Journal of the American Statistical Association 81 (396): 1000–1004. doi:10.2307/2289074.

Categories: Data analysis | Multiple comparisons | Clinical research | Medical statistics
