Algorithm AS 171: Fisher's Exact Variance Test for the Poisson Distribution
Author(s): E. L. Frome
Source: Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 31, No. 1 (1982), pp. 67-71
Published by: Wiley for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2347079
Accessed: 27/12/2014 01:11
This content downloaded from 136.159.235.223 on Sat, 27 Dec 2014 01:11:39 AM

Algorithm AS 171
Fisher's Exact Variance Test for the Poisson Distribution

By E. L. FROME†
Medical and Health Sciences Division, Oak Ridge Associated Universities, Oak Ridge, Tennessee, USA
[Received January 1980. Final revision December 1980]
Keywords: INDEX OF DISPERSION; HOMOGENEITY OF VARIANCE; CHI-SQUARE

LANGUAGE
Fortran 66
DESCRIPTION AND PURPOSE
Let $y_1, y_2, \ldots, y_n$ denote $n$ observations from a Poisson population, and let $f_j$, $j = 0, \ldots, k$, denote the frequency with which the number $j$ occurs in the given sample. (Note that $k = \max\{y_i, i = 1, \ldots, n\}$.) It is well known that the sample total $T = \sum_{i=1}^{n} y_i$ is sufficient for the Poisson parameter and that the index of dispersion

$$D = \sum_{i=1}^{n} (y_i - \bar{y})^2 / \bar{y}$$

is approximately distributed as a $\chi^2$ statistic with $(n - 1)$ degrees of freedom when $n$ is large and $\bar{y} = T/n$ is over 3 (Rao and Chakravarti, 1956). The index of dispersion is widely used to test for homogeneity of the observations and is referred to as the variance test for homogeneity. In many situations of practical interest, the sampling distribution of $D$ is not accurately given by the $\chi^2$ approximation, and Fisher (1950) proposed an exact test which is based on the conditional distribution

$$P(f_0, f_1, \ldots \mid T) = C\,\{[f_0!\, f_1!\, f_2! \cdots]\,[(2!)^{f_2} (3!)^{f_3} \cdots]\}^{-1}, \qquad (1)$$

where $C = T!\, n!/n^T$. The index of dispersion depends on $\sum y^2 = \sum j^2 f_j$, and if $S$ denotes the value obtained for a given value of $T$ and $n$, then the desired probability is obtained by summing (1) over all possible samples with $\sum y^2$ less than $S$ and subtracting the total from 1. That is, conditional on $T$, we compute the probability of obtaining a value of the index of dispersion that is greater than or equal to the observed value, if, in fact, the observations are from the same Poisson population.
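As an illustration of the two quantities defined above, the following minimal Python sketch evaluates the index of dispersion and the conditional probability (1) for a small sample. The function names `index_of_dispersion` and `conditional_probability` are hypothetical and are not part of the published algorithm; log factorials are taken from the standard library's `lgamma`, and a positive sample total is assumed.

```python
from collections import Counter
from math import exp, lgamma, log


def index_of_dispersion(sample):
    """Return D = sum((y_i - ybar)^2) / ybar for a list of counts (T > 0 assumed)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((y - mean) ** 2 for y in sample) / mean


def conditional_probability(sample):
    """Evaluate equation (1) for the frequency distribution of `sample`."""
    n = len(sample)
    total = sum(sample)
    freqs = Counter(sample)  # freqs[j] is f_j, the number of observations equal to j
    # log C = log T! + log n! - T log n
    log_c = lgamma(total + 1) + lgamma(n + 1) - total * log(n)
    # log of [f_0! f_1! ...][(2!)^f_2 (3!)^f_3 ...]; the terms with j = 0 or 1 add log 1 = 0
    log_denominator = sum(
        lgamma(f + 1) + f * lgamma(j + 1) for j, f in freqs.items()
    )
    return exp(log_c - log_denominator)
```

For n = 3 and T = 3 the three possible frequency patterns are represented by the samples [1, 1, 1], [0, 1, 2] and [0, 0, 3]. Their probabilities under (1) are 2/9, 2/3 and 1/9, which sum to 1, and their indices of dispersion are 0, 2 and 6.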
NUMERICAL METHOD
Partitions of the sample total $T$ are generated using a modified version of an algorithm given by Lehmer (Section 1.8, 1964). A partition of $T$ is defined by $T = a_1 + a_2 + \ldots + a_m$, where the $a_i$'s are positive integers, $a_1 \le a_2 \le \ldots \le a_m$, and $m$ is the number of "parts". We begin with $m = \min(n, T)$, and then generate all possible $m$-part partitions in dictionary order as described by Lehmer. Then, if $\sum a_i^2$ is less than the observed value $S$, the multinomial probability (1) is evaluated. Note that if $a_1^*, a_2^*, \ldots, a_m^*$ is the "last" $m$-part partition and $S^* = \sum_{i=1}^{m} a_i^{*2}$, then $S^*$ is less than $\sum a_i^2$ for every other $m$-part partition, and $S^*$ is less than $\sum_{i=1}^{k} b_i^2$, where $b_1, \ldots, b_k$ is a $k$-part partition with $k < m$. Consequently, if $S \le S^*$, then the sum of squares for all partitions with less than $m$ parts will exceed $S$, and no further computation is required. The desired probability is obtained by summing the probabilities for each "admissible" partition and subtracting this total from 1.
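The generation of m-part partitions described above can be sketched in Python as follows. This is an illustrative reading of the stepping rule rather than Lehmer's published routine: the name `m_part_partitions` is hypothetical, and the sketch assumes 1 <= m <= T. Each step finds the rightmost part (other than the last) that is at least 2 smaller than the last part, raises it and every part between it and the last to a common value, and gives the remainder to the last part.

```python
def m_part_partitions(total, m):
    """Yield the m-part partitions of `total` with non-decreasing parts."""
    if not 1 <= m <= total:
        raise ValueError("need 1 <= m <= total")
    # First partition: m - 1 ones followed by the remainder.
    a = [1] * (m - 1) + [total - (m - 1)]
    while True:
        yield tuple(a)
        # Rightmost position before the last whose part is at least 2 below the last part.
        it = m - 2
        while it >= 0 and a[m - 1] - a[it] <= 1:
            it -= 1
        if it < 0:
            return  # the partition just yielded was the "last" m-part partition
        new_value = a[it] + 1
        for j in range(it, m - 1):
            a[j] = new_value
        a[m - 1] = total - sum(a[: m - 1])


def sum_of_squares(parts):
    """Return the sum of squared parts, the quantity compared with S."""
    return sum(p * p for p in parts)
```

For T = 7 and m = 3 the sketch yields (1, 1, 5), (1, 2, 4), (1, 3, 3), (2, 2, 3), with sums of squares 27, 21, 19 and 17. The last partition has the smallest sum of squares, which is the property that allows the search to stop early.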
† Present address: Maths and Statistics Research Dept., Union Carbide Corporation, Oak Ridge, Tennessee, U.S.A.

© 1982 Royal Statistical Society 0035-9254/82/31067 $2.00

STRUCTURE
SUBROUTINE TPSHV(N, T, NTMAX, SY2, PROB, KGP, KPR, A, KF, DLF, IFAULT)

Formal parameters
N       Integer              input: sample size (n)
T       Integer              input: sample total
NTMAX   Integer              input: maximum(N, T)
SY2     Integer              input: sample sum of squares
PROB    Real                 output: probability
KGP     Integer              output: number of partitions generated
KPR     Integer              output: number of partitions for which equation (1) was evaluated
A       Integer array (T)    workspace
KF      Integer array (T)    workspace
DLF     Real array (NTMAX)   workspace: used to store log factorials
IFAULT  Integer              output: fault indicator, equal to:
                               1 if PROB is only an upper bound on the exact probability (see RESTRICTIONS);
                               2 if N < 2 or N > NTMAX;
                               3 if T < 2, T > NTMAX or T > MAXT (see RESTRICTIONS);
                               0 otherwise
RESTRICTIONS
The local constant, MAXT, provides an upper limit on the sample total for which partitions can be generated. The local constant, MAXP, is used to limit the maximum number of partitions that will be evaluated. If this limit is reached, IFAULT is set equal to 1, and the value of PROB is an upper bound on the exact probability. With MAXP = 1 000 000, the exact probability will be obtained whenever T is less than 60, provided it is not less than DMIN.
The local constant, DMIN, is used to define the smallest probability that can be computed. If the exact probability would fall below this value, then IFAULT is set equal to 1, and PROB is assigned the value of DMIN. It is suggested that DMIN be at least 100 times the precision of the computer.
TIME
The number of partitions of $T$ is approximately equal to $(4T\sqrt{3})^{-1}\exp(\pi\sqrt{2T/3})$, and execution time is proportional to the number of partitions that must be generated. It is not necessary to generate all partitions of $T$ (see NUMERICAL METHOD), and the number of partitions that must be generated depends on the observed value of the index of dispersion, which is determined by $\sum y^2$ for a given value of $n$ and $T$. The situation is complicated by the fact that the execution time increases and then decreases as $n$ increases for a given value of $T$. This is illustrated in Table 1, which contains execution times on a DEC PDP-10 for various values of $T$ and $n$. The values reported in Table 1 are CPU times (and total number of partitions generated) required to reach significance at the 0.01 and the 0.0001 levels.
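To give a feel for how quickly the work grows with T, the sketch below compares the asymptotic expression for the number of partitions with an exact count obtained by a standard dynamic-programming recurrence. Both function names are hypothetical, and the exact count is an independent check rather than part of AS 171.

```python
from math import exp, pi, sqrt


def partition_count(total):
    """Count the partitions of `total` exactly by dynamic programming."""
    counts = [1] + [0] * total  # counts[v]: partitions of v using the parts seen so far
    for part in range(1, total + 1):
        for value in range(part, total + 1):
            counts[value] += counts[value - part]
    return counts[total]


def partition_count_approx(total):
    """Asymptotic approximation exp(pi * sqrt(2T/3)) / (4T * sqrt(3))."""
    return exp(pi * sqrt(2.0 * total / 3.0)) / (4.0 * total * sqrt(3.0))
```

For T = 32 and T = 48 the exact counts are 8 349 and 147 273, the totals quoted in the footnote to Table 1. The approximation overestimates both by somewhat less than 10 per cent.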
PRECISION
If the word size of the computer is 36 bits or less, the user should specify double precision arithmetic for the calculation and accumulation of probabilities. The REAL declaration should be changed to DOUBLE PRECISION; the DATA statement modified; ALOG and EXP changed to DLOG and DEXP; and the statement labelled 200 changed to 200 PROB = SNGL(D1 - DPRB).
ADDITIONAL COMMENTS
The efficiency of the algorithm can be improved by allocating arrays A, KF and DLF for private use of the subroutine. This requires additional error checking to ensure that T does not exceed the maximum dimensions of A and KF and that N does not exceed the maximum dimension of DLF. If TPSHV is called repeatedly by another program unit, then the values of the log factorials in DLF can be computed once in the calling program unit and labelled COMMON used.

TABLE 1
Typical computer times†

                        Significance level 0.01     Significance level 0.0001
Sample     Sample       Number of                   Number of
total T    size n       partitions‡      Time       partitions‡      Time
  32          8            2 968           37          3 217           58
             16            5 265           64          6 834          104
             32            2 641           33          4 143           53
             48            1 210           15          2 641           32
             64              915           12          1 588           22
            128               97            4            373            7
  48          8           23 666          207         24 482          344
             16           87 841          873         97 590        1 268
             32           50 646          574         72 302          869
             48           22 855          278         42 398          489
             64           11 723          149         22 855          277
            128            1 212           23          4 508           65

† All calculations performed on DEC PDP-10 computer. The time shown is CPU time (in hundredths of a second) to find Fisher's exact probability for the index of dispersion for a given value of T and n. The index of dispersion value was determined so that exact probability would just reach the significance level given at the top of the column.
‡ Number of partitions of T that were generated (the total number of partitions of 32 and 48 are 8 349 and 147 273, respectively).
ACKNOWLEDGEMENT
This report is based on work performed under Contract No. DE-AC05-76OR00033 between the U.S. Department of Energy, Office of Health and Environmental Research, and Oak Ridge Associated Universities.
REFERENCES
FISHER, R. A. (1950). The significance of deviations from expectations in a Poisson series. Biometrics, 6, 17-24.
LEHMER, D. H. (1964). The machine tools of combinatorics. In Applied Combinatorial Mathematics (E. F. Beckenbach, ed.). New York: Wiley.
RAO, C. R. and CHAKRAVARTI, I. M. (1956). Some small sample tests of significance for a Poisson distribution. Biometrics, 12, 264-282.
      SUBROUTINE TPSHV(N, T, NTMAX, SY2, PROB, KGP, KPR, A, KF,
     *  DLF, IFAULT)
C
C        ALGORITHM AS 171  APPL. STATIST. (1982) VOL.31, NO.1
C
C        FISHERS EXACT VARIANCE TEST FOR THE POISSON DISTRIBUTION
C
      INTEGER T, SY2, TS, A(T), KF(T)
      REAL DD, DC, DM, DMIN, DP, D1, DZ, DPRB, DLF(NTMAX)
      DATA MAXP /1000000/, MAXT /80/
      DATA D1 /1.0E0/, DZ /0.0E0/, DMIN /1.0E-5/
      IFAULT = 0
      DPRB = DZ
      DM = D1 - DMIN
      IF (N .LT. 2 .OR. N .GT. NTMAX) IFAULT = 2
      IF (T .LT. 2 .OR. T .GT. NTMAX .OR. T .GT. MAXT) IFAULT = 3
      IF (IFAULT .NE. 0) GOTO 200
      KGP = 0
      KPR = 0
C
C        GENERATE LOG FACTORIALS AND CONSTANT
C
      DLF(1) = DZ
      DO 10 I = 2, NTMAX
   10 DLF(I) = DLF(I - 1) + ALOG(FLOAT(I))
      DC = -FLOAT(T) * ALOG(FLOAT(N)) + DLF(N) + DLF(T)
C
C        GENERATE PARTITIONS OF SAMPLE TOTAL (T) STARTING WITH
C        M THE NUMBER OF NON-ZERO VALUES IN THE SAMPLE
C
      M = MIN0(T, N)
      DO 15 J = 1, T
   15 KF(J) = 0
   20 DO 25 J = 1, M
   25 A(J) = 1
      TS = M - 1
   30 A(M) = T
      DO 35 J = 2, M
   35 A(M) = A(M) - A(J - 1)
      TS = TS + A(M) ** 2
      KGP = KGP + 1
C
C        THE CURRENT M-PART PARTITION OF T
C        A(1) A(2) ... A(M)
C        WITH SUMS OF SQUARES TS = SUM A(J) ** 2
C        IS ADMISSIBLE IF TS IS LESS THAN SY2
C
      IF (TS .GE. SY2) GOTO 80
C
C        CONVERT SAMPLE INTO FREQUENCY DISTRIBUTION.
C        KZ IS THE NUMBER OF ZERO VALUES IN THE SAMPLE.
C        KF(J) IS THE NUMBER OF A( )S EQUAL TO J.
C
      KZ = N - M
      MX = A(M)
      DO 60 J = 1, M
      IAK = A(J)
      KF(IAK) = KF(IAK) + 1
   60 CONTINUE
C
C        COMPUTE DP = PROBABILITY FOR THIS SAMPLE
C
      KPR = KPR + 1
      DD = DZ
      IF (KZ .GT. 0) DD = DLF(KZ)
      IAK = KF(1)
      IF (KF(1) .GT. 0) DD = DD + DLF(IAK)
      KF(1) = 0
      IF (MX .LT. 2) GOTO 70
      DO 65 J = 2, MX
      IF (KF(J) .EQ. 0) GOTO 65
      IAK = KF(J)
      DD = DD + DLF(IAK) + FLOAT(KF(J)) * DLF(J)
      KF(J) = 0
   65 CONTINUE
   70 DP = EXP(DC - DD)
      DPRB = DPRB + DP
      IF (DPRB .LT. DM) GOTO 80
      IFAULT = 1
      DPRB = DM
      GOTO 200
   80 CONTINUE
C
C        DETERMINE NEXT M-PART PARTITION
C        UPDATE SUM OF SQUARES
C
      IT = M - 1
   90 IF (IT .EQ. 0) GOTO 120
      IF ((A(M) - A(IT)) .GT. 1) GOTO 100
      IT = IT - 1
      GOTO 90
  100 IAT = A(IT) + 1
      M1 = M - 1
      DO 110 J = IT, M1
      TS = TS - A(J) ** 2 + IAT ** 2
      A(J) = IAT
  110 CONTINUE
      TS = TS - A(M) ** 2
C
C        NEW VALUES FOR A(1) ... A(M - 1) ARE DETERMINED
C
      IF (KGP .LT. MAXP) GOTO 30
      IFAULT = 1
      GOTO 200
C
C        END EVALUATION OF M-PART PARTITIONS.
C        DECREASE M BY 1 AND CONTINUE IF SY2
C        IS GREATER THAN THE SUM OF SQUARES
C        FOR THE LAST M-PART PARTITION.
C
  120 M = M - 1
      IF (TS .GE. SY2) GOTO 200
      IF (M .GT. 1) GOTO 20
  200 PROB = D1 - DPRB
      RETURN
      END
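For readers who wish to cross-check a compiled copy of the subroutine on small cases, the following Python sketch follows the same control flow with structured loops in place of the GOTO statements. It is an illustrative port under stated assumptions, not the published code: the function name `tpshv` and its tuple return value are hypothetical, the workspace arguments A, KF, DLF and NTMAX are dropped because Python allocates storage itself, and log factorials come from `lgamma` in double precision.

```python
from math import exp, lgamma, log

MAXP = 1_000_000  # limit on the number of partitions generated (value from the listing)
MAXT = 80         # limit on the sample total (value from the listing)
DMIN = 1.0e-5     # smallest probability returned (value from the listing)


def tpshv(n, t, sy2):
    """Return (prob, kgp, kpr, ifault) for sample size n, total t, sum of squares sy2."""
    if n < 2:
        return 1.0, 0, 0, 2
    if t < 2 or t > MAXT:
        return 1.0, 0, 0, 3
    log_c = lgamma(t + 1) + lgamma(n + 1) - t * log(n)  # log C of equation (1)
    dprb = 0.0     # accumulated probability of admissible partitions
    kgp = kpr = 0  # partitions generated / partitions evaluated
    m = min(n, t)
    while m > 1:
        a = [1] * (m - 1) + [t - (m - 1)]  # first m-part partition
        while True:
            ts = sum(x * x for x in a)
            kgp += 1
            if ts < sy2:  # admissible: evaluate equation (1)
                kpr += 1
                freq = {}
                for x in a:
                    freq[x] = freq.get(x, 0) + 1
                dd = lgamma(n - m + 1)  # log f_0!, with f_0 = n - m zero values
                dd += sum(lgamma(f + 1) + f * lgamma(j + 1) for j, f in freq.items())
                dprb += exp(log_c - dd)
                if dprb >= 1.0 - DMIN:
                    return DMIN, kgp, kpr, 1
            # Step to the next m-part partition, if there is one.
            it = m - 2
            while it >= 0 and a[-1] - a[it] <= 1:
                it -= 1
            if it < 0:
                break
            if kgp >= MAXP:
                return 1.0 - dprb, kgp, kpr, 1
            new_value = a[it] + 1
            for j in range(it, m - 1):
                a[j] = new_value
            a[-1] = t - sum(a[:-1])
        if ts >= sy2:  # last m-part partition already reaches S: stop
            break
        m -= 1
    return 1.0 - dprb, kgp, kpr, 0
```

As a hand-checkable case, the sample (8, 0, 0, 0) has n = 4, T = 8 and sum of squares 64. Every partition with two or more parts is admissible, so the exact probability is that of all eight counts falling in one cell, 4 × (1/4)^8 = 4^(-7), and `tpshv(4, 8, 64)` should return this value to within rounding error with 14 partitions generated and evaluated.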
Algorithm AS 172
Direct Simulation of Nested Fortran DO-LOOPS

By M. O'FLAHERTY and G. MACKENZIE
Department of Community Medicine, The Queen's University of Belfast, Northern Ireland
[Received August 1979. Final revision July 1981]
Keywords: K-DIMENSIONAL DO-LOOPS; SIMULATED SUBSCRIPTING; MULTI-DIMENSIONAL TABLES

LANGUAGE
Fortran 66
INTRODUCTION
The Fortran language does not provide a facility to vary dynamically the depth of a sequence of nested DO-LOOPS. Even if such a facility were generally available its usefulness might be limited by the depth to which the different dialects of Fortran permit array subscripting. Yet, in statistical applications, there are many contexts where the alternatives of dynamic nesting are awkward and inefficient.
Subroutine SIMDO provides a simple and completely general system for simulating a nest of Fortran DO-LOOPS to depth k. Gentleman (1975) described an algorithm (AS 88) which employed a simulated DO-LOOP technique to generate the $^nC_r$ combinations. The present algorithm, which in purpose and construction is materially different from AS 88, systematically generates the cell subscripts of a multi-dimensional table, making each subscript pattern available to the calling segment. Apart from obvious savings in the declaration of fixed multi-dimensional arrays, the authors anticipate that the routine will prove useful in statistical applications where the nested DO-LOOPS are unconditionally traversed.
© 1982 Royal Statistical Society 0035-9254/82/31071 $2.00
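The idea of simulating a nest of loops whose depth is known only at run time can be illustrated with an odometer-style counter. The Python sketch below is a generic illustration only: the interface of SIMDO is not shown in this extract, so the name `subscript_patterns`, its argument, and the choice to let the last subscript vary fastest are all assumptions.

```python
def subscript_patterns(limits):
    """Yield every 1-based subscript tuple for nested loops with the given upper limits."""
    k = len(limits)
    if any(limit < 1 for limit in limits):
        return  # an empty loop anywhere in the nest means no patterns at all
    index = [1] * k
    while True:
        yield tuple(index)
        # Advance like an odometer: reset exhausted inner levels, bump the next outer one.
        level = k - 1
        while level >= 0 and index[level] == limits[level]:
            index[level] = 1
            level -= 1
        if level < 0:
            return  # every level was at its limit: the nest is complete
        index[level] += 1
```

Called as `subscript_patterns([2, 3])`, the sketch visits (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), the order in which a two-deep loop nest with the second loop innermost would visit the cells of a 2 × 3 table.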

