Professional Documents
Culture Documents
Chapter 2
Kaplan-Meier Survival
Curves and the LogRank Test
Linda Staub & Alexandros Gekenidis
March 7th, 2011
1 Review
1 Review
Survivor function
= Pr( > )
2 Kaplan-Meier Curves
Example
The data: remission times (weeks) for two groups of
leukemia patients
Group 1 (n=21)
treatment
Group 2 (n=21)
placebo
6, 6, 6, 7, 10,
13, 16, 22, 23,
6+, 9+, 10+, 11+,
17+, 19+, 20+,
25+, 32+, 32+,
34+, 25+
1, 1, 2, 2, 3,
4, 4, 5, 5,
8, 8, 8, 8,
11, 11, 12, 12,
15, 17, 22, 23
+ denotes censored
Group 1
Group 2
# failed
# censored
Total
9
21
12
0
21
21
Descriptive statistic:
qj
0
6
7
10
13
16
22
23
>23
0
1
1
2
0
3
0
5
-
21
21
17
15
12
11
7
6
-
Group 1 (treatment)
6, 6, 6, 7, 10,
13, 16, 22, 23,
6+, 9+, 10+, 11+,
17+, 19+, 20+,
25+, 32+, 32+,
34+, 25+
+ denotes
censored
0
3
1
1
1
1
1
1
-
Group 2 (placebo)
1, 1, 2, 2, 3,
4, 4, 5, 5,
8, 8, 8, 8,
11, 11, 12, 12,
15, 17, 22, 23
Group 2 (placebo)
nj
t( j )
mj
qj
0
1
2
3
4
5
8
11
12
15
17
22
23
0
0
0
0
0
0
0
0
0
0
0
0
0
21
21
19
17
16
14
12
8
6
4
3
2
1
0
2
2
1
2
2
4
2
2
1
1
1
1
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
t( j )
nj
mj
qj
21
21
19/21 = .90
19
17/21 = .81
17
16/21 = .76
16
14/21 = .67
14
12/21 = .57
12
8/21 = .38
11
6/21 = .29
12
4/21 = .19
15
3/21 = .14
17
2/21 = .10
22
1/21 = .05
23
0/21 = .00
# ()
=
21
General KM formula
> () ()
=1
= (1) > () ()
Proof: blackboard
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
nj
mj
qj
()
21
21
118/21=.8571
17
.857116/17=.8067
10 15
.806714/15=.7529
13 12
.752911/12=.6902
16
11
.690210/11=.6275
22
.62756/7=.5378
23
.53785/6=.4482
Fraction at () :
Pr > () () )
Not available at t j
failed prior to t j
Censored prior to
t j
= censoring time
= min(, )
= 1*+
()(1 ())
()(1 ())
=1
=1
=1
And satisfies
= max , where is the space of all survival functions
One can show that the Kaplan-Meier estimator maximizes the likelihood
KM-estimator is the NPMLE
# in risk set
t j
m1 j m2 j
n1 j
n2 j
1
2
3
4
5
6
7
8
10
11
12
13
15
16
17
22
23
0
0
0
0
0
3
1
0
1
0
0
1
0
1
0
1
1
21
21
21
21
21
21
17
16
15
13
12
12
11
11
10
7
6
21
19
17
16
14
12
12
12
8
8
6
4
4
3
3
2
1
2
2
1
2
2
0
0
4
0
2
12
0
1
0
1
1
1
e1 j
n1 j
n n
2j
1j
Proportion
in risk set
e2 j
n2 j
n n
2j
1j
m1 j m2 j
# of failures
over both
groups
m1 j m2 j
Oi Ei
# failure times
j 1
ij
eij
O1 E1 10.26
O2 E2 10.26
Log-rank statistic
O2 E2 2
=
Var O2 E2
O2 E2 2
Var O2 E2
12
2 2
2 2
R-code
> time <c(6,6,6,7,10,13,16,22,23,6,9,10,11,17,19,20,25,32,32,34,35,1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,
12,12,15,17,22,23)
> status <c(1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
> treatment <c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
> fit <- survdiff(Surv(time, status) ~ treatment)
Result
> fit
Call:
survdiff(formula = Surv(time, status) ~ treatment)
N Observed Expected (O-E)^2/E (O-E)^2/V
treatment=1 21
9
19.3
5.46
16.8
treatment=2 21
21
10.7
9.77
16.8
Chisq = 16.8 on 1 degrees of freedom, p = 4.17e-05
Remarks
Tarone-Ware
Peto
Flemington-Harrington
Weighting the
Test statistic:
Weight at jth
failure time
Remarks
Choosing a Test
Remission data
Stratified variable: 3-level variable (LWBC3) indicating
low, medium, or high log white blood cell count (coded 1,
2, and 3, respectively)
Treated Group: rx = 0
Placebo Group: rx = 1
Recall: Non-stratified test 2 -value of 16.79
and corresponding p-value rounded to 0.0000
R-code
> data <- read.table("http://www.sph.emory.edu/~dkleinb/surv2datasets/anderson.dat")
> lwbc3 <c(1,1,1,2,1,2,2,1,1,1,3,2,2,2,2,2,3,3,2,3,3,1,2,2,1,1,3,3,1,3,3,2,3,3,3,3,2,3,3,3,2,3)
> fit <- survdiff(Surv(data$V1,data$V2)~data$V5+strata(lwbc3))
Result
> fit
Call:
survdiff(formula = Surv(data$V1, data$V2) ~ data$V5 + strata(lwbc3))
N Observed Expected (O-E)^2/E (O-E)^2/V
data$V5=0 21
9
16.4
3.33
10.1
data$V5=1 21
21
13.6
4.00
10.1
Chisq = 10.1
References
KLEINBAUM, D.G. and KLEIN, M. (2005).
Survival Analysis. A self-learning text.
Springer.
MAATHUIS, M. (2007). Survival analysis for
interval censored data. Part I.