
LINEAR STATISTICAL MODELS

SYS 4021

Project 3:
Design Improvements for
the University of Virginia Transplant
Center
Donald E. BROWN
brown@virginia.edu

Laura BARNES
lb3dp@virginia.edu

Summary
This study examines the number of kidney and liver transplants at UVA and compares UVA with the MCV and
Duke centers, both overall and by ethnic group, with particular attention to minorities. UVA has the smallest
trend in the number of kidney transplants, overall and in the non-white group, compared with the two other
centers over the period 1988-2012. The t-test shows a difference between the number of transplants at UVA
and the other centers at the 5% level. The 95% bootstrap confidence interval of the mean difference also
indicates that I can reject the null hypothesis that the mean difference is zero. A time series linear model is
constructed to predict the mean difference between two centers in 2013. The results from bootstrap and
Monte-Carlo simulation reveal that the 95% prediction confidence interval does not contain zero, meaning
that the predicted numbers of kidney transplants differ. The negative confidence interval indicates that the
predicted number of kidney transplants at UVA in 2013, overall and for non-whites, is less than the predicted
number at MCV and Duke. This suggests that UVA should do better at recruiting patients overall and patients
from other ethnicities. For liver transplants, it is hard to conclude that building the new Roanoke center in
2005 has increased the number of liver transplants at UVA. The linear model and the Poisson model for the
number of liver transplants give contradictory results. Based on the linear model with time series, the p-value
of the Roanoke variable is 0.014, which is less than 0.05. With this model, I can reject the null hypothesis at
the 5% level and conclude that building the Roanoke center has increased the number of liver transplants. In
the Poisson model with time series, however, the Roanoke variable is not significant at the 5% level. This
suggests that UVA should do more research on liver transplants, and it may be useful to collect data
separately at the UVA Charlottesville and UVA Roanoke centers.
Honor pledge: On my honor, I pledge that I am the sole author of this paper and I have accurately cited all
help and references used in its completion.

Imran A. Khan
December 7, 13

1. Problem Description
1.1. Situation
Organ transplantation replaces diseased or damaged organs with functioning organs from either deceased or
living donors. The complexity of these procedures requires highly skilled teams of physicians, nurses, and
support staff as well as facilities for the surgery and recovery. According to research from the United States,
the need for organ donation has become a growing concern over the last decade as the gap between organ
donors and those awaiting transplants widens [4].
The University of Virginia has conducted organ transplantation for more than 30 years and now provides
services for kidney, pancreas, liver, islet, heart, and lung transplantation [2]. The UVA Health System is
consistently ranked as one of the Top 100 Hospitals in America [5]. The availability of first-rate
transplantation services is a component of these rankings. The Transplant Center desires to continue to
increase the number of transplants in all categories but needs guidance on how to achieve this goal [2].
Organ transplantation processes have five primary steps [2]:
1. Referral from a primary care physician;
2. Determination of eligibility and placement on a waiting list;
3. Matching of donor organ with the patient;
4. Acceptance of the organ by the transplant center;
5. Transplantation surgery and recovery.
Figure 1 displays the total number of kidney transplants and donors from the 11 geographic regions. It shows
that the number of transplants is always higher than the number of donors. The center with the most kidney
transplants in recent years is MCV, and the centers with the fewest in the last few years are UVA and UNC.
Duke performed more kidney transplants in 2000-2005 than all other centers.
Figure 1. Plot Kidney Transplants

Table 1 shows that MCV has the highest number of kidney transplants in a single year, while UVA and UNC
have the lowest number in a single year. On average, Duke has more kidney transplants per year than the
other centers.
Table 1. Summary statistics on number of kidney transplants by center/region
Statistics                    UVA      UNC      MCV      Duke     R11Donor
Most kidney transplants       107      90       137      121      1337
Least kidney transplants      24       24       29       43       456
Kidney transplants in 2012    68       90       136      74       1304
Average per year              65.8     54.64    75.64    81.76    956.52
Standard error of the mean    4.16     3.3      7.24     4.88     57.08

MCV appears to have the largest trend in kidney transplants and UVA the smallest over the period
1988-2012. UVA and Duke have nearly parallel trends in kidney transplants over this period (Figure 2).
Figure 2. Kidney transplants trend over the period of time by centers

Taking ethnic group into account, I observe that UVA performs more kidney transplants for white patients,
while MCV performs more for minorities, or non-whites (Figure 3). This means that UVA has to improve its
number of kidney transplants for non-whites. Duke performs more kidney transplants for white patients at the
beginning of the period but more for non-white patients than for white patients at the end of the period.
Comparing the number of non-white kidney transplants across the centers, I can see that UVA has the poorest
performance with the smallest trend and MCV has the best performance with the largest trend (Figure 4).

Figure 3. Plot of kidney transplants by ethnic group for each center

Figure 4. Plot of kidney transplants by center for each ethnic group

Figure 5. Scatterplot matrix for kidney transplants between centers and region

Figure 5 shows that the total number of kidney transplants, the number of donors, and the number of kidney
transplants at each center are strongly positively correlated. This is easily explained, since all these numbers
have been increasing over the last 10 years.
Unlike kidney transplants, there are more liver donors than liver transplants, as shown in Figure 6. In general,
the numbers at the four centers fluctuate up and down over the years. UVA reaches its highest number of liver
transplants in 2009, followed by a drop in 2010. In addition, all centers perform more liver transplants for
white patients than for non-white patients (Figures 7 and 8).
Figure 6. Plot of liver transplants

Figure 7. Plot of liver transplants by ethnic group for each center

Figure 8. Plot of liver transplants by center for each ethnic group

Table 2 summarizes the statistics on number of liver transplants. On average, UVA has more liver
transplants than the other centers. Also, it has the highest liver transplants in 2012.
Table 2. Summary statistics on number of liver transplants
Statistics                    UVA      UNC      MCV      Duke
Most liver transplants        87       73       66       67
Least liver transplants       16       11
Liver transplants in 2012     68       31       60       67
Average per year              46.52    38.68    45.24    38.08
Standard error of the mean    4.575    4.398    2.815    2.542

Based on the above situation, I can see that MCV and Duke do a better job than UVA in overall kidney
transplants. UVA also has the poorest performance in kidney transplants for non-whites compared with the
other centers. This motivates me to compare kidney transplants at UVA with MCV, since MCV has the largest
trend, that is, a stable increase overall. MCV also has demographics similar to UVA, so it is a good choice for
comparing kidney transplants overall and in the non-white group. In addition, I am interested in comparing
UVA and Duke since they show similar trends over the period.
There is no need to compare the number of liver transplants between UVA and the other centers, since UVA
has performed well over the period. It should be noted, however, that UVA built a new center in Roanoke in
2005. As shown in Figure 6, the number of liver transplants at UVA tends to increase substantially from
2005. I am therefore interested in whether building the new center has increased the number of liver
transplants.

1.2. Goal
The aim of this study is to propose design improvements that could increase the number of kidney and liver
transplants at UVA. For kidney transplants, the goal is to determine how to increase the number of
transplants, overall and in the non-white group, by comparison with the MCV and Duke centers. For liver
transplants, the goal is to assess whether the new Roanoke center, built in 2005, increased the number of
transplants.
1.3. Metrics
For the kidney transplant analysis, the differences between the number of kidney transplants at UVA and
MCV and at UVA and Duke are used as response variables in linear regression models. For the liver
transplant analysis, the number of liver transplants at UVA is used as the response variable in a linear
regression model. Adjusted R2, AIC, and MSE are used to compare the performance of the fitted models. For
the liver transplant analysis, a Poisson model is also considered. The diagnostic plots (Normal Q-Q,
Residuals vs. Fitted, Residuals vs. Leverage, Scale-Location), the autocorrelation function (ACF), and the
partial autocorrelation function (PACF) are used to investigate whether an autoregressive term is needed in
the model. The bootstrap method is also applied to estimate the regression parameters, with B=2000 bootstrap
samples. This includes constructing 95% percentile and BCa confidence intervals.
I use a significance level of 5% throughout this study. If the p-value is less than 0.05, the null hypothesis is
rejected in favor of the alternative. If the p-value is greater than 0.05, the null hypothesis is not rejected.
1.4. Hypotheses
Hypothesis 1:
H0: There is no difference between number of kidney transplants at UVA and MCV in 2013
H1: There is a difference between number of kidney transplants at UVA and MCV in 2013
Hypothesis 2:
H0: There is no difference between number of non-white kidney transplants at UVA and MCV
H1: There is a difference between number of non-white kidney transplants at UVA and MCV
Hypothesis 3:
H0: There is no difference between number of kidney transplants at UVA and Duke in 2013
H1: There is a difference between number of kidney transplants at UVA and Duke in 2013
Hypothesis 4:
H0: There is no difference between number of non-white kidney transplants at UVA and Duke
H1: There is a difference between number of non-white kidney transplants at UVA and Duke
Hypothesis 5:
H0: Building the Roanoke center does not increase the number of liver transplants at UVA
H1: Building the Roanoke center increases the number of liver transplants at UVA

2. Approach
2.1. Data
The data for this study come from the Organ Procurement and Transplantation Network (OPTN) [1] for four
transplant centers and two regions: the US and Region 11 (Kentucky, North Carolina, South Carolina,
Tennessee and Virginia). The data give the number of transplants performed at each center or region and the
background of the patients (age and ethnic group). There are 15 databases related to organ transplant data in
csv (comma-separated values) format, i.e. USdonor.csv, UStransplant.csv, R11donor.csv, R11xplant.csv,
UVAxplant.csv, Dukexplant.csv, MCVxplant.csv, UNC.csv, MCVage.csv, MCVethnic.csv, R11age.csv,
R11ethnic.csv, UVAage.csv, and UVAethnic.csv.
In total, there are 26 observations in each data set, showing the number of organ transplants from 1988 to
2013. In this study, I only consider the number of kidney and liver transplants at both the region and the
center level. The series of kidney and liver transplants over the last 26 years are tabulated in Tables A1 and
A2 (Appendix A). There are no missing values in the data, but the 26th observation (2013) is not included in
the analysis since the data are incomplete for that year. For the liver transplant data at UVA, there is one
additional binary variable that indicates the period before (0) and after (1) building the Roanoke center.
2.2. Analysis
2.2.1. Kidney Transplants Analysis

To predict the difference in kidney transplants between UVA and the other centers in 2013, linear regression
models are built using R software. Before model building, I make a time series plot of kidney transplants for
each center and a time series plot of the difference in kidney transplants between UVA and the other centers
(MCV and Duke). A classical paired t-test is also performed to see whether there is a difference in kidney
transplants from 1988 to 2012. This test works best on normally distributed data, so if the assumption of
normality is violated, the bootstrap method is applied.
To test my hypotheses related to kidney transplants, I build 4 linear regression models, using the r11donor
variable to control for Region 11:
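Model 1: $(\mathrm{UVA}_{K} - \mathrm{MCV}_{K}) = \beta_0 + \beta_1\,\mathrm{R11}_{K} + \varepsilon$
Model 2: $(\mathrm{UVA}_{K} - \mathrm{Duke}_{K}) = \beta_0 + \beta_1\,\mathrm{R11}_{K} + \varepsilon$
Model 3: $(\mathrm{UVA}_{NW} - \mathrm{MCV}_{NW}) = \beta_0 + \beta_1\,\mathrm{R11}_{NW} + \varepsilon$
Model 4: $(\mathrm{UVA}_{NW} - \mathrm{Duke}_{NW}) = \beta_0 + \beta_1\,\mathrm{R11}_{NW} + \varepsilon$
where:
- $\mathrm{UVA}_{K}$ is the number of kidney transplants at UVA,
- $\mathrm{MCV}_{K}$ is the number of kidney transplants at MCV,
- $\mathrm{Duke}_{K}$ is the number of kidney transplants at Duke,
- $\mathrm{UVA}_{NW}$ is the number of kidney transplants of non-whites at UVA,
- $\mathrm{MCV}_{NW}$ is the number of kidney transplants of non-whites at MCV,
- $\mathrm{Duke}_{NW}$ is the number of kidney transplants of non-whites at Duke,
- $\mathrm{R11}_{K}$ is the number of kidney transplants at region 11, and
- $\mathrm{R11}_{NW}$ is the number of kidney transplants of non-whites at region 11.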
The stages of model building are as follows:
1. Fit Models 1-4 and check whether the residuals of the fitted models are correlated by looking at the
ACF and PACF plots. Also, examine the graphical diagnostic plots to investigate how well the regression
assumptions are satisfied.
2. Determine the number of autoregressive (AR) terms needed to model the residuals by examining the
AIC plot.
3. Fit the time series regression models, i.e. Models 1-4 after accounting for AR terms, and then examine
the diagnostic plots and the ACF and PACF plots of the residuals of the time series regression models.
4. Assess the fitted models before and after accounting for AR terms by using the adjusted R2, AIC, and
MSE criteria.
5. Apply the bootstrap method to the time series regression models to estimate the standard errors and
confidence intervals of the parameters.
6. Predict the difference in kidney transplants between UVA and the other centers in 2013 from the time
series regression models by using the bootstrap and Monte-Carlo simulation approaches.
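As a minimal sketch of stage 1, the fit-then-check-residuals step could look as follows. The report's analysis was done in R; this Python version is only an illustration, the data values are hypothetical (not the OPTN series), and the helper names `fit_simple_ols` and `autocorrelation` are introduced here for the example.

```python
import statistics


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    mean = statistics.fmean(series)
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean)
                for t in range(lag, len(series)))
    return numer / denom


# Hypothetical values for illustration only (not the OPTN data).
region_counts = [500, 540, 600, 650, 700, 760, 800, 870, 900, 960]
center_diff = [-2, -5, -4, -9, -12, -10, -15, -19, -18, -24]

b0, b1 = fit_simple_ols(region_counts, center_diff)
residuals = [d - (b0 + b1 * r) for r, d in zip(region_counts, center_diff)]

# Approximate 95% band used on ACF plots: +/- 1.96 / sqrt(n).
band = 1.96 / len(residuals) ** 0.5
lag1 = autocorrelation(residuals, 1)
print(f"slope={b1:.4f}, lag-1 residual autocorrelation={lag1:.3f}, band=+/-{band:.3f}")
```

A lag-1 autocorrelation outside the band would be the kind of evidence that motivates adding AR terms in stages 2 and 3.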
2.2.2. Liver Transplants Analysis

To answer my hypothesis related to liver transplants, I start by performing a classical two-sample t-test to
see whether there is a difference between the number of liver transplants before (1988-2004) and after
(2005-2012) building the Roanoke center. A non-parametric alternative, the Wilcoxon test, is also performed
in case the normality assumption of the t-test is violated. The time series structure of liver transplants is also
considered in this analysis. The residuals of the time series model are then used in the t-test and the Wilcoxon
test to compare the number of liver transplants before and after building the Roanoke center. Furthermore, I
consider a linear model and a Poisson model to test whether the number of liver transplants increases after
building the Roanoke center, controlling for Region 11 in the model.
The stages of building linear model are as follows:
1. Consider a linear model with Region 11 and Roanoke center as predictor variables:
Model 5: $\mathrm{UVA}_{L} = \beta_0 + \beta_1\,\mathrm{R11}_{L} + \beta_2\,\mathrm{Roanoke} + \varepsilon$
where $\mathrm{UVA}_{L}$ is the number of liver transplants at UVA, $\mathrm{R11}_{L}$ is the Region 11 variable, and
$\mathrm{Roanoke} = 1$ after building the Roanoke center (from 2005) and $\mathrm{Roanoke} = 0$ before.
2. Examine diagnostic plots to check if regression assumptions are satisfied.
3. Examine ACF and PACF plots to check if the residual series are correlated.
4. Use the AIC plot to determine the number of AR terms needed.
5. Assess the fitted model before and after accounting for AR terms by using adjusted R2, AIC, and MSE.
6. Apply the bootstrap method to the time series regression model to estimate the standard error and
confidence interval of the coefficient of the Roanoke center variable.
The stages of building Poisson model are as follows:
1. Consider a Poisson model with Region 11 as a predictor variable:
Model 6.1: $\log(\mu) = \beta_0 + \beta_1\,\mathrm{R11}_{L}$
where $\mu$ is the mean number of liver transplants at UVA and $\mathrm{R11}_{L}$ is the Region 11 variable.
2. Examine diagnostic plots to check if regression assumptions are satisfied. Also, examine ACF and PACF
plots to check if the residual series are correlated.
3. Check the dispersion of the fitted Poisson model. If the dispersion is not close to 1, then consider a
quasi-Poisson model.
4. Consider a (quasi-)Poisson model with a time series component:
Model 6.2: $\log(\mu) = \beta_0 + \beta_1\,\mathrm{R11}_{L} + \phi_1 e_{t-1}$
where $\phi_1 e_{t-1}$ is the lag-1 autoregressive term.
5. Examine the diagnostic plots and the ACF and PACF plots of the model fitted in step 4.
6. Add the Roanoke variable to the (quasi-)Poisson model with time series:
Model 6.3: $\log(\mu) = \beta_0 + \beta_1\,\mathrm{R11}_{L} + \beta_2\,\mathrm{Roanoke} + \phi_1 e_{t-1}$
7. Examine the diagnostic plots and the ACF and PACF plots of the model fitted in step 6.
8. Perform a chi-square test between the time series (quasi-)Poisson models with and without the Roanoke
center variable.
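The dispersion check in step 3 can be illustrated with a small sketch. It assumes the usual Pearson estimate (Pearson chi-square divided by residual degrees of freedom) and, to stay self-contained, an intercept-only Poisson model whose fitted mean is simply the sample mean. The counts are hypothetical and `pearson_dispersion` is a name introduced here for illustration.

```python
import statistics


def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi2 / (len(counts) - n_params)


# Hypothetical yearly counts for illustration only (not the OPTN data).
counts = [16, 22, 30, 28, 41, 35, 52, 47, 60, 68]

# For an intercept-only Poisson model the maximum-likelihood fitted mean
# is the sample mean, so every observation shares the same fitted value.
mu_hat = statistics.fmean(counts)
dispersion = pearson_dispersion(counts, [mu_hat] * len(counts), n_params=1)

print(f"fitted mean={mu_hat:.1f}, dispersion={dispersion:.2f}")
```

A value well above 1, as with these made-up counts, indicates overdispersion and would point toward the quasi-Poisson model named in step 3.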

3. Evidence
3.1. Kidney Transplants
Figure 9 shows the series of kidney transplants at UVA, MCV, and Duke from 1988 to 2012 and their
corresponding ACF and PACF plots. The series plots show non-stationarity, and the correlograms (ACF)
show that the series are correlated: lags 1-2 are significant for kidney transplants at UVA, lags 1-5 for MCV,
and lags 1-3 for Duke. A similar result is observed for the series of kidney transplants of non-whites, as
shown in Figure 10: the ACF plot shows sinusoidal decay and the PACF plot cuts off after lag 1.

Figure 9. Kidney transplants series and its corresponding ACF and PACF plots

Figure 10. Kidney transplants series of non-whites and its corresponding ACF and PACF plots


A classical paired t-test is performed on the number of kidney transplants at UVA and the other centers from
1988 to 2012. The results are summarized in Table 3. On average, the number of kidney transplants at UVA
is smaller than at the other centers, so the mean differences are negative. For non-white kidney transplants, I
can reject the null hypothesis at the 5% level: there is a difference between UVA and MCV and between
UVA and Duke, with p-values smaller than 0.0001. For the overall number of kidney transplants at UVA and
MCV, however, I do not have strong evidence to reject the null hypothesis of zero difference at the 5% level
(p-value = 0.079). One reason may be that the sample size is very small, only 25 observations. Also, the
t-test may not be valid for these data because it is a parametric method that relies on the assumption of
normally distributed data. The histograms and QQ plots in Figure B1 (Appendix B) show that the differences
between kidney transplants at UVA and the other centers are not normal.
Table 3. Paired t-test on number of kidney transplants at UVA and other centers
                                                 Mean                                  95% CI     95% CI
Comparison               Mean (a)   Mean (b)     difference   t-test   DF   p-value    lower      upper
Overall kidney transplants
UVA (a) vs. MCV (b)      65.8       75.64        -9.84        -1.834   24   0.079      -20.913    1.233
UVA (a) vs. Duke (b)     65.8       81.76        -15.96       -3.828   24   0.001      -24.565    -7.355
Non-whites kidney transplants
UVA (a) vs. MCV (b)      17.08      49.88        -32.80       -7.751   24   <0.0001    -41.534    -24.066
UVA (a) vs. Duke (b)     17.08      40.32        -23.24       -8.918   24   <0.0001    -28.618    -17.862
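The entries of Table 3 are linked by the standard paired t-test relations: the standard error is the mean difference divided by the t statistic, and the 95% interval is the mean difference plus or minus the critical value times that standard error. The short check below recomputes the intervals from the reported means and t statistics. The critical value 2.0639 is the 0.975 quantile of the t distribution with 24 degrees of freedom, and the function name `ci_from_t` is introduced here for illustration.

```python
T_CRIT_24_DF = 2.0639  # 0.975 quantile of Student's t with 24 degrees of freedom


def ci_from_t(mean_diff, t_stat, t_crit=T_CRIT_24_DF):
    """Recover the standard error and 95% CI from a mean difference and its t statistic."""
    std_err = abs(mean_diff / t_stat)
    half_width = t_crit * std_err
    return std_err, (mean_diff - half_width, mean_diff + half_width)


# (mean difference, t statistic) as reported in Table 3.
reported = {
    "Overall, UVA vs. MCV": (-9.84, -1.834),
    "Overall, UVA vs. Duke": (-15.96, -3.828),
    "Non-whites, UVA vs. MCV": (-32.80, -7.751),
    "Non-whites, UVA vs. Duke": (-23.24, -8.918),
}

intervals = {}
for label, (mean_diff, t_stat) in reported.items():
    std_err, interval = ci_from_t(mean_diff, t_stat)
    intervals[label] = interval
    print(f"{label}: se={std_err:.3f}, CI=({interval[0]:.3f}, {interval[1]:.3f})")
```

The recomputed intervals agree with the tabulated ones to rounding, which confirms that only the overall UVA vs. MCV interval includes zero.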

Because the normality assumption is questionable, I apply the bootstrap method to the data; the results are
shown in Table 4. Unlike the t-test result, the 95% confidence interval of the difference in kidney transplants
between UVA and MCV does not contain zero. This means that there is a difference between the overall
number of kidney transplants at UVA and MCV at the 5% level. The histograms of the bootstrap estimates of
the difference in kidney transplants appear normally distributed (Figures B2-B5, Appendix B).
Table 4. Bootstrap estimate on kidney transplants difference and confidence interval
                                   Standard
Comparison           Original      error       95% Percentile CI     95% BCa CI
Overall kidney transplants
UVA vs. MCV          -9.84         5.317       (-20.440, -0.001)     (-21.138, -0.253)
UVA vs. Duke         -15.96        4.061       (-24.12, -8.28)       (-24.26, -8.41)
Kidney transplants for non-whites
UVA vs. MCV          -32.80        4.164       (-41.20, -25.08)      (-41.85, -25.24)
UVA vs. Duke         -23.24        2.561       (-28.60, -18.24)      (-28.72, -18.44)
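A percentile bootstrap interval for a mean difference can be sketched as follows. This is an illustration under stated assumptions, not the code behind Table 4: the yearly differences are hypothetical, the resampling uses Python's standard `random` module with a fixed seed, and only the percentile interval is shown (the BCa interval needs additional bias and acceleration corrections that are omitted here).

```python
import random
import statistics


def bootstrap_mean_ci(diffs, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI and bootstrap standard error for the mean of diffs."""
    rng = random.Random(seed)
    n = len(diffs)
    # Resample the observed differences with replacement and record each mean.
    boot_means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_boot)
    )
    alpha = (1 - level) / 2
    lower = boot_means[round(alpha * n_boot)]
    upper = boot_means[round((1 - alpha) * n_boot) - 1]
    return statistics.stdev(boot_means), (lower, upper)


# Hypothetical yearly differences between two centers (not the OPTN data).
diffs = [-3, -12, 5, -20, -8, -15, 2, -25, -10, -18, -6, -30, -1, -14, -9]

boot_se, (ci_low, ci_high) = bootstrap_mean_ci(diffs)
print(f"mean={statistics.fmean(diffs):.2f}, bootstrap se={boot_se:.2f}, "
      f"95% percentile CI=({ci_low:.2f}, {ci_high:.2f})")
```

With B=2000 resamples, as in this study, the interval endpoints are the 2.5th and 97.5th percentiles of the resampled means; an interval that excludes zero corresponds to rejecting a zero mean difference at the 5% level.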

Figure 11 depicts the above analysis. I can observe that there is a difference between kidney transplants at
UVA and the other centers over the years. The differences in non-white kidney transplants between UVA and
MCV and between UVA and Duke tend to decrease over the years. Meanwhile, the differences based on the
overall number of kidney transplants fluctuate up and down.


Figure 11. Plot of difference between kidney transplants at UVA and other centers

Figure 12 displays the ACF and PACF plots of the differences in the number of kidney transplants between
UVA and MCV and between UVA and Duke. They show that the difference series are correlated, so it may
be necessary to consider an autoregressive model.
Figure 12. ACF and PACF plot of difference of number of kidney transplants

Table 5 summarizes the estimates, standard errors, and p-values of the regression parameters for Models 1-4.
The overall model is significant at the 5% level for every model except Model 2. The diagnostic plots
indicate that the regression assumptions are violated. The residuals vs. fitted plots show non-constant
variance and lack of fit. Also, the residuals do not follow a Gaussian distribution based on the QQ plots
(Figures 13-16).


Table 5. Estimate and standard error of linear model with mean difference of kidney transplants as the response
Model      Parameter   Estimate   (se)        P-value
Model 1    β0          43.079     (15.789)    0.012
           β1          -0.055     (0.016)     0.002
           Overall model: F-statistic: 12.19 on 1 and 23 DF, p-value: 0.001967
Model 2    β0          -7.042     (15.054)    0.644
           β1          -0.009     (0.015)     0.543
           Overall model: F-statistic: 0.3809 on 1 and 23 DF, p-value: 0.5432
Model 3    β0          0.642      (4.446)     0.886
           β1          -0.262     (0.031)     <0.001
           Overall model: F-statistic: 73.13 on 1 and 23 DF, p-value: 1.34e-08
Model 4    β0          -9.281     (4.514)     0.051
           β1          -0.109     (0.031)     0.002
           Overall model: F-statistic: 12.36 on 1 and 23 DF, p-value: 0.001857

Figure 13. Diagnostic plots for model 1

Figure 14. Diagnostic plots for model 2


Figure 15. Diagnostic plots for model 3

Figure 16. Diagnostic plots for model 4

The ACF and PACF plots of the residuals for Models 1-4 show that the series are correlated, and it may be
necessary to consider AR(1) to model the residuals since the sample PACF has no significant peaks after
lag 1, especially for Models 1 and 3 (Figure 17). The AIC plot for different numbers of lags in Figure 18
indicates that AR(1) may be adequate for the residuals of Models 1 and 3 and AR(4) for the residuals of
Model 2. I do not need an AR term for the residuals of Model 4 since its PACF plot shows no significant lag.


Figure 17. ACF and PACF plots of linear model residuals

Figure 18. AIC plot for AR model

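My final linear models after accounting for AR terms can be written as follows:
Model 1: $(\mathrm{UVA}_{K} - \mathrm{MCV}_{K}) = \beta_0 + \beta_1\,\mathrm{R11}_{K} + \phi_1 e_{t-1} + \varepsilon$
Model 2: $(\mathrm{UVA}_{K} - \mathrm{Duke}_{K}) = \beta_0 + \beta_1\,\mathrm{R11}_{K} + \phi_1 e_{t-1} + \phi_2 e_{t-2} + \phi_3 e_{t-3} + \phi_4 e_{t-4} + \varepsilon$
Model 3: $(\mathrm{UVA}_{NW} - \mathrm{MCV}_{NW}) = \beta_0 + \beta_1\,\mathrm{R11}_{NW} + \phi_1 e_{t-1} + \varepsilon$
Model 4: $(\mathrm{UVA}_{NW} - \mathrm{Duke}_{NW}) = \beta_0 + \beta_1\,\mathrm{R11}_{NW} + \varepsilon$
where the subscript K denotes all kidney transplants, NW denotes kidney transplants of non-whites, and
$\phi_j e_{t-j}$ is the lag-$j$ autoregressive term. Model 2 carries the four AR terms and Model 3 a single AR
term, in line with the AR orders chosen above and the estimates in Table 6.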


Table 6. Estimate and standard error of linear model after accounting autoregressive terms
Model      Parameter   Estimate   (se)        P-value
Model 1    β0          58.061     (12.022)    <0.001
           β1          -0.07      (0.012)     <0.001
           φ1          0.708      (0.157)     <0.001
           Overall model: F-statistic: 25.69 on 2 and 21 DF, p-value: 2.279e-06
Model 2    β0          -2.046     (17.092)    0.906
           β1          -0.013     (0.016)     0.425
           φ1          0.577      (0.229)     0.024
           φ2          -0.284     (0.271)     0.310
           φ3          0.072      (0.267)     0.791
           φ4          -0.473     (0.217)     0.045
           Overall model: F-statistic: 3.79 on 5 and 15 DF, p-value: 0.0203
Model 3    β0          0.523      (4.65)      0.911
           β1          -0.262     (0.031)     <0.001
           φ1          0.317      (0.214)     0.154
           Overall model: F-statistic: 36.17 on 2 and 21 DF, p-value: 1.577e-07
Model 4    β0          -9.281     (4.514)     0.051
           β1          -0.109     (0.031)     0.002
           Overall model: F-statistic: 12.36 on 1 and 23 DF, p-value: 0.001857

Table 6 summarizes the estimated parameters of the time series regression models. The overall model is
significant at the 5% level for all four models. The estimated parameters for Model 4 are the same as in
Table 5 since no autoregressive term is considered. The diagnostic plots for Models 1-3 after accounting for
AR terms show moderate violations of constant variance and of the Gaussian distribution. The ACF plots
show no serial correlation. Based on the plots in Figures 19-22, it appears that I have accounted for everything.
Figure 19. Diagnostic plot of model 1 after accounting AR terms


Figure 20. Diagnostic plot of model 2 after accounting AR terms

Figure 21. Diagnostic plot of model 3 after accounting AR terms

Figure 22. ACF and PACF plots of residuals of model 1-3 after accounting AR terms (panels: Model 1,
Model 2, Model 3)


I also apply the bootstrap method to the time series regression models. The results are similar to those in
Table 6. The confidence intervals for the regression parameters of Model 1 do not contain zero, as shown in
Table 7. This agrees with the t-tests for the time series regression model with p-values < 0.0001, which reject
the null hypotheses that the coefficients of r11donor and AR(1) are equal to zero.
Table 7. Bootstrap results on time series regression model
Model      Parameter   Original   Bias       Std. Error   95% Percentile CI       95% BCa CI
Model 1    β0*         58.06      -0.0921    10.767       (37.0800, 77.770)       (36.120, 77.310)
           β1*         -0.07      0.0002     0.011        (-0.0906, -0.0500)      (-0.0913, -0.0505)
           φ1*         0.71       0.0007     0.151        (0.4172, 1.0109)        (0.4210, 1.0171)
Model 2    β0*         -2.05      0.0399     14.321       (-30.449, 25.506)       (-31.879, 23.051)
           β1*         -0.01      0.0000     0.013        (-0.0384, 0.0124)       (-0.0398, 0.0119)
           φ1*         0.58       0.0050     0.196        (0.2057, 0.9814)        (0.2123, 0.9888)
           φ2*         -0.28      -0.0068    0.234        (-0.7543, 0.1728)       (-0.7549, 0.1686)
           φ3*         0.07       0.0089     0.227        (-0.3731, 0.5344)       (-0.3936, 0.5158)
           φ4*         -0.47      -0.0047    0.185        (-0.8412, -0.1029)      (-0.8115, -0.0784)
Model 3    β0*         0.52       0.0593     4.275        (-7.8556, 9.0951)       (-7.5756, 9.3440)
           β1*         -0.26      -0.0002    0.029        (-0.3215, -0.2049)      (-0.3185, -0.2028)
           φ1*         0.32       0.0046     0.200        (-0.0877, 0.7176)       (-0.0979, 0.7044)
Model 4    β0*         -9.28      0.0439     4.315        (-17.351, -0.625)       (-16.651, 0.167)
           β1*         -0.11      0.0001     0.030        (-0.1672, -0.0522)      (-0.1672, -0.0521)

Based on the adjusted R2, AIC, and MSE criteria, the regression models after accounting for AR terms
perform better than the models before accounting for AR terms. Adjusted R2 is higher for the time series
regression models, which means that modeling the residuals with an autoregressive model has improved the
fit. AIC and MSE values are also smaller for the models after accounting for AR terms (Table 8).
Table 8. Model assessment based on adjusted R2, AIC, and MSE
           Before accounting AR terms        After accounting AR terms
Model      adj. R2    AIC       MSE          adj. R2    AIC       MSE
Model 1    0.318      229.76    451.50       0.682      204.23    208.17
Model 2    -0.026     227.38    410.44       0.411      183.78    189.95
Model 3    0.750      192.77    102.82       0.754      185.88    96.89
Model 4    0.321      193.54    106.02       0.321      193.54    106.02
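The adjusted R2 values in Table 8 can be cross-checked against the overall F-statistics reported in Tables 5 and 6, using the standard identities R2 = F·df1 / (F·df1 + df2) and adjusted R2 = 1 - (1 - R2)(n - 1)/df2, with n - 1 = df1 + df2 for a model with an intercept. The sketch below applies them to the reported F-statistics; the function name `adjusted_r2_from_f` is introduced here for illustration.

```python
def adjusted_r2_from_f(f_stat, df_model, df_resid):
    """Adjusted R-squared implied by an overall F-statistic (model with intercept)."""
    r2 = f_stat * df_model / (f_stat * df_model + df_resid)
    n_minus_1 = df_model + df_resid
    return 1 - (1 - r2) * n_minus_1 / df_resid


# (F-statistic, model DF, residual DF) as reported in Tables 5 and 6.
reported_f = {
    "Model 1 before AR": (12.19, 1, 23),
    "Model 2 before AR": (0.3809, 1, 23),
    "Model 3 before AR": (73.13, 1, 23),
    "Model 4 before AR": (12.36, 1, 23),
    "Model 1 after AR": (25.69, 2, 21),
    "Model 2 after AR": (3.79, 5, 15),
    "Model 3 after AR": (36.17, 2, 21),
}

implied = {name: adjusted_r2_from_f(*vals) for name, vals in reported_f.items()}
for name, value in implied.items():
    print(f"{name}: adjusted R2 = {value:.3f}")
```

The implied values reproduce the adjusted R2 column of Table 8 to three decimals, including the negative value for Model 2 before AR terms, which arises because its F-statistic is below 1.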


My final models in Table 6 can be used to predict the mean difference in kidney transplants in 2013. To do
this, I need forecasts of the r11 donors and of the model residuals. I first forecast the r11donor series for 2013
and then use this point forecast as input to the time series regression model. The prediction results with the
bootstrap method are shown in Table 9. I also use Monte-Carlo simulation for an improved CI, and the results
are shown in Table 10. Comparing the two sets of results, i.e. bootstrap and simulation prediction, I can see
that the Monte-Carlo simulation prediction tends to have wider confidence intervals. Also, in general the two
methods suggest that I am already doing better than my estimate, since the prediction estimate from bootstrap
and simulation is smaller (Figure 23). The bootstrap and simulation predictions do not show any deviation
from the normal distribution (Figures 24-25). The 95% confidence intervals for the predicted mean difference
in kidney transplants between UVA and MCV and between UVA and Duke do not contain zero. This means
there is indeed a difference between the two centers at the 5% level. Similar results are obtained for
non-white kidney transplants. Therefore, I can reject my null hypotheses of no mean difference in 2013 with
p<0.025.
Table 9. Prediction of mean difference at UVA and other centers in 2013 using bootstrap method
                           Point          Prediction of                 Bootstrap Prediction
                           Forecast of    mean difference   Std.        95% Percentile       95% BCa
Model                      r11donor       in 2013           Error       CI                   CI
Model 1: UVA - MCV         1264.4         -45.42            5.534       (-57.58, -35.30)     (-58.43, -36.20)
Model 2: UVA - Duke        1264.4         -32.59            5.410       (-43.78, -22.10)     (-44.04, -22.26)
Model 3: UVA - MCV NW      210.4          -50.44            4.113       (-58.30, -42.41)     (-58.07, -42.25)
Model 4: UVA - Duke NW     210.4          -32.31            3.180       (-38.28, -25.74)     (-37.77, -25.10)

Table 10. Prediction of mean difference at UVA and other centers in 2013 using simulation method
                           Point          Prediction of                 Simulation Prediction
                           Forecast of    mean difference   Std.        95% Percentile       95% BCa
Model                      r11donor       in 2013 (Median)  Error       CI                   CI
Model 1: UVA - MCV         1264.4         -44.77            10.970      (-68.14, -25.20)     (-29.93, -12.62)
Model 2: UVA - Duke        1264.4         -32.66            5.885       (-45.82, -21.97)     (-45.37, -21.90)
Model 3: UVA - MCV NW      210.4          -50.40            8.970       (-69.24, -32.84)     (-72.17, -35.75)
Model 4: UVA - Duke NW     210.4          -31.95            4.674       (-42.67, -24.50)     (-29.23, -19.60)
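A simulation-based prediction interval of the kind summarized in Table 10 can be sketched as below. This is only one simple way to set it up, and it is an assumption rather than the procedure used for the table: the coefficients are held fixed, the only randomness is Gaussian noise with a given residual standard deviation, and every numeric input is hypothetical.

```python
import random


def simulate_prediction(b0, b1, phi, x_next, last_resid, resid_sd,
                        n_sim=5000, seed=1):
    """Monte-Carlo 95% percentile interval for next year's response.

    Point prediction: b0 + b1 * x_next + phi * last_resid.
    Each simulated value adds independent Gaussian noise with sd resid_sd.
    """
    rng = random.Random(seed)
    point = b0 + b1 * x_next + phi * last_resid
    draws = sorted(point + rng.gauss(0.0, resid_sd) for _ in range(n_sim))
    lower = draws[round(0.025 * n_sim)]
    upper = draws[round(0.975 * n_sim) - 1]
    return point, (lower, upper)


# Hypothetical inputs for illustration only (not the fitted values of this study).
point, (low, high) = simulate_prediction(
    b0=50.0, b1=-0.06, phi=0.7, x_next=1250.0, last_resid=-12.0, resid_sd=10.0
)
print(f"point prediction={point:.2f}, 95% interval=({low:.2f}, {high:.2f})")
```

An interval that lies entirely below zero, as in this made-up case, is the pattern read from Tables 9 and 10 as evidence that the predicted difference is not zero. A fuller version would also propagate the uncertainty in the coefficients and in the forecast of the regional series.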


Figure 23. Forecast plot

Figure 24. Histogram and QQ-plot of bootstrap prediction


Figure 25. Histogram and QQ-plot of simulation prediction

3.2. Liver Transplants
Figure 26 plots the number of liver transplants before and after building the Roanoke center. It appears that
there are more liver transplants after the Roanoke center was built. Starting from 2005, there is an increasing
trend, although there is a drop in 2011. The right panel of Figure 26 shows that the distribution of liver
transplants is non-normal. Table 11 confirms the plot: the mean number of liver transplants is higher (68.13)
after the Roanoke center was built. The t-test indicates that there is a difference between the number of liver
transplants before and after the Roanoke center (p-value = 0.0013). I also use the Wilcoxon test because the
sample size is very small and because the data are not normally distributed (Figure B6, Appendix B). The
result is in line with the t-test: I can reject the null hypothesis that the difference is equal to zero at the 5%
level.
Figure 26. Plot and histogram of number of liver transplants


Table 11. T-test and Wilcoxon test on number of liver transplants before and after the Roanoke center
Mean before the     Mean after the      T-test                           Wilcoxon test
Roanoke center      Roanoke center      t-stat     DF        p-value     W-stat    p-value
36.35               68.13               -4.090     12.749    0.0013      12.5      0.0013
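The fractional degrees of freedom in Table 11 (12.749) are what a Welch two-sample t-test produces, where the two groups are allowed different variances. As a minimal sketch under that assumption, the statistic and the Welch-Satterthwaite degrees of freedom can be computed as follows; the two samples are hypothetical and `welch_t` is a name introduced here for illustration.

```python
import statistics


def welch_t(sample_a, sample_b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    v_a = statistics.variance(sample_a) / n_a
    v_b = statistics.variance(sample_b) / n_b
    t_stat = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / (v_a + v_b) ** 0.5
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical yearly counts before and after an intervention (not the OPTN data).
before = [20, 25, 31, 28, 35, 40, 38, 42, 45]
after = [55, 60, 72, 68, 80]

t_stat, df = welch_t(before, after)
print(f"t = {t_stat:.3f}, df = {df:.3f}")
```

The degrees of freedom always fall between the smaller group size minus one and the pooled value n_a + n_b - 2, which is why a comparison of 17 and 8 yearly observations can yield a non-integer value such as 12.749.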

The ACF plot of liver transplants indicates that the series is correlated, and the PACF plot suggests that
AR(1) may be needed. The AIC plot in Figure B7 (Appendix B) also supports this. The residuals of the AR(1)
model are no longer correlated, as shown in the bottom panel of Figure 27. The residuals are then used to test
whether there is a difference between the number of liver transplants before and after building the Roanoke
center. Table 12 shows no significant difference between the two, since the p-values from both the t-test and
the Wilcoxon test are greater than 0.05. However, this test may not be valid because it is important to control
for Region 11.
Figure 27. ACF and PACF of liver transplants series and residuals of AR(1)

Table 12. T-test and Wilcoxon test on residuals of AR(1) before and after the Roanoke center
T-test                           Wilcoxon test
t-stat     DF        p-value     W-stat    p-value
-1.812     11.750    0.0957      32        0.0523

A linear model for liver transplants with the Region 11 and Roanoke variables as predictors violates the
regression assumptions, based on the diagnostic plot in Figure 28. The residuals from the fitted linear
model are autocorrelated. I then use AR(1) to model the residuals, since the PACF
plot cuts off after lag 1 (Figure 29). Including the AR term in the linear model improves the fit,
based on the diagnostic plot in Figure 30. The residuals vs. fitted plot shows constant variance and the QQ plot
shows that the residuals are approximately normal. Moreover, the ACF and PACF plots show no significant
lags, so the residual series is no longer correlated (Figure 31). The estimated parameters and
their standard errors for the linear model and the linear model with time series are summarized in Table
13.
Figure 28. Diagnostic plot for linear regression model on number of liver transplants

Figure 29. Plot of residuals from linear regression model and its corresponding ACF and PACF plot

Figure 30. Diagnostic plot for linear regression model with time series on number of liver transplants

Figure 31. Plot of residuals from linear model with time series and its corresponding ACF and PACF plot

Table 13. Results for linear model and linear model with time series

                   Linear model (Model 5)        Linear model with time series
                   Estimate (se)     p-value     Estimate (se)     p-value
β0 (intercept)     31.482 (12.79)    0.022       42.972 (12.23)    0.002
β1 (Region 11)     0.012 (0.029)     0.690       -0.011 (0.027)    0.704
β2 (Roanoke)       26.846 (14.38)    0.075       34.011 (12.62)    0.014
AR(1) term                                       0.428 (0.178)     0.026
Adj. R2            0.391                         0.516
AIC                219.9                         203.0
MSE                280.7                         182.3

Overall model:
  Linear model (Model 5): F-statistic: 8.691 on 2 and 22 DF, p-value: 0.001653
  Linear model with time series: F-statistic: 9.18 on 3 and 20 DF, p-value: 0.0005048

Both models are significant overall at the 5% level. The linear regression model with time series performs
best on the adjusted R2, AIC, and MSE criteria: after including the AR term, the adjusted R2 is larger, the
AIC is smaller, and the MSE is lower. My final
linear model for liver transplants can be written as follows:
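$$\hat{y}_t = 42.972 - 0.011\,\mathrm{R11}_t + 34.011\,\mathrm{Roanoke}_t + 0.428\,e_{t-1}$$
where R11 is the Region 11 variable, Roanoke is the Roanoke center variable, and e_{t-1} is the lag-1 AR term.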
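
The AIC values in Table 13 can be related to the reported MSE. The sketch below assumes that the MSE is the residual sum of squares divided by n, and that n is 25 and 24 for the two models, as implied by the F-test degrees of freedom (2 and 22, 3 and 20). The function name `gaussian_aic` is illustrative.

```python
import math


def gaussian_aic(mse, n, n_coef):
    """AIC of a least-squares fit, with mse = RSS / n and n_coef regression coefficients."""
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(mse) + 1)
    return -2 * log_lik + 2 * (n_coef + 1)  # +1 for the error variance


# n = 25 and n = 24 follow from the F-test degrees of freedom in Table 13.
aic_lm = gaussian_aic(280.7, n=25, n_coef=3)  # intercept, Region 11, Roanoke
aic_ts = gaussian_aic(182.3, n=24, n_coef=4)  # plus the AR term
print(round(aic_lm, 1), round(aic_ts, 1))
```

Under these assumptions the two values round to 219.9 and 203.0, in line with Table 13. Note that the two models are fitted to different numbers of observations, which should be kept in mind when comparing their AIC.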
Bootstrapping the regression shows that the confidence interval of the Roanoke center coefficient (β2*)
does not contain zero (Table 14). I can therefore be 95% confident that the number of liver transplants
differs before and after the Roanoke center was built. Thus, I can reject my null
hypothesis at the 5% level, meaning that building the Roanoke center helped to increase the number of liver
transplants at UVA. The histogram and QQ-plot of the bootstrap estimate of β2* show no violation of
normality (Figure 32).
Table 14. Bootstrap estimate for linear model with time series

                    Original   Bias      Std. Error   95% Percentile CI     95% BCa CI
β0* (intercept)     42.972     0.128     11.071       (20.08, 63.80)        (18.71, 62.57)
β1* (Region 11)     -0.011     0.000     0.025        (-0.0582, 0.0408)     (-0.0552, 0.0431)
β2* (Roanoke)       34.011     -0.120    11.376       (11.30, 56.26)        (10.28, 55.20)
AR(1) term*         0.428      0.002     0.165        (0.1148, 0.7610)      (0.1077, 0.7586)
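
The percentile interval idea behind Table 14 can be sketched on a simplified problem. The example below bootstraps the unadjusted before/after difference in mean UVA liver transplants, which equals the coefficient of a Roanoke dummy in a regression with no other predictors. It is not the full model of Table 14 (there is no Region 11 or AR term), resampling within each period is a choice made here for illustration, and the function name and seed are hypothetical. The 1988 UVA count is missing from Table A 2 and is left out.

```python
import random
from statistics import mean

# UVA liver transplants from Table A 2 (1988 is missing there and is left out).
before = [17, 54, 51, 36, 66, 62, 54, 37, 24, 23, 23, 37, 40, 29, 28, 36]  # 1989-2004
after = [40, 58, 83, 86, 87, 78, 45, 68]                                   # 2005-2012


def bootstrap_diff_ci(x, y, n_boot=2000, level=0.95, seed=1):
    """Percentile CI for mean(y) - mean(x), resampling each group separately."""
    rng = random.Random(seed)
    diffs = sorted(
        mean(rng.choices(y, k=len(y))) - mean(rng.choices(x, k=len(x)))
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return diffs[lo_idx], diffs[hi_idx]


estimate = mean(after) - mean(before)
low, high = bootstrap_diff_ci(before, after)
print(f"difference = {estimate:.2f}, 95% percentile CI = ({low:.2f}, {high:.2f})")
```

As in Table 14, the question of interest is whether the interval excludes zero.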

Figure 32. Histogram and QQ-plot of bootstrap estimate of Roanoke center

Considering a Poisson model with the number of liver transplants at UVA as the response variable, the
dispersion of Model 6.1 is 8.32. Since this is not close to 1, I use a Quasi Poisson model
instead. In the diagnostic plot for the residuals of this model, the normal QQ plot shows that the tails
depart from the Gaussian distribution, and the residuals vs. fitted plot shows non-constant variance
(Figure 33). The PACF of the residuals in Figure 34 is insignificant after lag 1, so I add one autoregressive
term to the Poisson model (Model 6.2). The model now looks better in the diagnostic plot (Figure 35), and
the residual series is no longer correlated (Figure 36). The same holds for Model 6.3, where the Roanoke
center is included as a predictor variable (Figures 37-38). The estimated parameters from the
Poisson model, the Quasi Poisson model, and the Quasi Poisson models with time series are compared in Table 15.
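
The dispersion check can be sketched as follows. The example fits a Poisson regression of the UVA counts on a single predictor by iteratively reweighted least squares and then divides the Pearson chi-square by the residual degrees of freedom. It assumes that the Region 11 predictor is the R11donor column of Table A 2, and the function name is illustrative. Because the 1988 UVA count is missing from Table A 2 and that year is dropped, the output is not expected to reproduce 8.32 exactly.

```python
import math

# Region 11 donors and UVA liver transplants, 1989-2012, from Table A 2
# (the 1988 UVA count is missing there, so that year is left out).
r11 = [182, 253, 261, 270, 344, 386, 372, 433, 439, 498, 510, 550,
       550, 571, 569, 643, 757, 884, 833, 838, 883, 819, 781, 811]
uva = [17, 54, 51, 36, 66, 62, 54, 37, 24, 23, 23, 37,
       40, 29, 28, 36, 40, 58, 83, 86, 87, 78, 45, 68]


def fit_poisson(x, y, n_iter=50):
    """Poisson regression log(mu) = b0 + b1 * x by iteratively reweighted least squares."""
    eta = [math.log(v + 0.1) for v in y]  # start from the observed counts
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        z = [e + (v - m) / m for e, v, m in zip(eta, y, mu)]  # working response
        sw = sum(mu)  # the working weights are the fitted means
        swx = sum(m * a for m, a in zip(mu, x))
        swxx = sum(m * a * a for m, a in zip(mu, x))
        swz = sum(m * c for m, c in zip(mu, z))
        swxz = sum(m * a * c for m, a, c in zip(mu, x, z))
        b1 = (sw * swxz - swx * swz) / (sw * swxx - swx ** 2)
        b0 = (swz - b1 * swx) / sw
        eta = [b0 + b1 * a for a in x]
    return b0, b1


b0, b1 = fit_poisson(r11, uva)
mu = [math.exp(b0 + b1 * a) for a in r11]
pearson = sum((v - m) ** 2 / m for v, m in zip(uva, mu))
dispersion = pearson / (len(uva) - 2)  # Pearson chi-square / residual df
print(f"b0 = {b0:.3f}, b1 = {b1:.5f}, dispersion = {dispersion:.2f}")
```

A dispersion well above 1 means the counts vary more than a Poisson model allows, which is the motivation for the Quasi Poisson model.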
Figure 33. Diagnostic plot for Quasi Poisson model (Model 6.1)

Figure 34. Plot of residuals from Quasi Poisson model (Model 6.1) and its corresponding ACF and PACF plot

Figure 35. Diagnostic plot for Quasi Poisson model with time series (Model 6.2)

Figure 36. Plot of residuals from Quasi Poisson model with time series (Model 6.2) and its corresponding ACF and
PACF plot

Figure 37. Diagnostic plot for Quasi Poisson model with time series and Roanoke center (Model 6.3)

Figure 38. Plot of residuals from Quasi Poisson model with time series and Roanoke center (Model 6.3) and its
corresponding ACF and PACF plot

The model utility test shows that I can reject my null hypothesis and prefer the full model. The Quasi
Poisson model with the Roanoke center (Model 6.3) does not show a significant effect of building the Roanoke
center on the response variable: the Roanoke center variable has a p-value of 0.078, greater than the
0.05 significance level.
Table 15. Results for (Quasi) Poisson model

                    Poisson Model (6.1)          Quasi Poisson Model (6.1)    Quasi Poisson Model with     Quasi Poisson Model with time
                                                                              time series (6.2)            series and Roanoke variable (6.3)
                    Estimate (se)     p-value    Estimate (se)     p-value    Estimate (se)     p-value    Estimate (se)     p-value
β0 (intercept)      3.105 (0.085)     < 0.0001   3.105 (0.246)     < 0.0001   3.173 (0.197)     < 0.0001   3.616 (0.289)     < 0.0001
β1 (Region 11)      0.001 (0.0001)    < 0.0001   0.001 (0.0004)    0.003      0.001 (0.0003)    0.001      0.00005 (0.001)   0.944
Roanoke                                                                                                    0.545 (0.293)     0.078
AR(1) term                                                                    0.083 (0.024)     0.003      0.069 (0.023)     0.007
Dispersion          8.32

Model utility test:
  Poisson Model (6.1): Deviance: 96.195, DF: 1, P-value < 2.2E-16
  Quasi Poisson Model (6.1): Deviance: 96.195, DF: 1, P-value: 0.00674
  Quasi Poisson Model with time series (6.2): Deviance: 117.74, DF: 2, P-value < 6.66E-06
  Quasi Poisson Model with time series and Roanoke variable (6.3): Deviance: 132.21, DF: 3, P-value < 6.11E-07

Table 16. Test of Roanoke center

                                    Resid. Df   Resid. Dev   Df   Deviance   Pr(>Chi)
Model 6.3 without Roanoke center    21          96.734
Model 6.3                           20          82.261       1    14.473     0.06256
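
The Deviance column in Table 16 is the drop in residual deviance when the Roanoke variable is added. In a quasi-Poisson fit, that drop is typically divided by the estimated dispersion before it is compared with a chi-squared distribution. The sketch below shows the calculation for one degree of freedom. The dispersion of Model 6.3 is not reported, so the value 4 used here is purely hypothetical, and the function names are illustrative.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


def scaled_deviance_test(dev_reduced, dev_full, dispersion=1.0):
    """P-value for dropping one term, with the deviance drop divided by the dispersion."""
    return chi2_sf_1df((dev_reduced - dev_full) / dispersion)


drop = 96.734 - 82.261                                         # deviance drop in Table 16
p_unscaled = scaled_deviance_test(96.734, 82.261)              # plain Poisson, dispersion 1
p_scaled = scaled_deviance_test(96.734, 82.261, dispersion=4)  # hypothetical dispersion
print(f"drop = {drop:.3f}, unscaled p = {p_unscaled:.5f}, scaled p = {p_scaled:.4f}")
```

The comparison illustrates why the same deviance drop can be highly significant under a plain Poisson model but not under a quasi-Poisson model with overdispersion.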

My final Quasi Poisson model with time series for liver transplants at UVA can be expressed as follows:
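$$\log(\hat{\mu}_t) = 3.616 + 0.00005\,\mathrm{R11}_t + 0.545\,\mathrm{Roanoke}_t + 0.069\,e_{t-1}$$
where the mean number of liver transplants is modeled on the log scale, R11 is the Region 11 variable, Roanoke is the Roanoke center variable, and e_{t-1} is the lag-1 AR term.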
The model is significant overall at the 5% level. A chi-squared test of the Roanoke center variable
(Table 16) confirms that Roanoke is not a significant factor and can be dropped from the model. In other
words, I cannot reject my null hypothesis that building the Roanoke center does not increase the number of
liver transplants at UVA at the 5% level. This contradicts the result obtained from the linear model with
time series.
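
Part of the contrast between the two models is one of scale. The linear model estimates an additive change (34.011 transplants), while a Poisson model with a log link estimates a multiplicative one. The sketch below converts the Model 6.3 Roanoke estimate from Table 15 into a rate ratio with an approximate 95% interval. The normal approximation with 1.96 is an assumption made here for simplicity; a t-based interval for a quasi-Poisson fit would be somewhat wider.

```python
import math

coef, se = 0.545, 0.293  # Roanoke estimate and standard error, Model 6.3 (Table 15)

rate_ratio = math.exp(coef)
# Normal-approximation 95% interval on the log scale, then exponentiated.
lower = math.exp(coef - 1.96 * se)
upper = math.exp(coef + 1.96 * se)
print(f"rate ratio = {rate_ratio:.2f}, approx. 95% CI = ({lower:.2f}, {upper:.2f})")
```

A rate ratio interval that includes 1 corresponds to a coefficient that is not significant at the 5% level, consistent with the p-value of 0.078.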

4. Recommendation
It is evident that a time series linear regression model fits the difference in kidney transplants between
UVA and MCV, and between UVA and Duke, better than a linear model without time series, based on the adjusted
R2, AIC, and MSE criteria. The final model selected is significant overall at the 5% level. Applying the
bootstrap and Monte Carlo simulation to the time series model's prediction of the difference in the number of
kidney transplants, I get a 95% confidence interval for the 2013 prediction that is negative and does not
contain zero. This means that I can be 95% confident that there is indeed a difference between the two
centers in 2013. The negative confidence interval indicates that the predicted number of kidney transplants at
UVA, overall and for non-whites, in 2013 is less than the predicted number of kidney transplants at MCV and
Duke. A classical paired t-test comparing kidney transplants of non-whites at UVA and the other two centers
is also significant at the 5% level. This tells me that the UVA center needs to do better
at recruiting people overall and at recruiting people from other ethnicities (non-whites).
For liver transplants, two models are considered for the number of liver transplants at UVA: a linear
model and a Poisson model. As in the previous analysis, including AR terms improves the
fitted model. The two models appear to contradict each other on whether the number
of liver transplants differs before and after building the Roanoke center. Based on the linear model with
time series, the p-value of the Roanoke variable is 0.014, which is less than 0.05. With this model, I can
reject my null hypothesis that building the Roanoke center does not increase the number of liver transplants
at the 5% level. The bootstrap for the time series model confirms this, since the 95% confidence interval of
the Roanoke center coefficient does not contain zero. Meanwhile, based on the Poisson model with time series,
the Roanoke variable is not significant at the 5% level and should be removed from the model based on the
chi-squared test. Because the two models disagree, I cannot draw a firm conclusion
about the effect of the Roanoke center on the number of liver transplants at UVA. Therefore, I suggest
doing more research on liver transplants. It may be informative to collect data separately for UVA
Charlottesville and the UVA Roanoke center.

5. References
[1] D. E. Brown and L. Barnes, "Project 3: Design Improvements for the UVA Transplant Center,"
November 2013, assignment in class SYS 4021.
[2] D. E. Brown and L. Barnes, "Project 3: Design Improvements for the UVA Transplant Center template,"
November 2013, assignment in class SYS 4021.
[3] OPTN: Organ Procurement and Transplantation Network. http://optn.transplant.hrsa.gov.
[4] ScholarlyEditions, Issues in Neurology Research and Practice: 2011 Edition, 2012, ScholarlyEditions,
Atlanta, Georgia.
[5] University of Virginia School of Law, Hospitals, Clinics, and Outpatient Services, [accessed on
12/05/2013], http://www.law.virginia.edu/html/insider/health_hospitals.htm

Appendix A
Table A 1. Number of kidney transplants from 1988 to 2013

Year   r11donor   UVA   MCV   Duke   r11donor      UVA        MCV        Duke
                                     W      NW     W    NW    W    NW    W    NW
1988   456        24    41    61     254    46     20         24   17    39   22
1989   493        39    47    49     265    42     28   11    20   27    36   13
1990   618        56    57    43     363    53     43   13    22   35    28   15
1991   638        54    34    62     344    57     43   11    14   20    40   22
1992   625        53    38    47     325    62     48         14   24    23   24
1993   659        43    43    47     352    83     32   11    15   28    30   17
1994   720        34    37    68     390    73     24   10    13   24    47   21
1995   719        53    29    59     373    72     42   10    15   14    36   23
1996   806        68    38    72     411    81     54   14    11   27    30   42
1997   838        74    53    78     402    94     54   20    18   35    38   40
1998   889        69    53    70     408    91     49   20    21   32    27   43
1999   937        63    52    70     418    101    46   17    17   35    33   37
2000   949        55    64    86     451    93     44   11    25   39    42   44
2001   1005       58    68    111    432    111    45   12    22   46    67   44
2002   1049       63    82    96     456    109    50   13    33   49    53   43
2003   1105       68    99    121    434    126    52   16    34   65    69   52
2004   1166       73    96    119    501    135    52   21    27   69    60   59
2005   1284       94    112   100    549    167    73   21    26   86    51   49
2006   1337       107   107   95     653    203    73   33    38   69    42   53
2007   1268       102   106   84     611    209    75   27    36   70    40   44
2008   1237       103   108   108    615    206    74   29    32   76    43   65
2009   1281       79    137   105    606    251    55   24    43   94    41   64
2010   1255       69    130   104    608    254    45   24    35   95    37   67
2011   1275       76    124   115    600    251    52   24    48   76    43   72
2012   1304       68    136   74     687    219    42   26    41   95    41   33
2013   791        44    87    56     401    155    26   18    24   63    23   33

W: number of kidney transplants of white people
NW: number of kidney transplants of non-white people

Table A 2. Number of liver transplants from 1988 to 2013

Year   R11donor   UVA   MCV   Duke
1988   148              21    11
1989   182        17    18    22
1990   253        54    16    31
1991   261        51    27    34
1992   270        36    31    33
1993   344        66    37    21
1994   386        62    33    38
1995   372        54    39    37
1996   433        37    66    37
1997   439        24    60    32
1998   498        23    53    48
1999   510        23    60    25
2000   550        37    45    34
2001   550        40    46    36
2002   571        29    46    38
2003   569        28    57    35
2004   643        36    57    41
2005   757        40    54    41
2006   884        58    55    46
2007   833        83    47    30
2008   838        86    54    44
2009   883        87    55    57
2010   819        78    48    51
2011   781        45    46    63
2012   811        68    60    67
2013   523        44    36    38

Appendix B
Figure B 1. Histogram and QQ-Plot on mean difference of kidney transplants at UVA and other centers

Figure B 2. Histogram and QQ-plot of bootstrap estimate of UVA - MCV

Figure B 3. Histogram and QQ-plot of bootstrap estimate of UVA - Duke

Figure B 4. Histogram and QQ-plot of bootstrap estimate of UVA - MCV of non-whites

Figure B 5. Histogram and QQ-plot of bootstrap estimate of UVA - Duke of non-whites

Figure B 6. Histogram and QQ plot of number of liver transplants before and after the Roanoke center

Figure B 7. AIC plot for AR model
