
JACC: CARDIOVASCULAR IMAGING VOL. -, NO. -, 2017
© 2017 BY THE AMERICAN COLLEGE OF CARDIOLOGY FOUNDATION ISSN 1936-878X/$36.00
PUBLISHED BY ELSEVIER https://doi.org/10.1016/j.jcmg.2017.07.024

Prognostic Value of Combined Clinical and Myocardial Perfusion Imaging Data Using
Machine Learning
Julian Betancur, PHD,a Yuka Otaki, MD,a Manish Motwani, MB, CHB, PHD,a Mathews B. Fish, MD,b
Mark Lemley, CNMT,b Damini Dey, PHD,a Heidi Gransar, MS,a Balaji Tamarappoo, MD, PHD,a Guido Germano, PHD,a
Tali Sharir, MD,c Daniel S. Berman, MD,a Piotr J. Slomka, PHDa

ABSTRACT

OBJECTIVES This study evaluated the added predictive value of combining clinical information and myocardial
perfusion single-photon emission computed tomography (SPECT) imaging (MPI) data using machine learning (ML) to
predict major adverse cardiac events (MACE).

BACKGROUND Traditionally, prognostication by MPI has relied on visual or quantitative analysis of images
without objective consideration of the clinical data. ML permits a large number of variables to be considered in
combination and at a level of complexity beyond the human clinical reader.

METHODS A total of 2,619 consecutive patients (48% men; 62 ± 13 years of age) who underwent exercise (38%) or
pharmacological stress (62%) with high-speed SPECT MPI were monitored for MACE. Twenty-eight clinical variables, 17
stress test variables, and 25 imaging variables (including total perfusion deficit [TPD]) were recorded. Areas under
the receiver-operating characteristic curve (AUC) for MACE prediction were compared among: 1) ML with all available
data (ML-combined); 2) ML with only imaging data (ML-imaging); 3) 5-point scale visual diagnosis (physician [MD]
diagnosis); and 4) automated quantitative imaging analysis (stress TPD and ischemic TPD). ML involved automated
variable selection by information gain ranking, model building with a boosted ensemble algorithm, and 10-fold stratified
cross validation.

RESULTS During follow-up (3.2 ± 0.6 years), 239 patients (9.1%) had MACE. MACE prediction was significantly higher for
ML-combined than ML-imaging (AUC: 0.81 vs. 0.78; p < 0.01). ML-combined also had higher predictive accuracy compared with
MD diagnosis, automated stress TPD, and automated ischemic TPD (AUC: 0.81 vs. 0.65 vs. 0.73 vs. 0.71, respectively; p < 0.01
for all). Risk reclassification for ML-combined compared with visual MD diagnosis was 26% (p < 0.001).

CONCLUSIONS ML combined with both clinical and imaging data variables was found to have high predictive
accuracy for 3-year risk of MACE and was superior to existing visual or automated perfusion assessments. ML could
allow integration of clinical and imaging data for personalized MACE risk computations in patients undergoing SPECT
MPI. (J Am Coll Cardiol Img 2017;-:-–-) © 2017 by the American College of Cardiology Foundation.

Traditionally, the prognostic value of myocardial perfusion single-photon emission computed
tomography (SPECT) imaging (MPI) has been studied with semiquantitative visual and quantitative
analysis of image data (1–3). A number of previous studies have shown that clinical demographics,
functional parameters, and hemodynamic and stress results all affect the evaluation of

From the aDepartments of Imaging, Medicine, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California;
bOregon Heart and Vascular Institute, Sacred Heart Medical Center, Springfield, Oregon; and the cDepartment of Nuclear
Cardiology, Assuta Medical Centers, Tel Aviv, Israel. This research was supported in part by grant R01HL089765 from the
National Heart, Lung, and Blood Institute/National Institutes of Health (PI: Piotr Slomka). The content is solely the responsibility
of the authors and does not necessarily represent the official views of the National Institutes of Health. Drs. Betancur and Otaki
contributed equally to this work. Drs. Berman, Germano, and Slomka have received royalties from Cedars-Sinai Medical Center.
All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.

Manuscript received March 27, 2017; revised manuscript received July 5, 2017, accepted July 5, 2017.

MPI (4–7). This integration of clinical information and imaging data into a final impression is
currently performed subjectively by physicians when they assess the MPI test, often in a
nonstandardized manner.

Machine learning (ML) is a field of computer science that uses computer algorithms to identify
patterns in large multivariable datasets and can be used to predict outcomes. In recent years, ML
has been used for prediction and decision-making in a multitude of disciplines, including internet
search engines, customized advertising, natural language processing, finance trending, and
robotics (8–10). For MPI, a large number of parameters, including clinical variables, stress test
results, and imaging data variables, could be considered by ML for outcome prediction. We
evaluated the benefits of combining all of these variables using an ML algorithm to predict major
adverse cardiac events (MACE) (8). ML prediction using combined data was also compared with
physician (MD) diagnosis (based on a visual read with awareness of clinical data) and with
automated perfusion quantification indexes (stress and ischemic total perfusion deficit [TPD]).

ABBREVIATIONS AND ACRONYMS
CAD = coronary artery disease
CT = computed tomography
MACE = major adverse cardiac events
MD = physician
ML = machine learning
MPI = myocardial perfusion imaging
SPECT = single-photon emission computed tomography
TID = transient ischemic dilation
TPD = total perfusion deficit

METHODS

STUDY POPULATION. A total of 2,689 consecutive patients who were referred for clinically indicated
exercise or pharmacological stress MPI at Sacred Heart Medical Center between January 2010 and
December 2011 were included. The study was approved by the institutional review board, including
a waiver for informed consent. After excluding 70 patients with early revascularization within
90 days, 2,619 patients were included for further analysis.

CLINICAL DATA. Clinical data were derived from patients’ medical records and included age, sex,
and risk factors. Recorded risk factors were hypertension, diabetes mellitus, dyslipidemia,
smoking (defined as current smoking or cessation within 3 months of testing), and family history
of premature clinical coronary artery disease (CAD). Presence and type of chest pain, and
shortness of breath, were assessed by the stress testing MD.

MPI AND STRESS PROTOCOLS. Resting and/or stress 1-day technetium-99m sestamibi imaging was
performed using a high-efficiency, solid-state SPECT scanner (D-SPECT, Spectrum-Dynamics, Haifa,
Israel) (11). Weight-adjusted doses of 353 ± 151 MBq (9.5 ± 4.1 mCi) for rest and 1,252 ± 196 MBq
(34 ± 5.3 mCi) for stress (recommended by vendor) were used (12), equivalent to a total average
effective dose of 10.7 mSv based on the latest International Commission on Radiological
Protection 103 estimates (13). Patients underwent symptom-limited Bruce protocol exercise testing
(38%) or pharmacological stress (62%; regadenoson 0.4 mg) with injection at peak stress. Resting
image acquisition was performed supine with 6- to 10-min acquisition time, based on patient body
mass index. Upright and supine stress imaging (4 to 6 min) began 15 to 30 min after stress.
Transaxial images were generated from list mode data by maximum likelihood expectation
maximization reconstruction (11). No attenuation or scatter correction was applied. Images were
automatically re-oriented into short-axis, and vertical and horizontal long-axis slices with
Quantitative Perfusion SPECT (QPS)/Quantitative Gated SPECT (QGS) software (Cedars-Sinai Medical
Center, Los Angeles, California).

VISUAL PERFUSION ANALYSIS. The visual analysis was done by multiple MDs who were aware of patient
clinical information and quantitative assessment at the time of the study. Reader scan
interpretation (MD diagnosis) was scored as 0 = normal, 1 = equivocal, 2 = probably abnormal,
3 = abnormal, or 4 = definitely abnormal. A 3-step scale probability of CAD was also reported
(0 = low, 1 = intermediate, 2 = high).

AUTOMATED QUANTIFICATION. All image datasets were de-identified, transferred to Cedars-Sinai
Medical Center, and quality control was checked by a single experienced core laboratory
technologist without knowledge of clinical data. Automatically generated myocardial contours by
QPS/QGS software were evaluated, and when necessary, contours were adjusted to correspond to the
myocardium. Upright and supine images were quantified as previously described (14). We used
automatic TPD, a quantitative perfusion variable that reflects a combination of defect extent and
severity, and produces stress, rest, and ischemic (stress – rest) TPD values. Ejection fraction,
and systolic and diastolic volumes at stress and rest were quantified separately for each
acquisition using standard QGS software with 8 frames per cardiac cycle. Transient ischemic
dilation (TID) was computed as previously described (15). Counts in the left ventricle were
obtained by planar projections of the left ventricular region defined during the first step of
data reconstruction (16).

OUTCOME AND FOLLOW-UP DATA COLLECTION. The endpoint was MACE, which consisted of all-cause
mortality, nonfatal myocardial infarction, unstable angina, or late coronary revascularization
(percutaneous coronary intervention or coronary artery bypass grafting). All-cause mortality was

FIGURE 1 Machine Learning Pathway

[Flow diagram. Data: 2,619 cases with imaging, stress test, and clinical data. Stratified 10-fold
cross validation, repeated × 10, with 90% for training and a 10% holdout for testing: variable
selection by information gain ratio ranking, then model building with LogitBoost. MACE probability
scores are derived for the entire population from the 10 models (models 1, 2, 3, ..., 10), and the
overall prediction is estimated by combining all probability scores.]

The overall population is divided into 10 equally sized groups (1, 2, ..., 10) with approximately the same incidence of major adverse cardiac
events (MACE) (stratified). Of the 10 groups, 1 (10%) is retained as the test set (holdout set), and the others (90%) are used as the training
set. To estimate the machine learning (ML) performance for all the data, the cross-validation procedure loops 10 times over these groups, each
time performing variable selection and model building with a different training set, and then testing this model on the unseen test set.
Therefore, each data point is used once for testing and 9 times for training, and the result is 10 experimental LogitBoost models trained on
90% fractions. Once finished, the estimates of MACE probability for each of the 10 holdout sets derived by the corresponding 10 models are
concatenated to provide an overall expected estimate of ML performance with unseen (holdout) data.
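
The procedure in Figure 1 can be sketched in a few lines. This is a minimal illustration in plain Python, not the WEKA implementation used in the study: `stratified_folds`, `cross_validate`, and the `fit`/`predict` callables are hypothetical names introduced here, and the toy labels only mirror the cohort size and event count (2,619 patients, 239 events). The point it shows is that variable selection and model building happen inside `fit`, on training indices only, so every prediction is made for a patient the model has not seen.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds with a similar event rate in each fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for cls in sorted(by_class):
        idxs = by_class[cls]
        rng.shuffle(idxs)
        # Deal indices round-robin, continuing where the previous class stopped.
        for j, idx in enumerate(idxs):
            folds[(offset + j) % k].append(idx)
        offset += len(idxs)
    return folds


def cross_validate(labels, fit, predict, k=10, seed=0):
    """Return one held-out prediction per sample; `fit` only sees training indices."""
    folds = stratified_folds(labels, k, seed)
    scores = [None] * len(labels)
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        model = fit(train)  # variable selection + model building belong here
        for i in test:
            scores[i] = predict(model, i)
    return scores


# Toy labels mirroring the cohort: 239 events among 2,619 patients.
labels = [1] * 239 + [0] * 2380
folds = stratified_folds(labels)
```

Concatenating the held-out scores, as `cross_validate` does, corresponds to the last step of the caption: the 10 holdout predictions are pooled into a single estimate of performance on unseen data.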

determined from the Social Security Death Index and combined with MACE obtained from the hospital
electronic medical records, including all clinics, as well as cardiology group and hospital
visits. Nonfatal myocardial infarction was defined based on the criteria of hospital admission for
chest pain, elevated cardiac enzyme levels, and typical changes on the electrocardiogram (17). The
first event in each patient was used as the outcome. Patients with early revascularization
≤90 days after MPI were excluded.

MACHINE LEARNING. Figure 1 illustrates the ML pathway, which involved automated variable
selection by information gain ratio ranking and model building with a boosted ensemble algorithm,
both worked into a stratified 10-fold cross validation procedure, as reported in our previous
work (8). ML techniques were implemented in the open-source Waikato Environment for Knowledge
Analysis (WEKA) platform 3.8.0 (University of Waikato, Hamilton, New Zealand) (18).

VARIABLE SELECTION. Twenty-five imaging data variables, 17 stress test variables, and 28 clinical
variables were available for variable selection by the information gain ratio (18). Information
gain ratio offers a measure of the effectiveness of a variable in classifying the training data.
Only variables that resulted in an information gain ratio >0 were subsequently used in model
building (Figure 2B).

MODEL BUILDING. Predictive classifiers for MACE scoring were developed by an ensemble
(“boosting”) LogitBoost algorithm. The principle behind ML ensemble boosting is to combine the
prediction of simple classifiers with weak performances to create a single strong classifier (19).
These weak predictions are then combined in an ensemble (weighted majority voting) to derive an
overall classifier, the ML score.

CROSS VALIDATION. The performance and general error estimation of the entire ML process (variable
selection and LogitBoost) were assessed using stratified 10-fold cross validation (Figure 1),
which is currently the preferred validation technique in machine learning (18). The main
advantages of this technique, compared with the conventional split-sample approach, are: 1) it
reduces the variance in prediction error; 2) it maximizes the use of data for

FIGURE 2 Variable Selection

[Bar charts listing the candidate variables in ranked order. Panel A: Information Gain Ratio
(x-axis 0 to 0.1). Panel B: AUC (x-axis 0.48 to 0.78). Legend: information gain ratio > 0;
information gain ratio = 0.]

(A) Twenty-five imaging data (gray bars: 22 selected), 17 stress test (pink bars: 8 selected) and 28 clinical (green bars: 17 selected) variables ranked by their mean
(95% confidence interval [CI]) information gain ratio within 10-fold cross-validation. (B) Same variables ranked by their individual area under the receiver-operating
characteristic curve (AUC) [95% CI] for MACE prediction. Variables selected by information gain ratio are shown as solid bars. Nonselected variables are shown by open
bars. BP = blood pressure; beats/min = beats per minute; CABG = coronary artery bypass graft; ECG = electrocardiography; EDV = end-diastolic volume; EF =
ejection fraction; ESV = end-systolic volume; LV = left ventricular; MET = metabolic equivalent; PCI = percutaneous coronary intervention; TAVR = transcatheter
aortic valve replacement; TPD = total perfusion deficit; other abbreviations as in Figure 1.
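
The two rankings in Figure 2 rest on two simple quantities, sketched below in plain Python. This is an illustration under stated assumptions rather than the WEKA code used in the study: `entropy`, `gain_ratio`, and `auc` are names introduced here, `gain_ratio` expects a categorical variable (a continuous variable would first need to be binned, which is assumed to happen elsewhere), and `auc` uses the pairwise definition of the area under the receiver-operating characteristic curve with ties counted as one-half.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, outcome):
    """Information gain about `outcome` from `feature`, divided by the feature's entropy."""
    n = len(outcome)
    groups = {}
    for f, y in zip(feature, outcome):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(outcome) - conditional
    split_info = entropy(feature)
    return gain / split_info if split_info > 0 else 0.0


def auc(scores, outcome):
    """Chance that a random event case scores above a random non-event case (ties = 0.5)."""
    pos = [s for s, y in zip(scores, outcome) if y == 1]
    neg = [s for s, y in zip(scores, outcome) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy examples (not study data).
perfect = gain_ratio([0, 0, 1, 1], [0, 0, 1, 1])        # variable separates the outcome
uninformative = gain_ratio([0, 1, 0, 1], [0, 0, 1, 1])  # variable carries no information
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

The two measures answer different questions, which is consistent with the observation that a variable can rank high by AUC in panel B yet have an information gain ratio of 0 in panel A.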

both training and validation, without overfitting or overlap between the test and validation
data; and 3) it guards against testing hypotheses suggested by arbitrarily split data (20).

STATISTICAL ANALYSIS. Using receiver-operating characteristic analysis and pairwise comparisons
according to DeLong et al. (21), the predictive accuracy for MACE was compared among: 1) ML with all

available data (ML-combined); 2) ML with only imaging data (ML-imaging); 3) a 5-point scale visual
diagnosis (MD diagnosis); and 4) automated quantitative imaging analysis (stress TPD and ischemic
TPD). Brier score and Pearson correlation were computed between predicted and observed MACE (22).
For all analyses, MACE-free patients were censored to their follow-up date. To define the low-risk
limit for MACE prediction by ML-combined, we used clinical diagnosis = 0, which is considered as
definitely normal scans, as a well-established, low-risk limit. Then, low-risk cutoffs for
ML-combined and TPD were calculated for approximately the same population percentile as for the
MD diagnosis = 0 (87th percentile). Subsequently, improvement in risk classification using
ML-combined compared with the MD diagnosis was assessed with a 5-category reclassification.
Statistical calculations were performed using R software version 3.3.1 (R Foundation, Vienna,
Austria) and PredictABEL package (R Foundation) for the reclassification.

RESULTS

STUDY POPULATION AND OUTCOME. Table 1 shows the baseline clinical characteristics of the studied
population. When the first event per patient was considered, there were 239 (9.1%) 3-year MACE,
with 150 (5.7%) all-cause deaths, 11 (0.4%) nonfatal MIs, 24 (0.9%) unstable anginas, and 54
(2.1%) late target revascularizations. The observed annual MACE rate was 3%.

TABLE 1 Patient Characteristics

                            All Patients    MACE+         MACE−
                            (N = 2,619)     (n = 239)     (n = 2,380)    p Value
Age, yrs                    62 ± 13         70 ± 12       62 ± 12        <0.0001
Men                         1,247 (48)      128 (54)      1,119 (47)     0.054
Body mass index, kg/m2      31 ± 8          30 ± 9        32 ± 8         <0.01
CAD risk factors
  Diabetes                  691 (26)        100 (42)      591 (25)       <0.001
  Hypercholesterolemia      1,491 (57)      141 (59)      1,350 (57)     0.5
  Hypertension              1,692 (65)      181 (76)      1,511 (63)     <0.001
  Family history of CAD     1,006 (38)      66 (28)       940 (40)       <0.001
  Smoker                    662 (25)        65 (27)       597 (25)       0.474
  Typical angina            301 (11)        38 (16)       263 (11)       <0.05
History of CAD
  Previous MI               130 (5)         31 (13)       99 (4)         <0.001
  Previous PCI              231 (9)         52 (22)       179 (8)        <0.001
  Previous CABG             172 (7)         36 (15)       136 (6)        <0.001

Values are mean ± SD or n (%).
CABG = coronary artery bypass graft; CAD = coronary artery disease; MACE = major adverse cardiac event;
MI = myocardial infarction; PCI = percutaneous coronary intervention.

HEMODYNAMIC AND MPI RESULTS. Table 2 shows hemodynamic and stress results separately for
pharmacological stress and for exercise stress. The frequency of exercise stress was lower among
patients with MACE compared with those without MACE (9% with MACE vs. 41% without MACE;
p < 0.0001). Table 3 shows quantitative and visual MPI results. For the quantitative evaluation of
perfusion and function, 9.8% of myocardial contours were corrected by the core laboratory
technologist.

TABLE 2 Pharmacologic and Exercise Stress Test Results

Pharmacologic stress (n = 1,614)             MACE+ (n = 217)    MACE− (n = 1,397)    p Value
Resting heart rate, beats/min                75 ± 14            73 ± 13              <0.05
Peak heart rate at stress, beats/min         95 ± 19            103 ± 20             <0.0001
Resting SBP, mm Hg                           132 ± 22           132 ± 20             0.577
Resting DBP, mm Hg                           73 ± 12            77 ± 12              <0.001
Peak SBP, mm Hg                              131 ± 27           143 ± 27             <0.0001
Peak DBP, mm Hg                              70 ± 12            76 ± 13              <0.0001

Exercise stress (n = 1,005)                  MACE+ (n = 22)     MACE− (n = 983)      p Value
Resting heart rate, beats/min                81 ± 13            76 ± 13              0.072
Peak heart rate at stress, beats/min         142 ± 13           148 ± 13             <0.05
Resting SBP, mm Hg                           128 ± 19           126 ± 17             0.647
Resting DBP, mm Hg                           74 ± 9             79 ± 10              <0.05
Peak SBP, mm Hg                              179 ± 27           181 ± 25             0.703
Peak DBP, mm Hg                              84 ± 10            83 ± 12              0.700
Ischemic ST change during exercise stress    7 (32)             175 (18)             0.091

Values are mean ± SD or n (%).
DBP = diastolic blood pressure; SBP = systolic blood pressure; other abbreviation as in Table 1.

VARIABLE SELECTION. Figure 2A shows the average information gain ratio within 10-fold cross
validation. On average, 22 imaging data, 8 stress test, and 17 clinical variables were selected.
All perfusion and functional variables from MPI had an information gain ratio >0, including left
ventricular counts and injected dose. The top 9 selected variables were all imaging data variables.

MACE PREDICTION BY INDIVIDUAL VARIABLES. Figure 2B shows the area under the receiver-operating
characteristic curve (AUC) for the prediction of MACE by each individual variable. Stress TPD,
stress heart rate, ischemic TPD, stress systolic blood pressure, resting TPD, and age were the
best individual predictors. Compared with the information gain ratio in Figure 2A, there were some
variables for which individual AUCs were predictive, yet they did not offer incremental
information gain for predicting MACE (white bars). Furthermore, the variables with highest AUCs
did not always have the highest information gain ratio.

MACE PREDICTION BY COMBINED VARIABLES. MACE prediction was significantly higher for ML-combined

than ML-imaging (AUC: 0.81, 95% confidence interval [CI]: 0.78 to 0.83 vs. AUC: 0.78, 95% CI: 0.75
to 0.81; p < 0.01). ML-combined also had a higher AUC compared with the AUCs of automated stress
TPD and automated ischemic TPD (Figure 3), and compared with the AUCs for probability of CAD
(0.64; 95% CI: 0.61 to 0.66) or MD diagnosis (0.65; 95% CI: 0.62 to 0.68), as reported by the MD
(all p < 0.001). When stress test variables were added to image variables for ML integration, AUC
did not change significantly (AUC: 0.79, 95% CI: 0.76 to 0.82 vs. AUC: 0.78, 95% CI: 0.75 to 0.81;
p = 0.4).

The Brier score for ML-combined prediction of MACE was 0.07, which indicated good calibration
between ML scores (estimated predicted risk) and observed 3-year risk. The plot of observed MACE
versus predicted MACE over percentiles of ML-combined risk is shown in Figure 4. High correlation
of ML-combined predicted MACE versus observed MACE was found (r = 0.97; p < 0.0001).

RISK RE-CATEGORIZATION. To allow categorical comparison, a low-risk, ML-combined score (<0.15) was
determined as the cutoff that defined the same percentile as visual MD diagnosis = 0 (87th
percentile). This percentile also approximately corresponded to the stress TPD threshold of
<5% (14). For patients within the 95th to 100th percentile of the ML-combined score, 19% (25 of
131) of patients had a normal MD diagnosis and 10% (13 of 131) had stress TPD of <5% (Figure 5).
Finally, a 5-category risk reclassification was 26% for ML-combined scores compared with a
5-category MD diagnosis (p < 0.001) (Table 4), with 30.5% improved identification of patients with
MACE and 5% decreased identification of MACE-free patients (all p < 0.001).

TABLE 3 Perfusion and Functional Results

                                                 MACE+          MACE−
                                                 (n = 239)      (n = 2,380)     p Value
MD-diagnosis: normal                             142 (59)       2,138 (90)      <0.001
MD-diagnosis: abnormal or definitely abnormal    89 (37)        217 (9)         <0.001
Stress TPD, %                                    9 ± 11         3 ± 5           <0.0001
Ischemic TPD, %                                  4 ± 4          2 ± 3           <0.0001
Resting TPD, %                                   5 ± 9          1 ± 3           <0.0001
Stress EDV, ml                                   112 ± 57       91 ± 36         <0.0001
Stress ESV, ml                                   96 ± 57        73 ± 33         <0.0001
Stress EF, %                                     46 ± 9         49 ± 3          <0.0001
Rest EDV, ml                                     105 ± 52       89 ± 34         <0.0001
Rest ESV, ml                                     89 ± 52        71 ± 31         <0.0001
Rest EF, %                                       46 ± 8         49 ± 3          <0.0001
Transient ischemic dilation                      1.09 ± 0.16    1.03 ± 0.14     <0.0001

Values are n (%) or mean ± SD.
EDV = end-diastolic volume; EF = ejection fraction; ESV = end-systolic volume; MD = physician; TPD = total
perfusion deficit; other abbreviation as in Table 1.

FIGURE 3 ROC Curves for Prediction of 3-Year MACE (239 of 2,619 Events)

[ROC plot: sensitivity (y-axis, 0.0 to 1.0) versus specificity (x-axis, 1.0 to 0.0). Legend with
AUC values: ML-combined 0.81; ML-imaging 0.78; Stress TPD 0.73; Ischemic TPD 0.72. Additional
panel label: AUC (bars) and 95% CI (whiskers).]

ML combining all variables using variable selection and LogitBoost algorithm
(ML-combined) had a significantly higher AUC for MACE prediction than ML combining
imaging data variables only (ML-imaging), and standard image analysis. *p < 0.01;
**p < 0.001, in AUC comparison by DeLong test. ROC = receiver-operating
characteristic; other abbreviations as in Figures 1 and 2.

DISCUSSION

We developed and validated a highly accurate, personalized method for post-MPI risk computation
that used ML. This approach allowed the combination of all available clinical, stress test, and
automatically derived imaging data variables without a priori assumptions about the influence or
weighting of individual factors, or how they may interact. The method was used to evaluate the
added value of clinical and stress test information for the prediction of MACE after MPI. The
observed 3% annual MACE rate was similar to previous studies that assessed the prognostic value of
SPECT MPI (4). The only human input required for the derivation of the ML-combined MACE risk score
was the collation of clinical data from health records (conceivably a task fulfilled by advanced
text mining in the future) and the adjustment of contours by the technologists in a minority
(<10%) of the cases. Figure 6 illustrates how the proposed ML model would allow prediction of the
risk of MACE for an individual unknown case by

FIGURE 4 Observed Versus Predicted 3-Year Risk of MACE

[Plot: x-axis, percentile of ML score (0 to 100 in steps of 5); left y-axis, observed proportion
of events (0% to 60%); right y-axis, predicted ML score (0.0 to 0.6). Legend: Observed; Predicted.]

Observed proportion of events (pink bars) and predicted ML score (green points) grouped by every fifth percentile of risk. Abbreviations as in Figure 1.
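
The calibration view in Figure 4 and the Brier score quoted in the Results can both be computed from a list of predicted risks and 0/1 outcomes. The sketch below is plain Python with toy data, not the authors' R code: `brier_score` and `calibration_bins` are names introduced here, and equal-count binning is an assumption about how "every fifth percentile" groups would be formed.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(observed)


def calibration_bins(predicted, observed, n_bins=20):
    """Sort by predicted risk, split into equal-count bins, and return
    (mean predicted risk, observed event rate) for each bin."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    n = len(order)
    rows = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if idx:
            mean_pred = sum(predicted[i] for i in idx) / len(idx)
            event_rate = sum(observed[i] for i in idx) / len(idx)
            rows.append((mean_pred, event_rate))
    return rows


# Toy data (not study data): a perfectly calibrated two-level score.
predicted = [0.1] * 10 + [0.9] * 10
observed = [0] * 9 + [1] + [1] * 9 + [0]
toy_brier = brier_score(predicted, observed)
toy_rows = calibration_bins(predicted, observed, n_bins=2)
```

With `n_bins=20` the rows correspond to 5-percentile groups, as in Figure 4. For context, a model that assigns every patient the cohort event rate p has a Brier score of p(1 − p), which is about 0.083 at the 9.1% event rate reported here.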

automatically integrating the clinical data with the imaging data.

The performance of the ML-combined score was superior to image risk metrics that are traditionally
used to study prognostic outcomes after MPI (1–7). The AUC estimate, derived in a rigorous manner
with test and training data separated within 10-fold cross validation (preventing overfitting),
was substantially higher than that for ML-imaging, as well as visual or automated MPI assessment.
Furthermore, risk reclassification analysis demonstrated that the ML-combined risk allowed better
classification of high-risk patients than visual clinical diagnosis. Risk reclassification
revealed that the ML-combined score could increase the risk score for >30% of patients with MACE
incidence, but also increased the risk score for 5% of MACE-free patients. At the same time, we
found that 19% of the patients in the highest ML-combined risk category (top 5%), with a MACE
incidence of 38%, were still read as normal scans with an MD diagnosis = 0. These observations
highlight the difficulty in finding the appropriate thresholds for the multicategory risk scores.
The low-risk threshold in this study was derived for the same population percentile as “normal”
visual scans, and subsequent higher risk thresholds were defined at 5% increments of increasing ML
risk score. Furthermore, we found that automatically derived stress and/or ischemic TPD had better
predictive value for MACE than a clinical diagnosis, which was in line with our previous reports
(9,23), but has not been previously reported in prognostic studies.

FIGURE 5 Frequency of Normal Clinical Diagnosis and Low Perfusion Scores by Predicted ML Risk Percentile

[Bar chart: x-axis, percentile of ML score (binned); y-axis, frequency (0% to 100%). Series:
Normal Clinical Diagnosis; Stress TPD < 5%. In the top percentile bin the bars are labeled 19%
(normal clinical diagnosis) and 10% (stress TPD <5%).]

The frequency of patients with normal clinical diagnosis and low automated perfusion
score (TPD <5%) across percentiles of the ML score. Abbreviations as in Figures 1 and 2.

To our knowledge, this was the first study that applied ML to predict MACE in patients who

underwent MPI. Recently, our group assessed the feasibility and accuracy of ML to predict 5-year
all-cause mortality in 10,030 patients who underwent coronary computed tomography (CT)
angiography (8). In this analysis, ML exhibited a higher AUC compared with the Framingham risk
score or visual CT severity scores alone (8). Automated processing of CT images was not used. In
contrast, the present study capitalized on established automated processing software tools that
were validated in nuclear cardiology to provide multiple imaging data variables with limited
manual interaction. The intent was to demonstrate the feasibility of edging us closer to a
completely automated computer-powered imaging analysis and risk assessment. A future direction and
potential next step will be to develop tools that are also capable of automatically extracting
clinical variables, for example, by text mining electronic health records.

TABLE 4 Risk Reclassification by ML Versus MD Diagnosis

                              ML-Boosting Risk Category
                        Low      Equivocal    Mild        Moderate    Severe
MD Diagnosis            <0.15    0.15–0.2     0.2–0.25    0.25–0.3    ≥0.3      Total
MACE (n = 239)
  Normal                99       19*          9*          7*          8*        142
  Equivocal             1†       0            1*          0*          2*        4
  Probably abnormal     2†       0†           0           1*          1*        4
  Abnormal              11†      5†           8†          7           55*       86
  Definitely abnormal   1†       1†           0†          1†          0         3
  Total                 114      25           18          16          66        239
No MACE (n = 2,380)
  Normal                1,959    95*          35*         16*         33*       2,138
  Equivocal             5†       1            0*          2*          3*        11
  Probably abnormal     8†       0†           0           3*          3*        14
  Abnormal              69†      29†          21†         23          67*       209
  Definitely abnormal   3†       0†           1†          1†          3         8
  Total                 2,044    125          57          45          109       2,380
Reclassification        26%

*Up-risking by machine learning (ML). †De-risking by ML.
Abbreviations as in Tables 1 and 3.

The ML approach provides a computational integration of all available information that is not
feasible for subjective analysis by the reporting physician. As part of the clinical
decision-making, physicians take into account clinical and stress testing data; however, this is
done subjectively without a systematic way of integrating information. Furthermore, although
including these variables as part of the MPI report is recommended by guidelines, integration of
these findings in the report is not yet part of standardized reporting guidelines (24,25).
Intuitive patient-specific weighting of all individual clinical and imaging factors for assessing
risk could not be expected to be precise, or consistent among different medical centers, whether
performed by the interpreting physician or the physician managing the patient.

Although the average patient radiation dose (10.7 mSv) used in this study was higher than those
specified in current guideline recommendations (26), the data were collected before the latest
guidelines were adopted, using the same day rest-first protocol optimized for the acquisition
speed rather than for

FIGURE 6 Illustration of Prognostic Risk Computation in an Individual Patient by the Proposed ML Model

[Flow diagram; recoverable labels: Myocardial Perfusion SPECT Imaging (Stress, Rest Scans); Image Quantification (QPS/QGS or Equivalent); Imaging Data Variables; Electronic Medical Records; Database; Stress Test and Clinical Variables; Machine Learning Model; MACE Risk Prediction; MACE Risk; Patient; Physician.]

QGS = quantitative gated single-photon emission computed tomography; QPS = quantitative perfusion single-photon emission computed tomography; SPECT = single-photon emission computed tomography; other abbreviation as in Figure 1.

the radiation dose. Furthermore, a weight-based protocol was used, and most of the patients were obese (body mass index ≥30 kg/m²). It is likely that at least a 50% lower effective radiation dose could be achieved with longer acquisition times without any effect on image quality, as previously studied (16). Further dose reductions could be achieved with stress-first and/or stress-only protocols.

IMPLICATIONS. The ability to optimally assess risk in individual patients remains a major challenge in cardiology. With MPI, visual image analysis itself is subjective, and the overall risk assessment that incorporates clinical, stress test, and imaging results is highly variable, based on physician knowledge and experience, and limited by the complexity of appropriately assigning weight to individual factors. The presented ML score provides an automated, precise, and objective risk estimate that combines imaging, clinical, and stress testing variables. The same optimal method for risk computation would be readily available to all imaging centers, including less experienced centers. The practical implementation will depend on the ability to interface the MPI reporting workstation with electronic patient records, to access the clinical variables. Such a tool could perhaps be interfaced with large registry data (e.g., the ImageGuide registry of the American Society of Nuclear Cardiology [25]), which could collect clinical variables similar to those used in this study. The implementation will depend on the availability of the interface to the electronic health records.

STUDY LIMITATIONS. This was a single-center study, and further multicenter and external validation of the derived risk score will be required. Future work should include the definition of the optimal ML threshold, to validate prospective practical clinical implementation. The sample size was modest and follow-up was only 3 years; however, all results were significant. Although training data were always separated from test data within the 10-fold cross validation, it is not yet known how well such an ML score can extrapolate among different centers, patient populations, and follow-up time. Although we included key perfusion and function imaging variables in this study, the list was not exhaustive. The derived ML score was generic and could be applied to both exercise and pharmacological stress protocols, because the ML technique uses the information about the type of test internally. However, further evaluation of ML risk stratification for MACE prediction in specific subpopulations, for example, in patients with suspected disease, patients with early revascularization, or patients undergoing adenosine protocols, may be appropriate in multicenter studies. Risk reclassification metrics have limitations such as dependence on the choice of cutoff values of the continuous probability risk score. It is likely that more appropriate threshold selection in future studies may optimize the reclassification patterns for specific clinical risks. Alternatively, the MACE risk score without any categories could also be used clinically to indicate the probability of events for a given patient. Finally, we selected a LogitBoost approach for automatic ML variable integration, as in our previous work (8), but the LogitBoost approach we used is only one of many possible ML approaches to combine multiple variables for prediction. It is possible that different approaches such as deep learning may provide better risk score derivation. However, a larger multicenter data set is required to evaluate possible advantages of other ML approaches.

CONCLUSIONS

ML combining both clinical and imaging data variables was found to have high predictive accuracy for the 3-year risk of MACE, and was superior to existing visual or automated perfusion assessments in isolation. This computational method could allow integrating the clinical data with imaging results for the optimal evaluation of MACE risk in patients undergoing MPI.

ADDRESS FOR CORRESPONDENCE: Dr. Piotr J. Slomka, Artificial Intelligence in Medicine Program, Cedars-Sinai Medical Center, 8700 Beverly Boulevard, Suite A047N, Los Angeles, California 90048. E-mail: Piotr.Slomka@cshs.org.

PERSPECTIVES

COMPETENCY IN MEDICAL KNOWLEDGE: Combining clinical and imaging information by an ML algorithm exhibited significantly better MACE prediction than using only imaging information or performing visual and automated perfusion assessment alone in SPECT MPI.

TRANSLATIONAL OUTLOOK: Adding clinical information to imaging data by ML will aid comprehensive MPI assessment to improve clinical patient management.
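As a worked check on the reclassification summary in Table 4, the cell counts can be tallied directly. The sketch below assumes that the reported 26% is a category-based net reclassification improvement (NRI), and that the five MD diagnosis categories map one-to-one, in order, onto the five ML risk categories, which matches the up-risking and de-risking markers in the table. Neither assumption is stated explicitly in this section, and the function names are illustrative only.

```python
# Counts transcribed from Table 4.
# Rows: MD diagnosis (Normal, Equivocal, Probably abnormal, Abnormal,
# Definitely abnormal). Columns: ML risk category (Low, Equivocal, Mild,
# Moderate, Severe).
MACE = [
    [99, 19, 9, 7, 8],
    [1, 0, 1, 0, 2],
    [2, 0, 0, 1, 1],
    [11, 5, 8, 7, 55],
    [1, 1, 0, 1, 0],
]
NO_MACE = [
    [1959, 95, 35, 16, 33],
    [5, 1, 0, 2, 3],
    [8, 0, 0, 3, 3],
    [69, 29, 21, 23, 67],
    [3, 0, 1, 1, 3],
]


def tally(table):
    """Return (up, down, total) patient counts for one outcome group.

    Assumption: row i and column i denote the same risk level, so cells
    above the diagonal are up-risked by ML and cells below are de-risked.
    """
    up = sum(n for i, row in enumerate(table) for j, n in enumerate(row) if j > i)
    down = sum(n for i, row in enumerate(table) for j, n in enumerate(row) if j < i)
    total = sum(sum(row) for row in table)
    return up, down, total


def net_reclassification_improvement(events, non_events):
    """Category-based NRI: net up-risking in events plus net de-risking in non-events."""
    up_e, down_e, n_e = tally(events)
    up_n, down_n, n_n = tally(non_events)
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


nri = net_reclassification_improvement(MACE, NO_MACE)
print(f"NRI = {nri:.3f}")  # prints NRI = 0.255
```

Under these assumptions the tally gives an NRI of about 0.255, consistent with the reported 26%. The net up-risking among patients with MACE (103 up, 30 down) is partly offset by net up-risking among patients without MACE (257 up, 137 down). Because every cell depends on the chosen cutoffs, different thresholds would change this value, which is the limitation of reclassification metrics noted under Study Limitations.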

REFERENCES

1. Gimelli A, Rossi G, Landi P, et al. Stress/rest myocardial perfusion abnormalities by gated SPECT: still the best predictor of cardiac events in stable ischemic heart disease. J Nucl Med 2009;50:546–53.

2. Hachamovitch R, Kang X, Amanullah AM, et al. Prognostic implications of myocardial perfusion single-photon emission computed tomography in the elderly. Circulation 2009;120:2197–206.

3. Shaw LJ, Berman DS, Maron DJ, et al. Optimal medical therapy with or without percutaneous coronary intervention to reduce ischemic burden: results from the Clinical Outcomes Utilizing Revascularization and Aggressive Drug Evaluation (COURAGE) trial nuclear substudy. Circulation 2008;117:1283–91.

4. Shaw LJ, Iskandrian AE. Prognostic value of gated myocardial perfusion SPECT. J Nucl Cardiol 2004;11:171–85.

5. Kang X, Berman DS, Lewin HC, et al. Incremental prognostic value of myocardial perfusion single photon emission computed tomography in patients with diabetes mellitus. Am Heart J 1999;138:1025–32.

6. Hachamovitch R, Berman DS, Kiat H, et al. Exercise myocardial perfusion SPECT in patients without known coronary artery disease: incremental prognostic value and use in risk stratification. Circulation 1996;93:905–14.

7. Sharir T, Germano G, Kang X, et al. Prediction of myocardial infarction versus cardiac death by gated myocardial perfusion SPECT: risk stratification by the amount of stress-induced ischemia and the poststress ejection fraction. J Nucl Med 2001;42:831–7.

8. Motwani M, Dey D, Berman DS, et al. Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis. Eur Heart J 2017;38:500–7.

9. Arsanjani R, Dey D, Khachatryan T, et al. Prediction of revascularization after myocardial perfusion SPECT by machine learning in a large population. J Nucl Cardiol 2015;22:877–84.

10. Betancur J, Rubeaux M, Fuchs T, et al. Automatic valve plane localization in myocardial perfusion SPECT/CT by machine learning: anatomical and clinical validation. J Nucl Med 2017;58:961–7.

11. Gambhir SS, Berman DS, Ziffer J, et al. A novel high-sensitivity rapid-acquisition single-photon cardiac imaging camera. J Nucl Med 2009;50:635–43.

12. Sharir T, Slomka PJ, Hayes SW, et al. Multicenter trial of high-speed versus conventional single-photon emission computed tomography imaging: quantitative results of myocardial perfusion and left ventricular function. J Am Coll Cardiol 2010;55:1965–74.

13. Andersson M, Johansson L, Minarik D, Leide-Svegborn S, Mattsson S. Effective dose to adult patients from 338 radiopharmaceuticals estimated using ICRP biokinetic data, ICRP/ICRU computational reference phantoms and ICRP 2007 tissue weighting factors. EJNMMI Physics 2014;1:9.

14. Nakazato R, Tamarappoo BK, Kang X, et al. Quantitative upright–supine high-speed SPECT myocardial perfusion imaging for detection of coronary artery disease: correlation with invasive coronary angiography. J Nucl Med 2010;51:1724–31.

15. Xu Y, Arsanjani R, Clond M, et al. Transient ischemic dilation for coronary artery disease in quantitative analysis of same-day sestamibi myocardial perfusion SPECT. J Nucl Cardiol 2012;19:465–73.

16. Nakazato R, Berman DS, Hayes SW, et al. Myocardial perfusion imaging with a solid-state camera: simulation of a very low dose imaging protocol. J Nucl Med 2013;54:373–9.

17. Thygesen K, Alpert JS, White HD. Universal definition of myocardial infarction. Circulation 2007;116:2634–53.

18. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. SIGKDD Explor Newsl 2009;11:10–8.

19. Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). Ann Statist 2000;28:337–407.

20. Kanamori T, Takenouchi T, Eguchi S, Murata N. Robust loss functions for boosting. Neural Comput 2007;19:2183–244.

21. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–45.

22. Brier GW. Verification of forecast expressed in terms of probability. Monthly Weather Rev 1950;78:1–3.

23. Arsanjani R, Xu Y, Dey D, et al. Improved accuracy of myocardial perfusion SPECT for detection of coronary artery disease by machine learning in a large population. J Nucl Cardiol 2013;20:553–62.

24. Tragardh E, Hesse B, Knuuti J, et al. Reporting nuclear cardiology: a joint position paper by the European Association of Nuclear Medicine (EANM) and the European Association of Cardiovascular Imaging (EACVI). Eur Heart J Cardiovasc Imaging 2015;16:272–9.

25. Tilkemeier PL, Mahmarian JJ, Wolinsky DG, Denton EA. ImageGuide Update. J Nucl Cardiol 2015;22:994–7.

26. Henzlova MJ, Duvall WL, Einstein AJ, Travin MI, Verberne HJ. ASNC imaging guidelines for SPECT nuclear cardiology procedures: stress, protocols, and tracers. J Nucl Cardiol 2016;23:606–39.

KEY WORDS machine learning, major adverse cardiac events, SPECT myocardial imaging
