Lê Tấn Phùng*
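Concept
Multiple linear regression (MLR) is like simple linear regression, except that instead of a single independent variable it has two or more. Each independent variable has its own slope coefficient. The multiple linear regression equation can be written as:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_n X_{in} + \varepsilon_i$$
where $\beta_0$ is the intercept, $\beta_1, \ldots, \beta_n$ are the slopes of the $n$ independent variables, and $\varepsilon_i$ is the error term for observation $i$.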
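As a rough illustration of how the equation is used once the coefficients are estimated, the sketch below evaluates it for one observation. The coefficients are copied from the exam example's regression output later in this document; the three exam scores are hypothetical and chosen only for illustration. The error term is left out because a prediction uses only the fitted part of the equation.

```python
def predict(intercept, slopes, xs):
    """Return b0 + b1*x1 + ... + bn*xn for a single observation."""
    if len(slopes) != len(xs):
        raise ValueError("need exactly one x value per slope")
    return intercept + sum(b * x for b, x in zip(slopes, xs))

# Coefficients copied from the regression of final on exam1, exam2, exam3.
b0 = -4.336102
slopes = [0.3559382, 0.5425188, 1.167444]  # exam1, exam2, exam3

# Hypothetical student: these scores are NOT taken from the source data.
scores = [80, 85, 90]

y_hat = predict(b0, slopes, scores)
print(round(y_hat, 2))  # about 175.32
```

Each slope is read as the expected change in final for a one-point change in that exam, holding the other two exams fixed.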
Step 2:
Evaluate the model. This includes:
1. Statistical significance of the model (ANOVA F test)
2. Examining the significance of each independent variable (Xi)
3. Checking the model assumptions, also called checking model validity. This includes:
a. Residual analysis: the residuals are independent, have equal variance (homoscedasticity), and have a mean of zero.
b. Normality of the residuals
c. Multicollinearity
4. Interpreting the adjusted R-squared (Adj-R2)
5. Simplifying the model (parsimony)
Step 3:
Prediction, which includes:
-
[Figure: scatterplot matrix of final, exam1, exam2, and exam3]
      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  3,    21) =  670.09
       Model |  13731.5148     3  4577.17161           Prob > F      =  0.0000
    Residual |  143.445179    21  6.83072279           R-squared     =  0.9897
-------------+------------------------------           Adj R-squared =  0.9882
       Total |    13874.96    24  578.123333           Root MSE      =  2.6136

------------------------------------------------------------------------------
       final |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exam1 |   .3559382   .1213889     2.93   0.008     .1034962    .6083802
       exam2 |   .5425188   .1008495     5.38   0.000     .3327908    .7522467
       exam3 |   1.167444   .1030141    11.33   0.000     .9532148    1.381674
       _cons |  -4.336102   3.764226    -1.15   0.262    -12.16424    3.492034
------------------------------------------------------------------------------
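To make the summary statistics in this output less of a black box, the sketch below recomputes F, R-squared, adjusted R-squared, Root MSE, and the t statistics from the sums of squares, degrees of freedom, coefficients, and standard errors printed above. It uses only the standard textbook formulas and assumes nothing beyond the numbers in the table; small differences in the last digit come from rounding in the printed output.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_model, df_model = 13731.5148, 3
ss_resid, df_resid = 143.445179, 21
ss_total, df_total = 13874.96, 24

ms_model = ss_model / df_model   # mean square, model
ms_resid = ss_resid / df_resid   # mean square, residual
ms_total = ss_total / df_total   # sample variance of final

f_stat = ms_model / ms_resid          # overall ANOVA F test
r2 = ss_model / ss_total              # share of variance explained
adj_r2 = 1 - ms_resid / ms_total      # R-squared penalised for model size
root_mse = math.sqrt(ms_resid)        # residual standard deviation

# (coefficient, standard error) pairs copied from the coefficient table.
coefs = {
    "exam1": (0.3559382, 0.1213889),
    "exam2": (0.5425188, 0.1008495),
    "exam3": (1.167444, 0.1030141),
    "_cons": (-4.336102, 3.764226),
}
# Each t statistic is the coefficient divided by its standard error.
t_stats = {name: b / se for name, (b, se) in coefs.items()}

print(f"F = {f_stat:.2f}, R2 = {r2:.4f}, Adj R2 = {adj_r2:.4f}, Root MSE = {root_mse:.4f}")
for name, t in t_stats.items():
    print(f"{name:>6}: t = {t:.2f}")
```

The recomputed values agree with the Stata output: the model as a whole is highly significant, and the three exams together explain about 99% of the variance in final.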
[Figure: kernel density estimate of the residuals with a normal density overlaid (kernel = epanechnikov, bandwidth = .85)]
To check the independence and equal variance of the residuals, a simple approach is to draw a scatterplot of the residuals against the fitted values of the dependent variable. In this example, we plot r (the residuals) against the fitted values of final. Stata's rvfplot command draws this plot, as shown below:
rvfplot, yline(0)
[Figure: residuals versus fitted values, with a horizontal reference line at zero]
The scatterplot shows no particular shape or trend, and the points are scattered around a mean of zero (the red horizontal line). This plot suggests that the residuals are independent, have equal variance, and have a mean of zero.
4. Check for multicollinearity: Stata's vif command computes the Variance Inflation Factor (VIF). The results show that all three variables have VIF > 5, so these independent variables should be re-examined. As a rule of thumb, however, VIF < 5 is good, VIF between 5 and 10 is acceptable, and the variables need to be reconsidered when VIF > 10.
. vif

    Variable |       VIF       1/VIF
-------------+----------------------
       exam1 |      7.81    0.128093
       exam2 |      5.59    0.178990
       exam3 |      5.16    0.193750
-------------+----------------------
    Mean VIF |      6.19
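The two columns of the vif output are tied together by the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on the other predictors, so 1/VIF is the tolerance 1 - R_j^2. The sketch below starts from the 1/VIF column above, recovers the VIF and the implied R_j^2, and applies the rule of thumb stated in the text. The function name `classify` is just a label chosen for this illustration.

```python
# Tolerance (1/VIF) values copied from the vif output.
tolerance = {"exam1": 0.128093, "exam2": 0.178990, "exam3": 0.193750}

def classify(vif):
    """Rule of thumb from the text: <5 good, 5 to 10 acceptable, >10 reconsider."""
    if vif < 5:
        return "good"
    if vif <= 10:
        return "acceptable"
    return "reconsider"

results = {}
for name, tol in tolerance.items():
    vif = 1 / tol
    r2_j = 1 - tol  # R-squared of this predictor regressed on the others
    results[name] = (vif, r2_j, classify(vif))
    print(f"{name}: VIF = {vif:.2f}, R2_j = {r2_j:.3f}, {results[name][2]}")

mean_vif = sum(v for v, _, _ in results.values()) / len(results)
print(f"Mean VIF = {mean_vif:.2f}")
```

Read this way, a VIF of 7.81 for exam1 means that about 87% of the variation in exam1 is already explained by exam2 and exam3.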
Step 3:
-
. regress
      Source |       SS       df       MS              Number of obs =      25
-------------+------------------------------           F(  3,    21) =  670.09
       Model |  13731.5148     3  4577.17161           Prob > F      =  0.0000
    Residual |  143.445179    21  6.83072279           R-squared     =  0.9897
-------------+------------------------------           Adj R-squared =  0.9882
       Total |    13874.96    24  578.123333           Root MSE      =  2.6136

------------------------------------------------------------------------------
       final |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
       exam1 |   .3559382   .1213889     2.93   0.008                 .1817819
       exam2 |   .5425188   .1008495     5.38   0.000                 .2821267
       exam3 |   1.167444   .1030141    11.33   0.000                 .5712626
       _cons |  -4.336102   3.764226    -1.15   0.262                        .
------------------------------------------------------------------------------
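The Beta column holds standardized coefficients, which under the usual definition equal the raw coefficient multiplied by sd(x) / sd(y). The raw data are not reproduced here, so the sketch below works backwards: it takes sd(final) from the Total row of the ANOVA table (the total mean square is the sample variance of final) and back-calculates the standard deviation each exam would need for the printed Beta to hold. These implied standard deviations are derived values, not figures reported in the source.

```python
import math

# Total sum of squares and degrees of freedom copied from the ANOVA table.
ss_total, df_total = 13874.96, 24
sd_y = math.sqrt(ss_total / df_total)  # sample standard deviation of final

# (raw coefficient, standardized Beta) pairs copied from the output.
table = {
    "exam1": (0.3559382, 0.1817819),
    "exam2": (0.5425188, 0.2821267),
    "exam3": (1.167444, 0.5712626),
}

# Beta = b * sd_x / sd_y, so the implied sd_x is Beta * sd_y / b.
implied_sd_x = {name: beta * sd_y / b for name, (b, beta) in table.items()}

# Rank predictors by the size of their standardized effect.
ranking = sorted(table, key=lambda name: abs(table[name][1]), reverse=True)

print(f"sd(final) = {sd_y:.4f}")
for name in ranking:
    print(f"{name}: Beta = {table[name][1]:.4f}, implied sd = {implied_sd_x[name]:.2f}")
```

Because Beta is unit-free, it allows the predictors to be compared directly: a one standard deviation increase in exam3 is associated with a change of about 0.57 standard deviations in final, the largest of the three.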
                                                       Number of obs =     830
                                                       F(  7,   822) =   10.82
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0844
                                                       Adj R-squared =  0.0766
                                                       Root MSE      =  11.535

------------------------------------------------------------------------------
  scalescore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0474926   .0276755     1.72   0.087    -.0068304    .1018155
_Idistance~2 |  -1.441733   1.065138    -1.35   0.176    -3.532444    .6489789
_Idistance~3 |   3.085925   1.422831     2.17   0.030     .2931151    5.878735
_Idistance~4 |   6.846812   1.019919     6.71   0.000     4.844859    8.848764
 _Imarital_2 |     1.3647   1.383331     0.99   0.324    -1.350578    4.079978
 _Imarital_3 |   4.039166   4.593545     0.88   0.379    -4.977293    13.05562
 _Imarital_4 |   4.446256   2.285178     1.95   0.052     -.039215    8.931728
       _cons |   77.98277   1.558111    50.05   0.000     74.92442    81.04111
------------------------------------------------------------------------------
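This output does not show the sums of squares, but the overall F statistic and the adjusted R-squared can still be checked from R-squared, the number of observations, and the number of predictors. The sketch below applies the standard formulas to the values printed above. Because R-squared is only printed to four decimals, the recomputed F is approximate.

```python
n = 830        # Number of obs
k = 7          # predictors in the model (age plus six indicator variables)
r2 = 0.0844    # R-squared as printed, rounded to four decimals

df_resid = n - k - 1  # residual degrees of freedom, 822 in the output

# F compares explained variance per predictor with unexplained variance per residual df.
f_stat = (r2 / k) / ((1 - r2) / df_resid)

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid

print(f"df_resid = {df_resid}, F = {f_stat:.2f}, Adj R2 = {adj_r2:.4f}")
```

The model is statistically significant overall, yet it explains only about 8% of the variance in scalescore, which shows that a small p-value for the F test does not by itself indicate a good fit.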
Note: The regression above was run in Stata 10. From Stata 12 onward, the xi: prefix is no longer needed before the regression command.
Explanation of some notation:
In the example above, the variables _Idistance~2, _Idistance~3, and _Idistance~4 are generated from the variable distance_r. There is no _Idistance~1 because it serves as the reference category and therefore does not appear in the results table. The coefficients of these variables thus express a direct comparison with the reference category: _Idistance~2 is compared with _Idistance~1, _Idistance~3 with _Idistance~1, and _Idistance~4 with _Idistance~1. The same applies to _Imarital_2, _Imarital_3, and _Imarital_4.
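The reference-category idea can be sketched in a few lines. The code below builds 0/1 indicator variables for every level except the reference and then uses the coefficients from the scalescore output to compare two predictions. It assumes that both categorical variables are coded 1 to 4 with level 1 as the reference, which is what the variable names in the output suggest; the helper names and the age of 30 are hypothetical choices made for this illustration.

```python
def dummies(value, levels, reference):
    """Return {level: 0 or 1} for every level except the reference."""
    return {lvl: int(value == lvl) for lvl in levels if lvl != reference}

# Coefficients copied from the scalescore regression output.
coef = {
    "_cons": 77.98277,
    "age": 0.0474926,
    "distance": {2: -1.441733, 3: 3.085925, 4: 6.846812},
    "marital": {2: 1.3647, 3: 4.039166, 4: 4.446256},
}

LEVELS = [1, 2, 3, 4]  # assumed coding; level 1 is the reference category

def predict_scalescore(age, distance, marital):
    """Predicted scalescore for one person under the fitted model."""
    d = dummies(distance, LEVELS, reference=1)
    m = dummies(marital, LEVELS, reference=1)
    y = coef["_cons"] + coef["age"] * age
    y += sum(coef["distance"][lvl] * flag for lvl, flag in d.items())
    y += sum(coef["marital"][lvl] * flag for lvl, flag in m.items())
    return y

# Two hypothetical people who differ only in distance category.
base = predict_scalescore(age=30, distance=1, marital=1)
far = predict_scalescore(age=30, distance=4, marital=1)
print(round(base, 2), round(far, 2), round(far - base, 6))
```

The difference between the two predictions equals the _Idistance~4 coefficient, which is exactly the comparison with the reference category described above.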
Step 2:
-
[Figure: density plot of the residuals]
[Figure: residuals versus fitted values]
Check for multicollinearity: Running the vif command as above after the regression shows that no variable has VIF > 5, so there is no evidence of multicollinearity.
. vif

    Variable |       VIF       1/VIF
-------------+----------------------
 _Imarital_4 |      2.05    0.487912
 _Imarital_2 |      1.76    0.568755
         age |      1.33    0.751877
_Idistance~2 |      1.18    0.849218
_Idistance~4 |      1.17    0.851353
_Idistance~3 |      1.14    0.879844
 _Imarital_3 |      1.10    0.908484
-------------+----------------------
    Mean VIF |      1.39
Step 3:
-
                                                       Number of obs =     830
                                                       F(  7,   822) =   10.82
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0844
                                                       Adj R-squared =  0.0766
                                                       Root MSE      =  11.535

------------------------------------------------------------------------------
  scalescore |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
         age |   .0474926   .0276755     1.72   0.087                 .0660505
_Idistance~2 |  -1.441733   1.065138    -1.35   0.176                -.0490217
_Idistance~3 |   3.085925   1.422831     2.17   0.030                   .07717
_Idistance~4 |   6.846812   1.019919     6.71   0.000                 .2428217
 _Imarital_2 |     1.3647   1.383331     0.99   0.324                 .0436584
 _Imarital_3 |   4.039166   4.593545     0.88   0.379                 .0307896
 _Imarital_4 |   4.446256   2.285178     1.95   0.052                 .0929658
       _cons |   77.98277   1.558111    50.05   0.000                        .
------------------------------------------------------------------------------
Chen, X., Ender, P., Mitchell, M. and Wells, C. (2003). Regression with Stata, from
http://www.ats.ucla.edu/stat/stata/webbooks/reg/default.htm