You are on page 1of 52

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Khai thc d liu kho st mc sng h gia nh Vit Nam (VHLSS)


lm ti nghin cu s dng phn mm Stata

1. Gii thiu v B d liu kho st mc sng h gia nh Vit Nam (VHLSS 2008)
2. Khi ng Stata 11
3. Mt vi lnh qun l d liu n gin
4. To bng tn s
5. Tnh cc thng k m t
6. S lc v tng quan & hi quy
7. Ni hai file d liu bng lnh Merge
8. Tr gip
Ph lc 1. M rng v hi quy bi
Ph lc 2. Mt s lnh qun l d liu nng cao
Ph lc 3. M hnh Logit
Ph lc 4. Cu trc lnh c bn trong Stata, vn trng s trong VHLSS
Ph lc 5. Kiu d liu; mt s lnh, hm, ton t thng dng

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

1. Gii thiu v D liu kho st mc sng h gia nh


cung cp thng tin v mc sng dn c phc v vic xy dng, nh gi chnh sch n nay,
Tng cc thng k tin hnh 6 cuc iu tra mc sng ln vi 2 tn gi khc nhau: kho st mc
sng dn c (1993-1994, 1997-1998); kho st mc sng h gia nh (nm 2002, nm 2004, nm
2006, nm 2008). C l, khong gn 2 nm na anh ch mi c c d liu VHLSS ca nm 2010!
Gn y nht l cuc kho st/iu tra mc sng (thng c vit tt l KSMS) h gia nh nm
2008. D liu iu tra t cuc iu tra ny c lu tr trong b d liu kho st mc sng h gia
nh nm 2008 (thng gi l VHLSS 2008). Chng ta c th khai thc b d liu ny lm ti
nghin cu/ bi vit chnh sch. (Bn c th lin h vi V X hi & Mi trng Tng cc thng k
v vn bn quyn trong vic s dng b d liu ny, hi cc thng tin cn thit )
tm hiu chi tit v cuc iu tra ny, v cch chn mu, t chc iu tra, phiu iu tra, cc khi
nim , chng ta cn c thm ti liu S tay kho st mc sng h gia nh 2008 do Tng cc
Thng k bin son. Dng nh, ngi phn tch VHLSS no cng cn c quyn s tay ny bn cnh.
Chng ta tm hiu s lc mt s thng tin chung v KSMS 2008
1.1 Mc ch ca kho st mc sng 2008
Thu thp cc thng tin lm cn c nh gi mc sng, nh gi tnh trng ngho i v phn
ho giu ngho phc v cng tc hoch nh cc chnh sch, k hoch v cc chng trnh mc tiu
quc gia ca ng v Nh nc nhm khng ngng nng cao mc sng dn c trong c nc, cc
vng v cc a phng.
Cung cp s liu tnh quyn s ch s gi tiu dng.
Ngoi ra, thu thp thng tin phc v nghin cu, phn tch mt s chuyn v qun l iu
hnh v qun l ri ro v phc v tnh ton ti khon quc gia.
1.2 Ni dung ca kho st mc sng 2008
KSMS 2008 gm nhng ni dung ch yu phn nh mc sng ca cc h gia nh trn c nc
v nhng iu kin kinh t x hi c bn (c im ca x/phng) c tc ng n mc sng ca
ngi dn ni h sinh sng. Cc ni dung c th bao gm:
a. i vi h gia nh
- Mt s c im v nhn khu hc ca cc thnh vin trong h, gm: Tui, gii tnh, dn tc,
tnh trng hn nhn.
- Thu nhp ca h gia nh, gm: Mc thu nhp; thu nhp phn theo ngun thu (tin cng, tin
lng; hot ng sn xut t lm nng nghip, lm nghip, thu sn; hot ng ngnh ngh sn xut
kinh doanh dch v t lm ca h gia nh; thu khc); thu nhp phn theo khu vc kinh t v ngnh
kinh t.
- Chi tiu h gia nh: mc chi tiu, chi tiu phn theo mc ch chi v khon chi (chi cho n,
mc, , i li, gio dc, y t, vn ho, v.v v chi khc theo danh mc cc nhm/khon chi tiu
tnh quyn s ch s gi tiu dng).
- Trnh hc vn, trnh chuyn mn k thut ca tng thnh vin h gia nh.

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

- Tnh trng m au, bnh tt v s dng cc dch v y t.


- Tnh trng vic lm, thi gian lm vic.
- Ti sn, nh v cc tin nghi nh dng, in, nc, iu kin v sinh.
- Tham gia chng trnh xo i gim ngho, tnh hnh tn dng.
- Qun l iu hnh v qun l ri ro
b. i vi x
- Mt s tnh hnh chung v nhn khu, dn tc.
- Kt cu h tng kinh t - x hi ch yu, gm: hin trng in, ng, trng hc, trm y t,
ch, bu in, ngun nc.
- Tnh trng kinh t, gm: Tnh hnh sn xut nng nghip (t ai, xu hng v nguyn nhn
tng gim sn lung cc cy trng chnh, cc iu kin h tr pht trin sn xut nh ti tiu, khuyn
nng); c hi vic lm phi nng nghip.
- Mt s thng tin c bn v trt t an ton x hi v bo v mi trng.
1.3 Mu kho st
a. i tng, phm vi, thi im kho st
i tng kho st gm cc h gia nh, cc thnh vin h gia nh v cc x c cc h gia nh c
kho st. n v kho st gm h gia nh v x c chn kho st.
Phm vi kho st bao gm tt c cc a bn, cc x c chn thuc 64 tnh, thnh ph trc thuc trung
ng (sau y gi tt l tnh/thnh ph).
Thi im kho st gm hai k vo thng 5 v thng 9 nm 2009. Thi gian thu thp thng tin ti a
bn mi k ko di 2 thng.
b. Mu kho st
Mu 1: Mu kho st mc sng 2008 v tnh quyn s ch s gi tiu dng (CSGTD).
Mu ny chn t dn mu ch thit k cho cc cuc KSMS giai on 2000-2010 gm 3.063
x/phng, mi x/phng chn 3 a bn t cc a bn ca Tng iu tra Dn s v Nh nm
1999.
C ca Mu 1 gm 45.945 h c chn t 3.063 a bn ca dn mu ch, chia lm 2 loi:
- Mu thu nhp v quyn s CSGTD gm 36.756 h thu thp cc ni dung thng tin nu
trn v quyn s CSGTD, tr chi tiu ca h gia nh nh gi mc sng cp quc gia, vng v
tnh/thnh ph, ng thi tnh quyn s CSGTD. Mu ny phng vn Phiu s 1A-PVH/KSMS08;
- Mu thu nhp chi tiu gm 9.189 h thu thp y cc ni dung thng tin nh gi, phn
tch mc sng mt cch su hn cp quc gia v vng (khng c thng tin tnh quyn s
CSGTD). Mu ny phng vn Phiu s 1B-PVH/KSMS08.
Mu 2: Mu ch tnh quyn s CSGTD, gm 2 phn, Phn 1 gm 9.189 h gia nh c
chn thm t 3.063 a bn ca Mu 1, mi a bn chn 3 h gia nh; v Phn 2 gm 15.000 h

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

c chn t 1000 a bn ca Tng iu tra Dn s v Nh nm 1999 ngoi mu ch. Mu ny


phng vn Phiu s 1C-PVH/QS08.
c. Cc bc chn mu
i vi Mu 1:
Bc 1: Chn a bn.
Cc a bn ca Mu 1 s c chn theo cch lun phin, c th: chn li 50% s a bn ca
KSMS 2006 (trong c mt na s a bn c kho st c trong KSMS 2004 v 2006 v na s
a bn cn li ch c kho st trong KSMS 2006) v 50% s a bn cn li c chn mi hon
ton t dn mu ch, phn cha c chn vo mu ca KSMS 2004 v 2006.
V Thng k X hi v Mi trng chu trch nhim chn v gi danh sch a bn chn
cho cc Cc Thng k r sot v cp nht, trong c gi km c s v bng k ca Tng iu
tra Dn s v Nh nm 1999 ca cc a bn mi. Cc Cc Thng k tnh/thnh ph c th xem xt,
ngh iu chnh mt s a bn cho ph hp hn vi cc c im a l, kinh t, x hi thc t ca
a phng, nhng s a bn ngh iu chnh khng vt qu 5% tng s a bn ca tnh/thnh
ph v phi c s ng ca TCTK (V XHMT) trc khi tin hnh kho st.
Bc 2: Chn h.
Cc Thng k chn h kho st, c th:
- i vi nhng a bn chn li t KSMS 2006, chn tt c 15 h, trong 12 h kho st
thu nhp (h thu nhp) nm 2006 kho st thu nhp cho KSMS 2008 v 3 h kho st thu nhp
chi tiu (h thu nhp chi tiu) nm 2006 kho st thu nhp chi tiu cho KSMS 2008. Trong trng
hp c nhng h c kho st nm 2004 hoc 2006 nhng nay i khi a bn th phi chn h
d b thay th c s lng 12 h thu nhp v 3 h thu nhp chi tiu mi a bn kho st.
- i vi nhng a bn mi, chn 20 h t danh sch h cp nht ca a bn. T 20 h
c chn, chn 15 h (12 h chnh thc, 3 h d phng) kho st thu nhp; 5 h cn li (3 chnh
thc v 2 d phng) kho st thu nhp chi tiu.
Vic chn h kho st c thc hin theo phng php nu trong S tay hng dn nghip v
KSMS 2008.
i vi Mu 2:
- i vi Phn 1 ca Mu 2: Chn 5 h (3 h chnh thc v 2 h d b) t danh sch h cp
nht ca mi a bn trong 3.063 a bn ca Mu 1 (tr cc h c chn vo Mu 1) thu thp
thng tin tnh quyn s CSGTD..
- i vi Phn 2 ca Mu 2: chn 20 h t danh sch h cp nht ca mi a bn trong
1.000 a bn ca Phn 2 Mu 2. T 20 h c chn, chn 15 h chnh thc v 5 h d phng thu
thp thng tin tnh quyn s CSGTD.
Cc Thng k tnh/thnh ph s chia s a bn c phn b ca tng khu vc thnh th/nng
thn v vng a l cho 2 k kho st vo thng 5 v thng 9 nh sau: 2/3 a bn ca Mu 1, k c 3
h ca Phn 1 Mu 2 kho st vo k thng 5; s a bn cn li kho st vo k thng 9. Cc x c
a bn c chn phng vn h s ng thi tin hnh phng vn Phiu phng vn x.

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Danh sch a bn v h c chn s c lu gi ti 2 a ch: Cc Thng k tnh/thnh ph


v V Thng k X hi v Mi trng phc v vic t chc thc hin v theo di, kim tra, gim
st.
Mu thu nhp v mu thu nhp chi tiu c phn b cho 2 thi im kho st nh sau:
Thi gian
Mu 1
Mu 1
Mu 2
Mu 2
Cng
thu thp
Thu nhp v
Thu
Phn 1
Phn 2
s liu
quyn s
nhp chi
CSGTD
tiu
TNG S
Chia ra:
36.756
9.189
9.189
15.000
70.134
Thng
5-6/2008

24.504

6.126

6.126

Thng
9-10/2008

12.252

3.063

3.063

36.756
15.000

33.378

1.4 Phng php thu thp d liu


Cuc kho st ny s dng hai loi phiu phng vn: loi phiu phng vn h gia nh v loi
phiu phng vn x. Loi phiu phng vn h gia nh gm: Phiu phng vn thu nhp chi tiu (p
dng cho mu thu nhp chi tiu) bao gm tt c cc thng tin ca ni dung kho st; Phiu phng vn
thu nhp v quyn s CSGTD (p dng cho mu thu nhp v quyn s CSGTD) gm cc thng tin ca
ni dung kho st tr cc thng tin v chi tiu ca h v thm thng tin tnh quyn s CSGTD; v
Phiu quyn s CSGTD (p dng cho mu ch thu thp thng tin tnh quyn s CSGTD). Phiu
phng vn c thit k tng i chi tit gip iu tra vin ghi chp thun li, ng thi trnh b st
cc khon mc v tng tnh thng nht gia cc iu tra vin, t nng cao cht lng s liu kho
st.
Cuc kho st p dng phng php phng vn trc tip. iu tra vin n h, gp ch h v
nhng thnh vin trong h c lin quan phng vn v ghi thng tin vo phiu phng vn h gia
nh. i trng i kho st phng vn lnh o x v cc cn b a phng c lin quan v ghi
thng tin vo phiu phng vn x. bo m cht lng thng tin thu thp, cuc kho st khng
chp nhn phng php kho st gin tip hoc sao chp cc thng tin t cc ngun c sn khc vo
phiu phng vn.

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

1.5 D liu
C 2 loi d liu chnh: d liu kho st x, v d liu kho st h. Chng ta s tm hiu v d liu
kho st h, v n c s dng kh ph bin. D liu kho st x cng c khai thc tng t.
Trong d liu kho st h, nhng ngi lm nghin cu thng hay s dng mu thu nhp v chi tiu
(9189 h) thc hin phn tch v c y d liu v tt c cc bin.
D liu VHLSS2008 do tng cc thng k cung cp thng c lu trong a CD. Sau khi chp sang
a C ca my tnh, c dng nh Hnh 1.
Hnh 1

Th mc cha d liu
kho st x/phng
Th mc cha d liu
kho st h
Bng cu hi x/phng
Trong th mc ny, c cc file excel
cho bit ni dung bng cu hi kho
st h
Hnh 2

Trong th mc ny, c cc file d


liu c nhp bng phn mm Stata
(tn file d liu ca stata c phn m
rng l .dta)
Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

2. Khi ng
Hnh 2.1

Hnh 2.2

Khi ng Stata?

khi ng Stata11, n gin, bn hy double-click vo biu tng StataSE.exe, hoc double-click


vo biu tng Shortcut ca Stata trn desktop

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Mn hnh STATA?

Ca s Review: ca s ny s lit k
cc lnh trong qu kh bn s dng

Ca s Results: ca s ny hin cc kt
qu tnh ton, cc thng bo ca Stata

Hnh 2.3
Thanh Menu ca Stata
Thanh Cng c ca Stata

Ca s Variables: Ca s ny s lit k danh


sch cc bin ca file d liu m bn ang m

Thot khi Stata?


\- Hy th g lnh exit vo ca s lnh! Hoc Bm nt

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ca s Command: dng g cc lnh


ca Stata

trong Hnh 2.3

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

3. Mt vi lnh Qun l d liu n gin

Khai bo dung lng b nh dnh cho Stata?

- Trong ca s lnh Hnh 2.3, bn ang g cu lnh set mem 300m


Khi g lnh ny, bn mun my tnh dnh cho Stata 300 megabytes b nh
Cu trc lnh c bn: set mem #[b|k|m|g]
Vi # l s bytes, kilobytes, megabytes, hay gigabytes ( tng ng vi b, k,
m, hay g c g pha sau), mc nh l k

M 1 file d liu?

Cch 1
T thanh Menu ca Stata, chn File\Open Ch ng dn n file cn m Open
V d. Hnh 3.1 ch ra ng dn ca file d liu muc123a.dta trong th mc
C:\VHLSS2008\Data\Hhold
Hnh 3.1

Cch 2
Bn hy g lnh sau vo ca s lnh ca Stata:
use "C:\VHLSS2008\Data\Hhold\muc123a.dta", clear

Xem thng tin s b v cc bin (tn bin, nhn bin, kiu d liu)?

- Bn hy ko thanh trt Ca s Variables C nhng bin g trong file muc123a.dta nh?


- G lnh des vo ca s lnh bn s thy nhng thng tin sau Ca s kt qu:

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Bng 3.1
Contains data from C:\VHLSS2008\Data\Hhold\muc123a.dta
obs:
38,253
vars:
56
11 Mar 2010 15:26
size:
4,934,637 (98.4% of memory free)
-------------------------------------------------------------------------------storage display
value
variable name
type
format
label
variable label
-------------------------------------------------------------------------------tinh
int
%8.0g
huyen
byte
%8.0g
xa
double %8.0g
diaban
int
%8.0g
hoso
int
%8.0g
matv
byte
%8.0g
M hiu
m1ac2
byte
%8.0g
M1AC2
2. Gii tnh
m1ac3
byte
%14.0g
M1AC3
3. Quan h
m1ac4a
byte
%8.0g
4. Thng sinh
m1ac4b
int
%8.0g
Nm sinh
m1ac5
int
%8.0g
5. Tui
m1ac6
byte
%10.0g
M1AC6
6. Hn nhn
m1ac7
byte
%8.0g
7. Thng h
m1ac7a
byte
%26.0g
M1AC7A
7a. L do
m1ac8
byte
%16.0g
M1AC8
8. H khu
m1ac9
int
%18.0g
M1AC9
9. Noi dang ky HK
m1ac10a
int
%8.0g
10. Nm tnh
m1ac10b
byte
%8.0g
10. Thng tnh
m2ac1
byte
%26.0g
M2AC1
1.Hc ht lp
m2ac2
byte
%8.0g
M2AC2
2.Bit c, bit vit
m2ac3a
byte
%11.0g
M2AC3A
3.Bng cp cao nht - GDPT
m2ac3b
byte
%14.0g
M2AC3B
Bng cp cao nht - GDNN
m2ac4
byte
%8.0g
M2AC4
4.Loi trng TN
m2ac5
byte
%8.0g
M2AC5
5.Hin c i hc
m2ac6
byte
%8.0g
M2AC6
6.12 thng qua c i hc
m2ac7
byte
%17.0g
M2AC7
7.L do k i hc
m2ac8
byte
%14.0g
M2AC8
8.H/cp/bc ang hc
m2ac9
byte
%8.0g
M2AC9
9.Loi trng
m2ac10
byte
%8.0g
M2AC10
10.C min gim
m2ac11a
byte
%18.0g
M2AC11A
11.L do min gim hc ph
m2ac11b
byte
%18.0g
M2AC11B
L do min gim ng gp
m2ac12a
int
%8.0g
12.% min gim hc ph
m2ac12b
int
%8.0g
% min gim ng gp
m2ac13a
long
%12.0g
13a.Chi hc ph
m2ac13b
long
%12.0g
13b.Chi tri tuyn
m2ac13c
long
%12.0g
13c.Chi ng gp
m2ac13d
long
%12.0g
13d.Chi qu
m2ac13e
long
%12.0g
13e.Chi ng phc
m2ac13f
long
%12.0g
13f.Chi sch gio khoa
m2ac13g
long
%12.0g
13g.Chi dng c hc tp
m2ac13g1
long
%12.0g
13g1. Giy v, s
m2ac13g2
long
%12.0g
13g2. Cp, bt
m2ac13g3
long
%12.0g
13g3. My tnh, sch .t
m2ac13h
long
%12.0g
13h.Chi hc thm
m2ac13i
long
%12.0g
13i.Chi gio dc khc
m2ac13i1
long
%12.0g
13i1.Chi nh tr SV
m2ac13k
long
%12.0g
13k.Tng s (a+b+...+i)
m2ac14
long
%12.0g
14.Cc khon nhn
m2ac15
long
%12.0g
15.Gi tr hc bng
m2ac16
long
%12.0g
16.Chi gio dc-o to khc
m3c1
byte
%8.0g
M3C1
1. 4 tun, c b m/bnh
m3c2
byte
%8.0g
M3C2
2. 12 thng, c b m/bnh
m3c3a
int
%8.0g
3. S ngy nm 1 ch
m3c3b
int
%8.0g
S ngy ngh vic
m3c4
byte
%8.0g
M3C4
4. C BHYT min ph
m3c5
byte
%29.0g
M3C5
5. Loi BHYT
-------------------------------------------------------------------------------Sorted by: tinh huyen xa diaban hoso

- Bn hy m cc file (v tm n sheet trong file tng ng) bng cu hi iu tra (cc file Excel, v
d Muc01_1B.xls, Muc02_1B.xls, Muc03_1B.xls) lin quan n cc bin file d liu m bn ang
m (V d, file muc123a.dta) bit thm chi tit v cc bin.

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

10

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Hnh 3.2

Hnh 3.3

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

11

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

- Trong Hnh 3.2 (file d liu th hin Mc 1A ca bng cu hi), bn hy th xem cu 2 l cu hi v


iu g? Gii tnh ca thnh vin trong h. V quy c m ho khi nhp liu nh sau: Nam th nhp
1, N th nhp 2. S c bin m1ac2 cha ng thng tin v gii
- Trong Hnh 3.2 (file d liu th hin Mc 1A ca bng cu hi), bn hy th xem cu 5 l cu hi v
iu g? Tui ca thnh vin S c bin m1ac5 cho bit thnh vin ca h tui no.
- Theo bn bin m1ac2, m1ac5 l bin nh tnh hay bin nh lng? C l bn s tr li l m1ac2
l bin nh tnh, cn m1ac5 l bin nh lng.
- bn hy th g lnh des m1ac2 m1ac5 m2ac6 xem iu g xy ra?
. des m1ac2 m1ac5 m2ac6
storage display
value
variable name
type
format
label
variable label
-------------------------------------------------------------------------------m1ac2
byte
%8.0g
M1AC2
2. Gii tnh
m1ac5
int
%8.0g
5. Tui
m2ac6
byte
%8.0g
M2AC6
6.12 thng qua c i hc

Xem thng tin s b v cc bin (cc gi tr ca bin) - lnh codebook?

codebook m1ac2 m1ac5


-------------------------------------------------------------------------------m1ac2
2. Gii tnh
-------------------------------------------------------------------------------type:
label:

numeric (byte)
M1AC2

range:
unique values:

[1,2]
2

tabulation:

Freq.
18810
19443

units:
missing .:
Numeric
1
2

1
0/38253

Label
Nam
N

-------------------------------------------------------------------------------m1ac5
5. Tui
-------------------------------------------------------------------------------type:
range:
unique values:
mean:
std. dev:
percentiles:

numeric (int)
[0,103]
102

units:
missing .:

1
0/38253

31.784
20.6508
10%
7

25%
15

50%
28

75%
46

90%
60

m s quan st trong b d liu? lnh count

- Khi g lnh count vo ca s lnh, bn s thy thng tin sau trn ca s kt qu


. count
38253
Xem d liu? M ca s Data Editor

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

12

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Hnh 3.4

xem d liu, bn c th bm nt Data Editor, hoc g lnh edit vo ca s lnh


Hnh 3.5

Hnh 3.6

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

13

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Hnh 3.7

T Hnh 3.6, nu ban mun d liu c th hin nh Hnh 3.7 th click chut phi chn Value
lable Hide All Value Labels
- Khi mun g lnh g tip theo trong ca s lnh, bn nn ng ca s Data Editor li.
4. To bng tn s
To bng tn s mt chiu?
. tab m1ac2
2. Gii |
tnh |
Freq.
Percent
Cum.
------------+----------------------------------Nam |
18,810
49.17
49.17
N |
19,443
50.83
100.00
------------+----------------------------------Total |
38,253
100.00

. tab m2ac1
1.Hc ht lp |
Freq.
Percent
Cum.
---------------------------+----------------------------------Cha ht lp 1/cha i hc |
5,664
14.81
14.81
1 |
945
2.47
17.28
2 |
1,680
4.39
21.67
3 |
1,985
5.19
26.86
4 |
2,029
5.30
32.16
5 |
3,200
8.37
40.53
6 |
2,316
6.05
46.58
7 |
2,337
6.11
52.69
8 |
1,987
5.19
57.89
9 |
6,692
17.49
75.38
10 |
1,336
3.49
78.87
11 |
1,223
3.20
82.07
TN THPT |
6,859
17.93
100.00
---------------------------+----------------------------------Total |
38,253
100.00

. tab m2ac6
6.12 thng |
qua c i |
hc |
Freq.
Percent
Cum.
------------+----------------------------------C |
617
2.18
2.18
Khng |
27,695
97.82
100.00
------------+----------------------------------Total |
28,312
100.00

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

14

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

. tab m2ac6, m
6.12 thng |
qua c i |
hc |
Freq.
Percent
Cum.
------------+----------------------------------C |
617
1.61
1.61
Khng |
27,695
72.40
74.01
. |
9,941
25.99
100.00
------------+----------------------------------Total |
38,253
100.00
. tab m2ac9 if m2ac6==1
9.Loi |
trng |
Freq.
Percent
Cum.
------------+----------------------------------Cng lp |
514
83.31
83.31
Bn cng |
55
8.91
92.22
Dn lp |
32
5.19
97.41
T thc |
9
1.46
98.87
Khc |
7
1.13
100.00
------------+----------------------------------Total |
617
100.00
. tab m2ac9 if m2ac6==1, nol
9.Loi |
trng |
Freq.
Percent
Cum.
------------+----------------------------------1 |
514
83.31
83.31
2 |
55
8.91
92.22
3 |
32
5.19
97.41
4 |
9
1.46
98.87
5 |
7
1.13
100.00
------------+----------------------------------Total |
617
100.00

To bng tn s v tnh trng hn nhn phn theo nam v n?


. sort m1ac2
. by m1ac2: tab m1ac6
--------------------------------------------------------------------------------> m1ac2 = Nam
6. Hn nhn |
Freq.
Percent
Cum.
------------+----------------------------------Cha VC |
5,535
36.81
36.81
ang c VC |
9,082
60.41
97.22
Go |
314
2.09
99.31
Ly hn |
60
0.40
99.71
Ly thn |
44
0.29
100.00
------------+----------------------------------Total |
15,035
100.00
--------------------------------------------------------------------------------> m1ac2 = N
6. Hn nhn |
Freq.
Percent
Cum.
------------+----------------------------------Cha VC |
4,568
28.78
28.78
ang c VC |
9,209
58.01
86.79
Go |
1,798
11.33
98.12
Ly hn |
205
1.29
99.41
Ly thn |
94
0.59
100.00
------------+----------------------------------Total |
15,874
100.00

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

15

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

To bng tn s hai chiu?


. tab m2ac1 m1ac2
|
2. Gii tnh
1.Hc ht lp |
Nam
N |
Total
----------------------+----------------------+---------Cha ht lp 1/cha |
2,430
3,234 |
5,664
1 |
430
515 |
945
2 |
691
989 |
1,680
3 |
862
1,123 |
1,985
4 |
948
1,081 |
2,029
5 |
1,505
1,695 |
3,200
6 |
1,179
1,137 |
2,316
7 |
1,166
1,171 |
2,337
8 |
1,036
951 |
1,987
9 |
3,393
3,299 |
6,692
10 |
714
622 |
1,336
11 |
701
522 |
1,223
TN THPT |
3,755
3,104 |
6,859
----------------------+----------------------+---------Total |
18,810
19,443 |
38,253
. tab m2ac1 m1ac2, col nof
|
2. Gii tnh
1.Hc ht lp |
Nam
N |
Total
----------------------+----------------------+---------Cha ht lp 1/cha |
12.92
16.63 |
14.81
1 |
2.29
2.65 |
2.47
2 |
3.67
5.09 |
4.39
3 |
4.58
5.78 |
5.19
4 |
5.04
5.56 |
5.30
5 |
8.00
8.72 |
8.37
6 |
6.27
5.85 |
6.05
7 |
6.20
6.02 |
6.11
8 |
5.51
4.89 |
5.19
9 |
18.04
16.97 |
17.49
10 |
3.80
3.20 |
3.49
11 |
3.73
2.68 |
3.20
TN THPT |
19.96
15.96 |
17.93
----------------------+----------------------+---------Total |
100.00
100.00 |
100.00
. tab m2ac2 m1ac2, col
2.Bit |
c, bit |
2. Gii tnh
vit |
Nam
N |
Total
-----------+----------------------+---------C |
2,830
3,485 |
6,315
|
52.79
50.20 |
51.33
-----------+----------------------+---------Khng |
2,531
3,457 |
5,988
|
47.21
49.80 |
48.67
-----------+----------------------+---------Total |
5,361
6,942 |
12,303
|
100.00
100.00 |
100.00

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

16

Chng trnh ging dy Kinh t Fulbright


. tab m2ac9

Lp MPP3 hc k Thu 2010

m1ac2 if m2ac6==1

9.Loi |
2. Gii tnh
trng |
Nam
N |
Total
-----------+----------------------+---------Cng lp |
282
232 |
514
Bn cng |
32
23 |
55
Dn lp |
19
13 |
32
T thc |
7
2 |
9
Khc |
5
2 |
7
-----------+----------------------+---------Total |
345
272 |
617
. tab m2ac9

m1ac2 if m2ac6==1, col

9.Loi |
2. Gii tnh
trng |
Nam
N |
Total
-----------+----------------------+---------Cng lp |
282
232 |
514
|
81.74
85.29 |
83.31
-----------+----------------------+---------Bn cng |
32
23 |
55
|
9.28
8.46 |
8.91
-----------+----------------------+---------Dn lp |
19
13 |
32
|
5.51
4.78 |
5.19
-----------+----------------------+---------T thc |
7
2 |
9
|
2.03
0.74 |
1.46
-----------+----------------------+---------Khc |
5
2 |
7
|
1.45
0.74 |
1.13
-----------+----------------------+---------Total |
345
272 |
617
|
100.00
100.00 |
100.00
. tab m2ac9

m1ac2 if m2ac6==1, col nof

9.Loi |
2. Gii tnh
trng |
Nam
N |
Total
-----------+----------------------+---------Cng lp |
81.74
85.29 |
83.31
Bn cng |
9.28
8.46 |
8.91
Dn lp |
5.51
4.78 |
5.19
T thc |
2.03
0.74 |
1.46
Khc |
1.45
0.74 |
1.13
-----------+----------------------+---------Total |
100.00
100.00 |
100.00

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

17

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

5. Tnh cc thng k m t
Tnh thng k m t ca mt bin nh lng?
.sum m1ac5
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+-------------------------------------------------------m1ac5 |
38253
31.78399
20.65079
0
103
. sum

m2ac13k

Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+-------------------------------------------------------m2ac13k |
10558
1608.373
2669.863
0
46160
. sum m1ac5 m2ac13k
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+-------------------------------------------------------m1ac5 |
38253
31.78399
20.65079
0
103
m2ac13k |
10558
1608.373
2669.863
0
46160
. sum m1ac5 m2ac13k if m2ac6==1
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+-------------------------------------------------------m1ac5 |
617
17.87358
6.121887
0
49
m2ac13k |
617
2244.407
2853.302
0
32000

Tnh thng k m t ca mt bin nh lng phn theo mt bin nh tnh?


Cch 1
. tab m1ac2, sum(m1ac5)
2. Gii |
Summary of 5. Tui
tnh |
Mean
Std. Dev.
Freq.
------------+-----------------------------------Nam |
30.419139
19.914699
18810
N |
33.104408
21.256024
19443
------------+-----------------------------------Total |
31.783991
20.650785
38253
. tab

m2ac9 if m2ac6==1, sum (m2ac13k)

9.Loi | Summary of 13k.Tng s (a+b+...+i)


trng |
Mean
Std. Dev.
Freq.
------------+-----------------------------------Cng lp |
2245.072
2741.4057
514
Bn cng |
1838.8727
1083.6352
55
Dn lp |
2423.1563
2293.5377
32
T thc |
4997
10167.356
9
Khc |
1025.7143
1711.2944
7
------------+-----------------------------------Total |
2244.4068
2853.3015
617

Cch 2.
. by m1ac2: sum m1ac5
-> m1ac2 = Nam
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+-------------------------------------------------------m1ac5 |
18810
30.41914
19.9147
0
97
--------------------------------------------------------------------------------> m1ac2 = N
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+-------------------------------------------------------m1ac5 |
19443
33.10441
21.25602
0
103

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

18

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Tnh thng k m t ca mt bin nh lng phn theo 2 bin nh tnh?

. table

m2ac8 m2ac9 if m2ac6==1, c( mean m2ac13k) format(%7.1f)

----------------------------------------------------------------8.H/cp/bc
|
9.Loi trng
ang hc
| Cng lp Bn cng
Dn lp
T thc
Khc
---------------+------------------------------------------------Nh tr, MG |
945.9
886.6
660.0
1966.7
Tiu hc |
336.4
240.0
THCS |
687.4
THPT |
1518.4
2087.1
2631.7
8928.3
555.0
S cp ngh |
4931.7
2840.0
0.0
0.0
Trung cp ngh |
5081.5
2900.0
1680.0
0.0
TH CN |
3561.2
2490.0
1410.0
Cao ng ngh |
4153.3
430.0
Cao ng |
4071.6
1820.0
i hc |
5145.9
3895.0
4660.0
Thc s |
9930.0
Tin s | 21000.0
----------------------------------------------------------------. table

m2ac8 m2ac9 if m2ac6==1, c(count m2ac13k) format(%7.1f)

----------------------------------------------------------------8.H/cp/bc
|
9.Loi trng
ang hc
| Cng lp Bn cng
Dn lp
T thc
Khc
---------------+------------------------------------------------Nh tr, MG |
26
12
2
3
Tiu hc |
21
1
THCS |
64
THPT |
262
42
18
4
2
S cp ngh |
15
1
1
2
Trung cp ngh |
11
2
2
1
TH CN |
32
2
1
Cao ng ngh |
3
1
Cao ng |
12
1
i hc |
65
4
1
Thc s |
2
Tin s |
1
. table

m2ac8 if m2ac6==1, c(count m2ac13k mean m2ac13k) format(%7.1f)

--------------------------------------------8.H/cp/bc
|
ang hc
|
N(m2ac13k) mean(m2ac13k)
---------------+----------------------------Nh tr, MG |
43
987.3
Tiu hc |
22
332.0
THCS |
64
687.4
THPT |
328
1736.8
S cp ngh |
19
4042.9
Trung cp ngh |
16
4066.0
TH CN |
35
3438.5
Cao ng ngh |
4
3222.5
Cao ng |
13
3898.4
i hc |
70
5067.5
Thc s |
2
9930.0
Tin s |
1
21000.0
---------------------------------------------

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

19

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Trong tu chn ca mt s lnh, Stata cho php cc loi thng k c ch ra bi cc thng k nh sau:
C php thng k

ngha

mean

Trung bnh mean

count

m s quan st

Ging nh lnh count (m s quan st)

sum

Tng cng

max

Gi tr ln nht

min

Gi tr nh nht

range

Bin = Gi tr ln nht - Gi tr nh nht

sd

lch chun

sdmean

lch chun ca trung bnh = lch chun / {(S quan st)^0.5}

skewness

lch ca phn phi

kurtosis

nhn

median

Trung v (Ging nh p50)

p1

1% phn v

p5

5% phn v

p10

10% phn v

p25

25% phn v

p50

50% phn v (trung v)

p75

75% phn v

p90

90% phn v

p95

95% phn v

p99

99% phn v

iqr

p75 - p25

tng ng vi "p25 p50 p75"

V d:
tabstat m1ac5, stats (mean median iqr sd)
tabstat m1ac5, stats (mean median min max range sd var cv skewness kurtosis)
table m2ac8 m2ac9 if m2ac5<=2, c( mean m2ac13k)
table m2ac8 m2ac9 if m2ac5<=2, c( mean m2ac13k) format(%7.2f)
table m2ac8 m2ac9 if (m2ac5<=2) & (m2ac9<=4), c( mean m2ac13k) format(%7.2f)
table m2ac8 m2ac9 if m2ac5<=2, c(count m2ac13k) format(%7.1f)
table m2ac8 if m2ac5<=2, c(count m2ac13k mean m2ac13k) format(%7.1f)
table m2ac8 if m2ac5<=2, c(count m2ac13k mean m2ac13k mean m1ac5 ) format(%7.1f)
tab m1ac2, sum( m1ac5) mean
tabstat

m1ac5, stats (mean median iqr sd)

variable |
mean
p50
iqr
sd
-------------+---------------------------------------m1ac5 | 31.78399
28
31 20.65079
------------------------------------------------------

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

20

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

table m2ac8 m2ac9 if m2ac5 <= 2, c( mean m2ac13k) format(%7.2f)


--------------------------------------------------------------------------8.H/cp/bc
|
9.Loi trng
ang hc
| Cng lp Bn cng
Dn lp
T thc
Khc
Missing
---------------+----------------------------------------------------------Nh tr, MG |
723.19
688.55
1154.58
3291.18
1158.29
0.00
Tiu hc |
539.26
923.67 16515.43 12830.00
1398.80
952.50
THCS |
824.08
1559.52 10880.78
699.33
425.00
THPT | 1453.13
2174.89
3196.20
5483.21
1167.00
S cp ngh | 3077.13
3034.50
3400.00
3625.00
3039.29
Trung cp ngh | 4113.61
6291.67
3922.00
3100.00 13980.00
TH CN | 4214.58
6438.33
4585.47
3400.00 11830.00
Cao ng ngh | 4898.46
5122.50
4994.60
4973.33
2850.00
Cao ng | 4986.06
7755.50
4644.58
5626.67 11712.50
i hc | 5892.89
8702.14
7845.68
1973.00
4773.42
Thc s | 11495.33
Khc |
640.00
---------------------------------------------------------------------------

. tab m1ac2, sum( m1ac5) mean


| Summary of
2. Gii |
5. Tui
tnh |
Mean
------------+-----------Nam |
30.419139
N |
33.104408
------------+-----------Total |
31.783991

6. S lc v hi quy

To bin gi?

Cch th cng nht nh sau:


. gen gioi= m1ac2
. replace gioi=0 if m1ac2==2
(19443 real changes made)

c lng hm hi quy?

. reg m2ac13k

gioi

m1ac5

Source |
SS
df
MS
-------------+-----------------------------Model | 1.3691e+10
2 6.8455e+09
Residual | 6.1561e+10 10555 5832410.47
-------------+-----------------------------Total | 7.5252e+10 10557 7128167.96

Number of obs
F( 2, 10555)
Prob > F
R-squared
Adj R-squared
Root MSE

=
10558
= 1173.70
= 0.0000
= 0.1819
= 0.1818
=
2415

-----------------------------------------------------------------------------m2ac13k |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
75.52587
47.04069
1.61
0.108
-16.68276
167.7345
m1ac5 |
196.9525
4.070503
48.39
0.000
188.9736
204.9315
_cons | -989.2827
62.38184
-15.86
0.000
-1111.563
-867.0025
------------------------------------------------------------------------------

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

21

Chng trnh ging dy Kinh t Fulbright


. reg m2ac13k

gioi

m1ac5 if

Lp MPP3 hc k Thu 2010


m2ac5<=2

Source |
SS
df
MS
-------------+-----------------------------Model | 1.2967e+10
2 6.4834e+09
Residual | 5.7005e+10 9938 5736081.11
-------------+-----------------------------Total | 6.9972e+10 9940 7039428.34

Number of obs
F( 2, 9938)
Prob > F
R-squared
Adj R-squared
Root MSE

=
9941
= 1130.28
= 0.0000
= 0.1853
= 0.1851
=
2395

-----------------------------------------------------------------------------m2ac13k |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
68.67071
48.06328
1.43
0.153
-25.54307
162.8845
m1ac5 |
203.2432
4.278268
47.51
0.000
194.8569
211.6295
_cons | -1045.171
64.07087
-16.31
0.000
-1170.763
-919.5791
------------------------------------------------------------------------------

Tnh h s tng quan?

. cor m2ac13k m1ac5 if m2ac6<=2


(obs=617)
| m2ac13k
m1ac5
-------------+-----------------m2ac13k |
1.0000
m1ac5 |
0.3382
1.0000

V th phn tn?

scatter m2ac13k m1ac5 if m2ac6==1

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

22

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

7. Ni 2 file d liu bng lnh Merge


Gi s, chng hn bn mun ni file muc4a.dta vo muc123a.dta
Bc 1. M file using, sort, lu li ti mt th mc khc
. use "C:\VHLSS2008\Data\Hhold\muc4a.dta", clear
. count
35154
. sort

tinh huyen xa diaban hoso matv

. save "C:\VHLSS2008\muc4a_sorted.dta", replace


file C:\VHLSS2008\muc4a_sorted.dta saved
Bc 2. M file master, sort, dng lnh merge ni
. use "C:\VHLSS2008\Data\Hhold\muc123a.dta", clear
. count
38253
. sort tinh huyen xa diaban hoso matv
. merge 1:1 tinh huyen xa diaban hoso matv using
"C:\VHLSS2008\muc4a_sorted.dta"
Result
# of obs.
----------------------------------------not matched
3,099
from master
3,099 (_merge==1)
from using
0 (_merge==2)
matched

35,154

(_merge==3)

Bc 3. Kim tra li, xo nhng quan st khng cn thit, xo bin


_merge
. tab _merge
_merge |
Freq.
Percent
Cum.
------------------------+----------------------------------master only (1) |
3,099
8.10
8.10
matched (3) |
35,154
91.90
100.00
------------------------+----------------------------------Total |
38,253
100.00
. keep if _merge==3
(3099 observations deleted)
. drop

_merge

Bn hy vo help tm hiu thm v lnh merge trn stata 11!

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

23

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Hnh 7.1

Hnh 7.2

Hnh 7.4

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

24

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

8. Tr gip

Stata online: http://www.ats.ucla.edu/stat/stata/ v rt nhiu trang khc!

Hnh 8.1

Th vin chng trnh FETP

Bn c th vo Mc Help\Contents ca Stata hc tm hiu thm v stata.

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

25

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Hnh 8.2

C th tra cu tng cu lnh bng cch Help\Command


Hnh 8.3

Cc Sch, ti liu, bi ging m ging vin gii thiu bn


Trao i vi cc chuyn gia trn din n thng tin pht trin Vit Nam: http://divietnam.org
V Google!

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

26

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Ph lc 1.
Hi quy nhng cu lnh c bn trn Stata

S dng file muc1234a_hhexpe08.dta m bn to ra thc hin cc


cng vic(ni cc file muc123a.dta, muc4a.dta v hhexpe08.dta)
1. Chun b d liu
. egen chigd=rowtotal( m2ac13k m2ac16)
. gen gioi=

m1ac2

. replace gioi=0 if m1ac2==2


(17952 real changes made)
. gen tuoi= m1ac5
. gen tuoibp= tuoi^2
. gen thanhthi= urban08
. recode thanhthi 2 =0
(thanhthi: 26301 changes made)
. tab reg8, gen(vung)
reg8 |
Freq.
Percent
Cum.
------------+----------------------------------1 |
6,812
19.38
19.38
2 |
5,036
14.33
33.70
3 |
1,891
5.38
39.08
4 |
3,802
10.82
49.90
5 |
3,304
9.40
59.30
6 |
2,500
7.11
66.41
7 |
4,714
13.41
79.82
8 |
7,095
20.18
100.00
------------+----------------------------------Total |
35,154
100.00
Cu 9. Thc hin hm hi quy
2. Tnh h s tng quan
. pwcorr chigd tuoi hhsize, sig
|
chigd
tuoi
hhsize
-------------+--------------------------chigd |
1.0000
|
|
tuoi | -0.2393
1.0000
|
0.0000
|
hhsize | -0.0177 -0.1926
1.0000
|
0.0009
0.0000

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

27

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

3. V th Scatter
scatter chigd tuoi

graph matrix chigd tuoi hhsize, half

4. c lng hm hi quy
reg chigd gioi tuoi tuoibp thanhthi vung1 vung2 vung3 vung4 vung5 vung7 vung8
hhsize if m2ac5<= 2

M hnh 1
Source |
SS
df
MS
-------------+-----------------------------Model | 2.0005e+10
12 1.6671e+09
Residual | 5.3103e+10 9006 5896435.89
-------------+-----------------------------Total | 7.3109e+10 9018 8106973.15

Number of obs
F( 12, 9006)
Prob > F
R-squared
Adj R-squared
Root MSE

=
=
=
=
=
=

9019
282.73
0.0000
0.2736
0.2727
2428.3

-----------------------------------------------------------------------------chigd |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
52.54772
51.30177
1.02
0.306
-48.01542
153.1109
tuoi |
347.9577
15.20595
22.88
0.000
318.1506
377.7648
tuoibp | -3.115513
.4194744
-7.43
0.000
-3.937779
-2.293248
thanhthi |
1081.325
61.32333
17.63
0.000
961.1169
1201.532
vung1 |
163.8409
104.2717
1.57
0.116
-40.55528
368.237
vung2 | -273.0126
110.2669
-2.48
0.013
-489.1608
-56.86438
vung3 | -655.3806
137.272
-4.77
0.000
-924.4648
-386.2963
vung4 | -30.02082
110.9621
-0.27
0.787
-247.5317
187.49
vung5 |
77.38069
115.4724
0.67
0.503
-148.9715
303.7329
vung7 |
958.442
111.6069
8.59
0.000
739.6671
1177.217

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

28

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

vung8 |
21.29995
107.4682
0.20
0.843
-189.3622
231.9621
hhsize | -76.84445
17.67135
-4.35
0.000
-111.4843
-42.20459
_cons | -2364.973
185.9853
-12.72
0.000
-2729.547
-2000.4
------------------------------------------------------------------------------

4. Kim nh Wald
C ngi cho rng quy m h (hhsize) v thanhthi u khng nh hng n chigd.
Theo bn, iu l ng hay sai
. test thanhthi= hhsize=0
( 1)
( 2)

thanhthi - hhsize = 0
thanhthi = 0
F(

2, 9006) =
Prob > F =

174.06
0.0000

5. Kim nh hin tng a cng tuyn


Bn hy kim nh xem m hnh 1 c b vi phm hin tng a cng tuyn?
Sau khi c lng m hnh, bn hy g lnh VIF
. vif
Variable |
VIF
1/VIF
-------------+---------------------tuoi |
9.20
0.108700
tuoibp |
9.18
0.108991
vung1 |
2.62
0.382354
vung8 |
2.30
0.435498
vung2 |
2.19
0.456844
vung4 |
2.13
0.468623
vung7 |
2.13
0.469353
vung5 |
1.97
0.508528
vung3 |
1.53
0.653950
hhsize |
1.08
0.926313
thanhthi |
1.07
0.931843
gioi |
1.01
0.994199
-------------+---------------------Mean VIF |
3.03
6. Kim nh hin tng phng sai thay i
Hy kim nh xem m hnh 1 c b vi phm hin tng phng sai thay i?
. hettest
Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
Ho: Constant variance
Variables: fitted values of chigd
chi2(1)
Prob > chi2

=
=

6006.98
0.0000

. imtest
Cameron & Trivedi's decomposition of IM-test
--------------------------------------------------Source |
chi2
df
p
---------------------+----------------------------Heteroskedasticity |
393.87
59
0.0000
Skewness |
104.13
12
0.0000
Kurtosis |
13.30
1
0.0003
---------------------+----------------------------Total |
511.31
72
0.0000
---------------------------------------------------

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

29

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

7. S dng option robust sau lnh reg khc phc hin tng phng sai thay i
Hy c lng li m hnh 1 m c th khc phc hin tng phng sai thay i
reg chigd gioi tuoi tuoibp thanhthi vung1 vung2 vung3 vung4 vung5 vung7 vung8
hhsize if m2ac5<= 2, robust
Linear regression

Number of obs
F( 12, 9006)
Prob > F
R-squared
Root MSE

=
=
=
=
=

9019
185.08
0.0000
0.2736
2428.3

-----------------------------------------------------------------------------|
Robust
chigd |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
52.54772
51.17439
1.03
0.305
-47.76572
152.8612
tuoi |
347.9577
21.66313
16.06
0.000
305.493
390.4224
tuoibp | -3.115513
.682129
-4.57
0.000
-4.452641
-1.778385
thanhthi |
1081.325
74.50076
14.51
0.000
935.2862
1227.363
vung1 |
163.8409
87.9385
1.86
0.062
-8.53859
336.2203
vung2 | -273.0126
87.90441
-3.11
0.002
-445.3252
-100.7
vung3 | -655.3806
92.42965
-7.09
0.000
-836.5637
-474.1974
vung4 | -30.02082
89.04394
-0.34
0.736
-204.5672
144.5256
vung5 |
77.38069
100.2281
0.77
0.440
-119.0891
273.8505
vung7 |
958.442
143.1455
6.70
0.000
677.8443
1239.04
vung8 |
21.29995
102.6027
0.21
0.836
-179.8247
222.4246
hhsize | -76.84445
15.91004
-4.83
0.000
-108.0317
-45.65715
_cons | -2364.973
202.2426
-11.69
0.000
-2761.415
-1968.532
------------------------------------------------------------------------------

8. Lu li phn d ca m hnh v kim nh tnh phn phi chun ca sai s


Lu li phn d trong bin r, v th phn phi ca phn d, tnh thng k
skewness v kurtosis cho bin r
. predict r, resid
. histogram r, normal
(bin=45, start=-9364.6592, width=1145.6248)

. tabstat r,stat(skewness kurtosis)


variable | skewness kurtosis
-------------+-------------------r |
1.68251 13.18075
----------------------------------

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

30

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

. sktest r
Skewness/Kurtosis tests for Normality
------- joint -----Variable |
Obs
Pr(Skewness)
Pr(Kurtosis) adj chi2(2)
Prob>chi2
-------------+--------------------------------------------------------------r |
3.5e+04
0.0000
0.0000
.
.

. swilk r
Shapiro-Wilk W test for normal data
Variable |
Obs
W
V
z
Prob>z
-------------+-------------------------------------------------r | 35154
0.88757
1583.543
20.303
0.00000

Ghi ch: Sau khi thc hin hi quy, g lnh predict tn_bin kt hp vi cc
option sau c th to ra nhng bin mi lin quan n m hnh:
to ra
Gi tr d bo ca Y
Phn d
Phn d chun ho
Phn d student ho
Leverage
Sai s chun ca phn d
Cooks D
Sai s chun ca gi tr d bo (c bit)
Sai s chun ca gi tr d bo (trung bnh)

Thm option sau lnh predict


Khng cn option
resid
rstandard
Rstudent
Lev hoc hat
Stdr
Cooksd
Stdf
stdp

C th s dng th p, th q?
pnorm r

qnorm r

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

31

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

9. c lng m hnh hi quy c s dng trng s trong iu tra VHLSS


Hy c lng li m hnh trn, v ch n vn trng s trong VHLSS
reg chigd gioi tuoi tuoibp thanhthi vung1 vung2 vung3 vung4 vung5 vung7 vung8
hhsize if m2ac5<=2 [pw= hhszwt], robust
(sum of wgt is
9.8756e+07)
Linear regression

Number of obs
F( 12, 9006)
Prob > F
R-squared
Root MSE

=
=
=
=
=

9019
122.22
0.0000
0.2497
2651.2

-----------------------------------------------------------------------------|
Robust
chigd |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
35.68653
72.54715
0.49
0.623
-106.5224
177.8955
tuoi |
341.5655
29.01901
11.77
0.000
284.6816
398.4493
tuoibp | -3.148191
.888906
-3.54
0.000
-4.890649
-1.405733
thanhthi |
1245.069
101.016
12.33
0.000
1047.055
1443.084
vung1 |
283.2034
90.92591
3.11
0.002
104.968
461.4389
vung2 | -202.6547
86.93577
-2.33
0.020
-373.0686
-32.24081
vung3 | -488.0537
97.7098
-4.99
0.000
-679.5871
-296.5203
vung4 |
132.6486
100.3422
1.32
0.186
-64.04499
329.3422
vung5 |
83.13013
94.97234
0.88
0.381
-103.0373
269.2975
vung7 |
1290.005
181.5838
7.10
0.000
934.0593
1645.95
vung8 |
118.2045
100.7079
1.17
0.241
-79.20587
315.6148
hhsize | -80.93744
20.38689
-3.97
0.000
-120.9004
-40.9745
_cons |
-2349.18
283.6993
-8.28
0.000
-2905.295
-1793.065
------------------------------------------------------------------------------

Bnh thng, trong d liu kho st thu nhp chi tiu, i vi file d liu cp
h, bn s dng bin wt9 lm trng s; vi file c nhn (thnh vin), s dng bin
hhszwt lm trng s.(Trong cc lnh sum, tab ... khi mun c lng trung bnh, t
l tng th bn s dng aw thay cho pw)

Nhng option m thng s dng trong lnh reg l


Noconstant : khng c lng hng s trong m hnh
level(#)
: c lng khong tin cy ca h s hi quy tin cy #, mc nh
l level(95)
Beta
: tnh ton thm h s hi quy chun ho

Bn c th s dng tip u ng trong cu lnh?


stepwise [, options ] : command
options
description
-------------------------------------------------------------------------Model
* pr(#)
significance level for removal from the model
* pe(#)
significance level for addition to the model

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

32

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

stepwise, pe(0.2): reg chigd gioi tuoi tuoibp thanhthi vung1 vung2 vung3 vung4 vung5 vung7
vung8 hhsize if m2ac5<=2 [pw= hhszwt], robust
. stepwise, pe(0.2): reg chigd gioi tuoi tuoibp thanhthi vung1 vung2 vung3 vung4 vung5
vung7 vung8
> hhsize if m2ac5<=2 [pw= hhszwt], robust
begin with empty model
p = 0.0000 < 0.2000 adding tuoi
p = 0.0000 < 0.2000 adding vung3
p = 0.0000 < 0.2000 adding thanhthi
p = 0.0000 < 0.2000 adding vung2
p = 0.0000 < 0.2000 adding vung7
p = 0.0000 < 0.2000 adding hhsize
p = 0.0004 < 0.2000 adding tuoibp
p = 0.0032 < 0.2000 adding vung1
Linear regression

Number of obs
F( 8, 9010)
Prob > F
R-squared
Root MSE

=
=
=
=
=

9019
164.56
0.0000
0.2496
2650.9

-----------------------------------------------------------------------------|
Robust
chigd |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------tuoi |
341.5594
29.06481
11.75
0.000
284.5857
398.533
vung3 |
-577.638
80.88375
-7.14
0.000
-736.1886
-419.0875
thanhthi |
1239.731
100.5276
12.33
0.000
1042.675
1436.788
vung2 | -298.6968
60.32118
-4.95
0.000
-416.94
-180.4536
vung7 |
1196.668
168.8873
7.09
0.000
865.6102
1527.725
hhsize | -84.05998
20.19984
-4.16
0.000
-123.6562
-44.4637
tuoibp |
-3.14264
.8913302
-3.53
0.000
-4.88985
-1.39543
vung1 |
186.0523
63.1691
2.95
0.003
62.2265
309.8781
_cons | -2219.553
269.1749
-8.25
0.000
-2747.197
-1691.909
------------------------------------------------------------------------------

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

33

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Ph lc 2
Bi tp v Qun l d liu mt s lnh hay ca Stata?
Cu 1. reshape wide
Vi

bng

hi: Mc 4B5.1 Thu thu


m4b5(1)) v File d liu: muc4b51.dta

sn

(File

Muc04_1B.xls-sheet

a. bn hy tnh trung bnh s tin thu c t nui trng v nh


bt tm (Ch tnh cho nhng h c nui trng hoc nh bt tm).
b. Theo bn, tnh no c nhiu h nui trng, nh bt tm nht? V
s lng h nui trng, nh bt tm nhiu nht y l bao nhiu?
Gi : dng lnh reshape wide
/* Bai tap quan ly du lieu - nang cao*/
*Cau 1
set mem 300m
use "C:\VHLSS2008\Data\Hhold\muc4b51.dta", clear
count
tab m4b51ma
tab m4b51ma, nol mis
sum m4b51c6b
keep tinh huyen xa diaban hoso m4b51ma m4b51c6b
rename m4b51c6b thuthuysan
tab m4b51ma, mis
recode m4b51ma . =4
reshape wide thuthuysan, i( tinh huyen xa diaban hoso) j( m4b51ma)
count
sum thuthuysan3 thuthuysan11 thuthuysan12 thuthuysan13 thuthuysan14 thuthuysan21
thuthuysan22 thuthuysan23 thuthuysan4
egen thutom=rowtotal( thuthuysan12 thuthuysan22) if ( thuthuysan12!=.)|(
thuthuysan22!=.)
sum thutom
gen cotom=1 if ( thuthuysan12!=.)|( thuthuysan22!=.)
recode cotom . =0
tab tinh cotom
save "C:\VHLSS2008\muc4b51_fileho_thuthuysan.dta", replace

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

34

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Cu 2. egen
a. T file muc123a.dta bn hy to mt bin cho bit thnh vin
trong h ny c my ngi.
. use "C:\VHLSS2008\Data\Hhold\muc123a.dta", clear
. gen tam=1
. egen songuoi=sum(tam), by ( tinh huyen xa diaban hoso)
. edit
. list tinh huyen xa diaban hoso matv m1ac2 m1ac3 m1ac5 tam songuoi
in f/20

1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.

+-------------------------------------------------------------------------------------+
| tinh
huyen
xa
diaban
hoso
matv
m1ac2
m1ac3
m1ac5
tam
songuoi |
|-------------------------------------------------------------------------------------|
| 101
1
3
1
13
1
N
Ch h
73
1
3 |
| 101
1
3
1
13
2
N
Con
39
1
3 |
| 101
1
3
1
13
3
N
Con
33
1
3 |
| 101
1
3
1
14
1
Nam
Ch h
64
1
3 |
| 101
1
3
1
14
2
N
V chng
53
1
3 |
|-------------------------------------------------------------------------------------|
| 101
1
3
1
14
3
Nam
Con
22
1
3 |
| 101
1
3
1
15
1
Nam
Ch h
61
1
2 |
| 101
1
3
1
15
2
N
V chng
60
1
2 |
| 101
1
9
19
15
1
Nam
Ch h
50
1
2 |
| 101
1
9
19
15
2
N
V chng
41
1
2 |
|-------------------------------------------------------------------------------------|
| 101
1
9
19
19
1
N
Ch h
64
1
3 |
| 101
1
9
19
19
2
Nam
V chng
61
1
3 |
| 101
1
9
19
19
3
N
Con
23
1
3 |
| 101
1
9
19
20
1
Nam
Ch h
50
1
3 |
| 101
1
9
19
20
2
N
V chng
51
1
3 |
|-------------------------------------------------------------------------------------|
| 101
1
9
19
20
3
N
Con
19
1
3 |
| 101
1
15
50
13
1
Nam
Ch h
35
1
4 |
| 101
1
15
50
13
2
N
V chng
34
1
4 |
| 101
1
15
50
13
3
Nam
Con
6
1
4 |
| 101
1
15
50
13
4
Nam
Con
6
1
4 |
+-------------------------------------------------------------------------------------+

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

35

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

b. T file muc123a.dta bn hy to mt bin cho bit hc vn cao


nht ca nhng ngi trong h
. egen

hvmax =max( m2ac1) if

m2ac1!=-1, by( tinh huyen xa diaban hoso)

. list tinh huyen xa diaban hoso matv m1ac3 m1ac5 m2ac1 hvmax in f/20

1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.

+------------------------------------------------------------------------------------------------| tinh
huyen
xa
diaban
hoso
matv
m1ac3
m1ac5
m2ac1
hvmax
|------------------------------------------------------------------------------------------------| 101
1
3
1
13
1
Ch h
73
4
12
| 101
1
3
1
13
2
Con
39
TN THPT
12
| 101
1
3
1
13
3
Con
33
TN THPT
12
| 101
1
3
1
14
1
Ch h
64
TN THPT
12
| 101
1
3
1
14
2
V chng
53
TN THPT
12
|------------------------------------------------------------------------------------------------| 101
1
3
1
14
3
Con
22
TN THPT
12
| 101
1
3
1
15
1
Ch h
61
TN THPT
12
| 101
1
3
1
15
2
V chng
60
TN THPT
12
| 101
1
9
19
15
1
Ch h
50
TN THPT
12
| 101
1
9
19
15
2
V chng
41
TN THPT
12
|------------------------------------------------------------------------------------------------| 101
1
9
19
19
1
Ch h
64
9
12
| 101
1
9
19
19
2
V chng
61
TN THPT
12
| 101
1
9
19
19
3
Con
23
TN THPT
12
| 101
1
9
19
20
1
Ch h
50
TN THPT
12
| 101
1
9
19
20
2
V chng
51
TN THPT
12
|------------------------------------------------------------------------------------------------| 101
1
9
19
20
3
Con
19
TN THPT
12
| 101
1
15
50
13
1
Ch h
35
TN THPT
12
| 101
1
15
50
13
2
V chng
34
TN THPT
12
| 101
1
15
50
13
3
Con
6
Cha ht lp 1/cha i hc
12
| 101
1
15
50
13
4
Con
6
Cha ht lp 1/cha i hc
12
+-------------------------------------------------------------------------------------------------

Cu 3. egen keep/drop v collapse


T file muc123a.dta l file c nhn. Bn hy rt gn file ny cp
h. V c 1 bin cho bit h c my ngi
- Cch 1. Lm nh cu 2. Sau ch gi li ngi no l ch h
- Cch 2.
. gen quymoho=1
. collapse (sum) quymoho, by ( tinh huyen xa diaban hoso)
. list

1.
2.
3.
4.
5.
6.
7.
8.
9.
10.

tinh huyen xa diaban hoso quymoho in f/10

+---------------------------------------------+
| tinh
huyen
xa
diaban
hoso
quymoho |
|---------------------------------------------|
| 101
1
3
1
13
3 |
| 101
1
3
1
14
3 |
| 101
1
3
1
15
2 |
| 101
1
9
19
15
2 |
| 101
1
9
19
19
3 |
|---------------------------------------------|
| 101
1
9
19
20
3 |
| 101
1
15
50
13
4 |
| 101
1
15
50
14
3 |
| 101
1
15
50
15
4 |
| 101
1
17
2
13
2 |
+---------------------------------------------+

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

36

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Cu 4. collapse
Bn hy dng lnh collapse thu gn li d liu muc123a.dta theo
cp h. Trong file h ny, c bin cho bit tng chi tiu cho gio
dc ca h, c bin m c s ngi hin ang i hc hay ang ngh
ngh ca h, c bin cho bit hc vn cao nht ca ngi trong h,
c bin cho bit tui trung bnh ca cc thnh vin trong h.
. egen chigd=rowtotal( m2ac13k m2ac16)
. list tinh huyen xa diaban hoso matv m2ac5 m2ac13k m2ac16
1/40

1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.
40.

chigd in

+-------------------------------------------------------------------------------+
| tinh
huyen
xa
diaban
hoso
matv
m2ac5
m2ac13k
m2ac16
chigd |
|-------------------------------------------------------------------------------|
| 101
1
3
1
13
1
Khng
.
0
0 |
| 101
1
3
1
13
2
Khng
.
0
0 |
| 101
1
3
1
13
3
Khng
.
0
0 |
| 101
1
3
1
14
1
Khng
.
0
0 |
| 101
1
3
1
14
2
Khng
.
0
0 |
|-------------------------------------------------------------------------------|
| 101
1
3
1
14
3
Ngh h
2300
6000
8300 |
| 101
1
3
1
15
1
Khng
.
0
0 |
| 101
1
3
1
15
2
Khng
.
0
0 |
| 101
1
9
19
15
1
Khng
.
0
0 |
| 101
1
9
19
15
2
Khng
.
0
0 |
|-------------------------------------------------------------------------------|
| 101
1
9
19
19
1
Khng
.
0
0 |
| 101
1
9
19
19
2
Khng
.
0
0 |
| 101
1
9
19
19
3
Khng
.
0
0 |
| 101
1
9
19
20
1
Khng
.
0
0 |
| 101
1
9
19
20
2
Khng
.
0
0 |
|-------------------------------------------------------------------------------|
| 101
1
9
19
20
3
Ngh h
2370
0
2370 |
| 101
1
15
50
13
1
Khng
.
0
0 |
| 101
1
15
50
13
2
Khng
.
0
0 |
| 101
1
15
50
13
3
Ngh h
1900
0
1900 |
| 101
1
15
50
13
4
Ngh h
1900
0
1900 |
|-------------------------------------------------------------------------------|
| 101
1
15
50
14
1
Khng
.
0
0 |
| 101
1
15
50
14
2
Khng
.
0
0 |
| 101
1
15
50
14
3
Khng
.
0
0 |
| 101
1
15
50
15
1
Khng
.
0
0 |
| 101
1
15
50
15
2
Khng
.
0
0 |
|-------------------------------------------------------------------------------|
| 101
1
15
50
15
3
Ngh h
2900
3200
6100 |
| 101
1
15
50
15
4
Ngh h
2050
3000
5050 |
| 101
1
17
2
13
1
Khng
.
0
0 |
| 101
1
17
2
13
2
Khng
.
0
0 |
| 101
1
17
2
14
1
Khng
.
0
0 |
|-------------------------------------------------------------------------------|
| 101
1
17
2
14
2
Khng
.
0
0 |
| 101
1
17
2
14
3
C
3474
0
3474 |
| 101
1
17
2
19
1
Khng
.
0
0 |
| 101
1
17
2
19
2
Khng
.
0
0 |
| 101
1
21
36
13
1
Khng
.
0
0 |
|-------------------------------------------------------------------------------|
| 101
1
21
36
13
2
Khng
.
0
0 |
| 101
1
21
36
13
3
C
3000
0
3000 |
| 101
1
21
36
15
1
Khng
.
0
0 |
| 101
1
21
36
15
2
Khng
.
0
0 |
| 101
1
21
36
15
3
C
3050
0
3050 |
+-------------------------------------------------------------------------------+

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

37

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

. collapse (mean) m1ac5 (sum) dihoc_nghihe (sum) chigd (max) m2ac1, by (


tinh huyen xa diaban hoso)
. list tinh huyen xa diaban hoso m1ac5 dihoc_nghihe chigd m2ac1 in 1/20

1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.

+------------------------------------------------------------------------+
| tinh
huyen
xa
diaban
hoso
m1ac5
dihoc_~e
chigd
m2ac1 |
|------------------------------------------------------------------------|
| 101
1
3
1
13
48.3333
0
0
12 |
| 101
1
3
1
14
46.3333
1
8300
12 |
| 101
1
3
1
15
60.5
0
0
12 |
| 101
1
9
19
15
45.5
0
0
12 |
| 101
1
9
19
19
49.3333
0
0
12 |
|------------------------------------------------------------------------|
| 101
1
9
19
20
40
1
2370
12 |
| 101
1
15
50
13
20.25
2
3800
12 |
| 101
1
15
50
14
18.3333
0
0
12 |
| 101
1
15
50
15
35
2
11150
12 |
| 101
1
17
2
13
33.5
0
0
12 |
|------------------------------------------------------------------------|
| 101
1
17
2
14
43.3333
1
3474
12 |
| 101
1
17
2
19
47
0
0
12 |
| 101
1
21
36
13
27.3333
1
3000
11 |
| 101
1
21
36
15
43.3333
1
3050
12 |
| 101
1
21
36
19
39.2
1
8220
12 |
|------------------------------------------------------------------------|
| 101
1
23
18
13
49.5
0
0
12 |
| 101
1
23
18
24
29.5
0
0
12 |
| 101
1
23
18
29
20
0
0
12 |
| 101
3
3
23
13
28.75
2
8350
12 |
| 101
3
3
23
15
24.25
1
4950
12 |
+------------------------------------------------------------------------+

Cu 5. append using
Bn hy to mt d liu gp file hhexpe06.dta v file hhexpe08.dta
. use "C:\VHLSS2008\Data\Hhold\hhexpe08.dta", clear
(Household expenditures: 2008 VHLSS)
. keep tinh huyen xa diaban hoso hhsize

hhexp1rl wt9 urban08 reg8

. save "C:\VHLSS2008\hhexpe08_sua.dta"
file C:\VHLSS2008\hhexpe08_sua.dta saved
. gen nam=2008
. count
9189
. save "C:\VHLSS2008\hhexpe08_sua.dta", replace
file C:\VHLSS2008\hhexpe08_sua.dta saved
. use "C:\VHLSS2006\VHLSS2006\Data\hhold\hhexpe06.dta", clear
(Household expenditures: 2006 VHLSS)
. keep tinh huyen xa diaban hoso reg8 urban06 hhsize wt9 hhexp1rl
. sort tinh huyen xa diaban hoso
. gen nam=2006
. save "C:\VHLSS2008\hhexpe06_sua.dta"
. count
9189

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

38

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

. save "C:\VHLSS2008\hhexpe06_sua.dta", replace


file C:\VHLSS2008\hhexpe06_sua.dta saved
. use "C:\VHLSS2008\hhexpe08_sua.dta", clear
(Household expenditures: 2008 VHLSS)
. append using "C:\VHLSS2008\hhexpe06_sua.dta"
. count
18378
. . list tinh huyen xa diaban hoso reg8 hhsize hhexp1rl nam urban08 urban06 in
9180/9210
. list tinh huyen xa diaban hoso reg8 hhsize hhexp1rl nam urban08 urban06 in
9180/9210, nol

9180.
9181.
9182.
9183.
9184.
9185.
9186.
9187.
9188.
9189.
9190.
9191.
9192.
9193.
9194.
9195.
9196.
9197.
9198.
9199.
9200.
9201.
9202.
9203.
9204.
9205.
9206.
9207.
9208.
9209.
9210.

+-----------------------------------------------------------------------------------------+
| tinh
huyen
xa
diaban
hoso
reg8
hhsize
hhexp1rl
nam
urban08
urban06 |
|-----------------------------------------------------------------------------------------|
| 823
13
6
29
19
8
4
37240.17
2008
2
. |
| 823
13
11
4
13
8
3
47041.98
2008
2
. |
| 823
13
11
4
14
8
4
26782.1
2008
2
. |
| 823
13
11
4
15
8
6
50781.51
2008
2
. |
| 823
13
12
25
13
8
5
13658.05
2008
2
. |
|-----------------------------------------------------------------------------------------|
| 823
13
12
25
15
8
6
17925.5
2008
2
. |
| 823
13
12
25
19
8
4
19404.83
2008
2
. |
| 823
13
17
1
13
8
3
8687.102
2008
2
. |
| 823
13
17
1
14
8
5
18933.47
2008
2
. |
| 823
13
17
1
20
8
4
19210.81
2008
2
. |
|-----------------------------------------------------------------------------------------|
| 101
1
3
14
15
1
4
50885.41
2006
.
1 |
| 101
1
3
14
19
1
4
70989.08
2006
.
1 |
| 101
1
3
14
24
1
4
47160.71
2006
.
1 |
| 101
1
9
19
13
1
2
22773.67
2006
.
1 |
| 101
1
9
19
15
1
3
34226.87
2006
.
1 |
|-----------------------------------------------------------------------------------------|
| 101
1
9
19
19
1
3
32712.68
2006
.
1 |
| 101
1
15
27
13
1
3
15950.77
2006
.
1 |
| 101
1
15
27
14
1
1
72100.53
2006
.
1 |
| 101
1
15
27
15
1
4
30387.58
2006
.
1 |
| 101
1
17
2
13
1
3
23695.05
2006
.
1 |
|-----------------------------------------------------------------------------------------|
| 101
1
17
2
14
1
3
34148.03
2006
.
1 |
| 101
1
17
2
19
1
3
26933.93
2006
.
1 |
| 101
1
21
24
13
1
4
117633.9
2006
.
1 |
| 101
1
21
24
14
1
2
23580.65
2006
.
1 |
| 101
1
21
24
19
1
4
82198.54
2006
.
1 |
|-----------------------------------------------------------------------------------------|
| 101
1
23
18
13
1
4
52940.96
2006
.
1 |
| 101
1
23
18
24
1
4
45195.64
2006
.
1 |
| 101
1
23
18
25
1
3
39613.94
2006
.
1 |
| 101
3
3
14
13
1
4
85210.34
2006
.
1 |
| 101
3
3
14
14
1
5
41575.87
2006
.
1 |
|-----------------------------------------------------------------------------------------|
| 101
3
3
14
15
1
3
29575.11
2006
.
1 |
+-----------------------------------------------------------------------------------------+

Cu 6. Lnh Merge
Bn tng Merge 2
h. By gi bn th
cp h th s ra
x/phng s c cng

file cp h h, c nhn c nhn, c nhn


suy ngh xem ghp 1 file cp x/phng vo file
sao? Trong nhiu bin, nhng h cng mt
nhng thng tin ca x/phng phi khng?

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

39

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Ph lc 3. M hnh Logit
1. D liu
D liu c trch t VHLSS2006 v cc thnh vin trong h t 15 tui tr ln. Trong d liu c mt
s bin sau:
Tn bin
covieclam
thanhthi

Gii thch
C vic lm trong 12
thng qua
Thnh th/nng thn

Daihoc, caodang, thpt, thcs,


tieuhoc

Cc bin gi phn nh
bng cp cao nht
t c

kinhhoa

Dn tc kinh, hoac

quymoho
Tuoi
thunhapbq
Kvlv5

Quy m h
Tui
Thu nhp bnh
qun/ngi/h/thng
Khu vc lm vic 5

Kvlv3

Khu vc lm vic 3

Vung1, Vung2, Vung8

Cc bin gi m ho
cc vng trong c
nc
Cc bin gi phn th
hin min Bc, min
Trung&ty nguyn,
min nam
Trng s c nhn

bac, trung, nam

hhszwt

. count
29360

Ghi ch
1: C
0: Khng
1: Thnh th
0: Nng thn
- Cc bin ny c m ho li t
bin m2ac3a
- Bin gi tham chiu l khng c
bng cp
1: Dn tc Kinh, hoc Hoa
0: Dn tc khc
S ngi trong h (ngi)
Ngn
1: T lm cho gia nh
2: Lm cho h khc
3: Kinh t nh nc tp th
4: Kinh t t nhn
5: Kinh t c vn u t nc
ngoi
1: T lm cho gia nh, lm cho
h khc, hoc kinh t t nhn
2: Kinh t nh nc tp th
3: Kinh t c vn u t nc
ngoi
Cc bin ny c m ho li t
bin Reg8
Cc bin ny c m ho li t
bin Reg8

/* Tng s quan st*/

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

40

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

M t mt s bin
. tab thanhthi
thanh thi: |
1 thanh |
thi, 0 nong |
thon |
Freq.
Percent
Cum.
------------+----------------------------------0 |
21,846
74.41
74.41
1 |
7,514
25.59
100.00
------------+----------------------------------Total |
29,360
100.00
. tab m2ac3a
3.Bng cp |
cao nht - |
GDPT |
Freq.
Percent
Cum.
------------+----------------------------------K0 bng cp |
188
0.83
0.83
Tiu hc |
7,266
32.20
33.03
THCS |
8,983
39.81
72.84
THPT |
4,954
21.95
94.80
Cao ng |
345
1.53
96.33
i hc |
801
3.55
99.88
Thc s |
22
0.10
99.97
Tin s |
6
0.03
100.00
------------+----------------------------------Total |
22,565
100.00
. tab kinhhoa
kinhhoa |
Freq.
Percent
Cum.
------------+----------------------------------0 |
4,798
16.34
16.34
1 |
24,562
83.66
100.00
------------+----------------------------------Total |
29,360
100.00
. tab vung
vung |
Freq.
Percent
Cum.
------------+----------------------------------1 |
5,866
19.98
19.98
2 |
4,300
14.65
34.63
3 |
1,468
5.00
39.63
4 |
3,143
10.71
50.33
5 |
2,734
9.31
59.64
6 |
1,869
6.37
66.01
7 |
3,912
13.32
79.33
8 |
6,068
20.67
100.00
------------+----------------------------------Total |
29,360
100.00
. tab gioi
gioi 1.nam |
0 nu |
Freq.
Percent
Cum.
------------+----------------------------------0 |
15,121
51.50
51.50
1 |
14,239
48.50
100.00
------------+----------------------------------Total |
29,360
100.00

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

41

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

. tab thanhthi
thanh thi: |
1 thanh |
thi, 0 nong |
thon |
Freq.
Percent
Cum.
------------+----------------------------------0 |
21,846
74.41
74.41
1 |
7,514
25.59
100.00
------------+----------------------------------Total |
29,360
100.00
. mean tuoi quymoho thunhapbq
Mean estimation

Number of obs

29360

-------------------------------------------------------------|
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------tuoi |
38.50232
.1031738
38.30009
38.70454
quymoho |
4.798774
.0105579
4.77808
4.819468
thunhapbq |
693.8959
4.09364
685.8722
701.9196
-------------------------------------------------------------. tab kvlv5
kvlv5 |
Freq.
Percent
Cum.
------------+----------------------------------1 |
15,350
68.63
68.63
2 |
3,280
14.67
83.30
3 |
2,390
10.69
93.99
4 |
978
4.37
98.36
5 |
367
1.64
100.00
------------+----------------------------------Total |
22,365
100.00
. tab kvlv3
kvlv3 |
Freq.
Percent
Cum.
------------+----------------------------------1 |
19,608
87.67
87.67
2 |
2,390
10.69
98.36
3 |
367
1.64
100.00
------------+----------------------------------Total |
22,365
100.00

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

42

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

2. c lng mt s m hnh Logit


M hnh 1. Cc yu t nh hng n xc sut c vic lm
a. M hnh khng tnh n trng s
. logit covieclam gioi tuoi kinhhoa daihoc caodang thpt thcs tieuhoc if thanhthi==1
Iteration
Iteration
Iteration
Iteration
Iteration

0:
1:
2:
3:
4:

log
log
log
log
log

likelihood
likelihood
likelihood
likelihood
likelihood

=
=
=
=
=

-4681.615
-4501.1785
-4496.9115
-4496.8878
-4496.8878

Logistic regression

Number of obs
LR chi2(8)
Prob > chi2
Pseudo R2

Log likelihood = -4496.8878

=
=
=
=

7514
369.45
0.0000
0.0395

-----------------------------------------------------------------------------covieclam |
Coef.
Std. Err.
z
P>|z|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
.2749431
.0516756
5.32
0.000
.1736608
.3762254
tuoi |
.0016395
.0015832
1.04
0.300
-.0014636
.0047426
kinhhoa | -.4514802
.1271611
-3.55
0.000
-.7007114
-.202249
daihoc |
1.647956
.1372616
12.01
0.000
1.378929
1.916984
caodang |
1.613607
.223183
7.23
0.000
1.176177
2.051038
thpt |
.3663942
.0857136
4.27
0.000
.1983986
.5343898
thcs |
.3866646
.0846816
4.57
0.000
.2206916
.5526376
tieuhoc |
1.064819
.0914975
11.64
0.000
.8854877
1.244151
_cons |
.4565806
.1548386
2.95
0.003
.1531025
.7600586
------------------------------------------------------------------------------

b. M hnh c tnh n trng s


. logit covieclam gioi tuoi kinhhoa daihoc caodang thpt thcs tieuhoc
> [pw=hhszwt]
(sum of wgt is
Iteration 0:
Iteration 1:
Iteration 2:
Iteration 3:
Iteration 4:

8.1676e+07)
log pseudolikelihood
log pseudolikelihood
log pseudolikelihood
log pseudolikelihood
log pseudolikelihood

= -4726.9682
= -4496.5079
= -4490.0821
= -4490.022
= -4490.022

Logistic regression
Log pseudolikelihood =

if thanhthi==1

-4490.022

Number of obs
Wald chi2(8)
Prob > chi2
Pseudo R2

=
=
=
=

7514
325.64
0.0000
0.0501

-----------------------------------------------------------------------------|
Robust
covieclam |
Coef.
Std. Err.
z
P>|z|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
.3760212
.0597878
6.29
0.000
.2588393
.493203
tuoi | -.0026357
.0020915
-1.26
0.208
-.006735
.0014636
kinhhoa | -.5075088
.1535001
-3.31
0.001
-.8083635
-.206654
daihoc |
1.800663
.1590845
11.32
0.000
1.488863
2.112463
caodang |
1.869919
.2621885
7.13
0.000
1.356039
2.383799
thpt |
.35563
.0987501
3.60
0.000
.1620834
.5491766
thcs |
.3508384
.0968012
3.62
0.000
.1611115
.5405653
tieuhoc |
1.090695
.1079582
10.10
0.000
.8791004
1.302289
_cons |
.5925872
.1868671
3.17
0.002
.2263344
.95884
------------------------------------------------------------------------------

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

43

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

M hnh 2. Cc yu t nh hng n xc sut c vic lm


a. M hnh khng tnh n trng s
. logit covieclam gioi thanhthi tuoi tuoibp kinhhoa daihoc caodang thpt thcs tieuhoc
> kinhhoa
note: kinhhoa dropped because of collinearity
Iteration 0: log likelihood = -16120.212
Iteration 1: log likelihood = -11270.986
Iteration 2: log likelihood = -10853.121
Iteration 3: log likelihood = -10829.292
Iteration 4: log likelihood = -10829.156
Iteration 5: log likelihood = -10829.156
Logistic regression

Number of obs
LR chi2(10)
Prob > chi2
Pseudo R2

Log likelihood = -10829.156

=
=
=
=

29360
10582.11
0.0000
0.3282

-----------------------------------------------------------------------------covieclam |
Coef.
Std. Err.
z
P>|z|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
.4357346
.0350008
12.45
0.000
.3671343
.504335
thanhthi | -.8299765
.039965
-20.77
0.000
-.9083065
-.7516464
tuoi |
.4144831
.005585
74.21
0.000
.4035367
.4254296
tuoibp | -.0047818
.0000652
-73.29
0.000
-.0049097
-.0046539
kinhhoa | -.5583331
.0525236
-10.63
0.000
-.6612775
-.4553888
daihoc |
.2838882
.1264527
2.25
0.025
.0360455
.5317309
caodang |
.4035841
.187377
2.15
0.031
.036332
.7708362
thpt | -.7995127
.0608258
-13.14
0.000
-.9187292
-.6802963
thcs | -.3051377
.0561067
-5.44
0.000
-.4151048
-.1951706
tieuhoc |
.4442288
.0599536
7.41
0.000
.3267219
.5617357
_cons | -5.143131
.1065803
-48.26
0.000
-5.352025
-4.934238

b. M hnh c tnh n trng s


. logit covieclam gioi thanhthi tuoi tuoibp kinhhoa daihoc caodang thpt thcs tieuhoc
> kinhhoa [pw=hhszwt]
(sum of wgt is
2.9721e+08)
note: kinhhoa dropped because of collinearity
Iteration 0:
log pseudolikelihood = -16413.351
Iteration 1:
log pseudolikelihood = -11501.346
Iteration 2:
log pseudolikelihood = -11068.455
Iteration 3:
log pseudolikelihood = -11042.538
Iteration 4:
log pseudolikelihood = -11042.382
Logistic regression
Log pseudolikelihood = -11042.382

Number of obs
Wald chi2(10)
Prob > chi2
Pseudo R2

=
=
=
=

29360
3963.25
0.0000
0.3272

-----------------------------------------------------------------------------|
Robust
covieclam |
Coef.
Std. Err.
z
P>|z|
[95% Conf. Interval]
-------------+---------------------------------------------------------------gioi |
.4702713
.0386441
12.17
0.000
.3945302
.5460124
thanhthi | -.8711648
.0454963
-19.15
0.000
-.960336
-.7819937
tuoi |
.422557
.0072647
58.17
0.000
.4083184
.4367956
tuoibp | -.0049484
.000089
-55.58
0.000
-.0051229
-.0047739
kinhhoa | -.5832457
.0610073
-9.56
0.000
-.7028177
-.4636737
daihoc |
.5034056
.1468915
3.43
0.001
.2155035
.7913077
caodang |
.6192097
.2236963
2.77
0.006
.1807731
1.057646
thpt | -.6663316
.073634
-9.05
0.000
-.8106516
-.5220116
thcs | -.2068623
.0676326
-3.06
0.002
-.3394197
-.0743049
tieuhoc |
.5462198
.0774623
7.05
0.000
.3943964
.6980432
_cons | -5.298014
.1266096
-41.85
0.000
-5.546164
-5.049864
------------------------------------------------------------------------------

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

44

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Ph lc 4. Cu trc lnh trong Stata, vn trng s trong phn tch d liu VHLSS 1
Hnh 4.1

[lnh prefix: ] c php lnh [danh sch bin] [biu thc] [iu kin] [phm vi] [trng s] [ using tn file] [,tu chn]

Trong cu trc lnh, nu mc no t trong 2 du ngoc vung [] tc l khng bt buc phI c mc


ny
C nhng hng dn, c mc t trong du 2 du ngoc nhn <>, mc ny bt buc phi c khi g
lnh.

Prefix:

Mt lnh prefix m bn bit n v thng s dng l by. Bn cn nh khng?

Command: g lnh m bn cn thc hin. Mt s lnh stata cho php vit tt. V d, lnh sum
m bn s dng l vit tt ca lnh summarize. Bn cng c th g tt lnh ny bng ch su

=exp (biu thc)


V d, bn cn to bin tuoi, v tuoibp. Bit rng gen l lnh to mt bin mi
. gen tuoi= m1ac5
. gen tuoibp= m1ac5^2

Nhng kt qu ph lc 4 c tnh ton trn VHLSS 2006

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

45

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Varlist (danh sch bin): ch ra danh sch bin chu tc ng ca cu lnh. Nhng nu khng
c bin no c ch ra th lnh Stata s c tc ng ln tt c cc bin.

If (iu kin)

Stata ch thc hin cu lnh i vi cc quan st m c kt qu ca biu thc so snh trong iu kin if
l ng.
V d: m s ngi TPHCM; m s ngi Nng v TPHCM v
. count if tinh==701
1257
. count if tinh==501 | tinh==701
1759
V d: To bng tn s cho bin loi trng hc
. tab m2ac4 if urban06==1
4.Loi |
trng |
TN |
Freq.
Percent
Cum.
------------+----------------------------------Cng lp |
6,919
95.49
95.49
Bn cng |
187
2.58
98.07
Dn lp |
76
1.05
99.12
T thc |
52
0.72
99.83
Khc |
12
0.17
100.00
------------+----------------------------------Total |
7,246
100.00

- Ch rng khi so snh bng, chng ta s dng 2 du =, tc l == (sau lnh if). Cn mc trn, khi
to bin tui, trong php gn, chng ta g gen tuoi= m1ac5

using filename

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

46

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Trong lnh merge, bn tng s dng mc [using filename]

In range (phm vi)

Ch ra phm vi cc quan st chu tc ng bi cu lnh 2


. tab m4ac1a
1A. Lm |
nhn lng |
cng |
Freq.
Percent
Cum.
------------+----------------------------------C |
9,447
26.11
26.11
Khng |
26,728
73.89
100.00
------------+----------------------------------Total |
36,175
100.00

. tab m4ac1a in 100

/*to bng tn s cho bin m4ac1a cho quan st th 100, chnh bng ga
tr ca bin ny ti quan st th 100*/

1A. Lm |
nhn lng |
cng |
Freq.
Percent
Cum.
------------+----------------------------------Khng |
1
100.00
100.00
------------+----------------------------------Total |
1
100.00
. tab m4ac1a in 100/1000

/*to bng tn s cho bin m4ac1a cho cc quan st t th 100 n 1000 */

1A. Lm |
nhn lng |
cng |
Freq.
Percent
Cum.
------------+----------------------------------C |
281
34.10
34.10
Khng |
543
65.90
100.00
------------+----------------------------------Total |
824
100.00
. tab m4ac1a in f/100

/*to bng tn s cho bin m4ac1a cho cc quan st t th 1 n 100 */

1A. Lm |
nhn lng |
cng |
Freq.
Percent
Cum.
------------+----------------------------------C |
42
45.16
45.16
Khng |
51
54.84
100.00
------------+----------------------------------Total |
93
100.00
. tab m4ac1a in 100/l

/*to bng tn s cho bin m4ac1a cho cc quan st t th


100 n quan st cui cng */

1A. Lm |
nhn lng |
cng |
Freq.
Percent
Cum.
------------+----------------------------------C |
9,405
26.06
26.06
Khng |
26,678
73.94
100.00
------------+----------------------------------Total |
36,083
100.00
2

Xem thm Ph lc 2

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

47

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Weight (trng s)

Cho php cc php phn tch c s dng n trng s (hay quyn s)


Khi phn tch VHLSS cn s dng trng s nu bn mun c lng cc tham s thng k cho tng
th. Trong VHLSS2006, v tng t VHLSS2008 c 2 bin lu trng s.
wt9: trng s h (khi s dng d liu mu kho st thu nhp v chi tiu vi c mu 9189 h)
hhszwt: trng s c nhn
Hai bin trn c quan h nh sau: hhszwt=hhsize*wt9
Vi hhsize l tng s ngi trong h
Hnh 4.2

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

48

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

V d: Bin reg8 lu tr thng tin v vng. C 8 vng trong c nc. Theo bn lm sao bit c
reg8=1 l tng ng vi Vng no? (Hy xem trong sheet tinh ca file excel Muc1.xls)
tab reg8
reg8 |
Freq.
Percent
Cum.
------------+----------------------------------1 |
7,433
19.02
19.02
2 |
5,698
14.58
33.61
3 |
2,163
5.54
39.14
4 |
4,337
11.10
50.24
5 |
3,634
9.30
59.55
6 |
2,848
7.29
66.83
7 |
5,134
13.14
79.97
8 |
7,824
20.03
100.00
------------+----------------------------------Total |
39,071
100.00
tab reg8 [aw= hhszwt]
reg8 |
Freq.
Percent
Cum.
------------+----------------------------------1 |7,544.17409
19.31
19.31
2 | 4,442.9992
11.37
30.68
3 |1,437.84886
3.68
34.36
4 | 5,212.3526
13.34
47.70
5 | 3,285.1965
8.41
56.11
6 | 2,708.8742
6.93
63.04
7 | 6,573.3225
16.82
79.87
8 | 7,866.2321
20.13
100.00
------------+----------------------------------. sum m4ac11
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+-------------------------------------------------------m4ac11 |
7091
11426.16
10756.02
180
480000
. sum m4ac11 [aw=hhszwt]
Variable |
Obs
Weight
Mean
Std. Dev.
Min
Max
-------------+----------------------------------------------------------------m4ac11 |
7091 75166467.1
11798.71
10781.87
180
480000
Khi phn tch VHLSS, vi lnh hi quy, bn dng ch pw thay cho ch aw khai bo
trng s

options (Cc tu chn)

Nhiu cu lnh trong STATA cho php c cc tu chn ring, cc tu chn ny ch


c ch c ch ra sau du phy (du ,).
V d: tu chn detail ca lnh sum

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

49

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

sum m4ac11 [aw=hhszwt], detail


11. Tin lng, tin cng
------------------------------------------------------------Percentiles
Smallest
1%
900
180
5%
2000
210
10%
3000
210
Obs
7091
25%
5500
225
Sum of Wgt. 75166467.1
50%
75%
90%
95%
99%

9600
15000
22583
30000
44400

Largest
120000
150000
156000
480000

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Mean
Std. Dev.

11798.71
10781.87

Variance
Skewness
Kurtosis

1.16e+08
12.99603
498.5566

Ghi ch bi ging

50

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Ph lc 5. Kiu d liu, mt s lnh, hm ton hc, ton t thng dng


Hm ton hc (Mathematic Functions)
Cu lnh

Din gii

abs(x)

Gi tr tuyt i (Absolute value)

sin(x), cos(x), tan(x)

Sin, cos, tg

int(x), round(x)

Ly s nguyn/lm trn s

exp(x)

Hm m Exponential function

ln(x)

Logarit t nhin (Natural logarithm)

logit(x), invlogit(x)

Log ca t l odd v nghch o ca n

max(x), min(x)

GT ln nht v nh nht

sqrt(x)

Cn bc (Square root)

sum(x)

Tng cng

Cc lnh thng dng v qun l d liu (Data Management)


Ch :

Trong biu thc du == c dng cho vic kim nh biu thc so


snh, thng c dng sau lnh if. Cn du = c dng cho php
gn, v d trong lnh to bin mi

des, save, edit

M t bin, Lu tr, chnh sa d liu

gen, xtile, replace,

To bin mi, to bin phn nhm cho mt bin no theo

recode

phn v, thay th gi tr, m ho li bin

keep, drop

Gi li/ xo bin hay cc quan st

label, format

To nhn cho bin, to nh dng d liu ca bin

append, merge

Ni cc quan st, ni cc bin t nhng file khc nhau

rename

i tn bin

sort, order, move

Sp xp cc quan st theo th t, sp xp bin, di chuyn bin

egen; collapse

To bin mi; thu gn d liu

Phn tch hi quy (Regression Analysis)


Cu lnh

Din gii

correlate, regress

Tng quan, hi quy vi OLS

Logit, Mlogit

M hnh Binary logistic (m hnh logit), m hnh Mlogit

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

51

Chng trnh ging dy Kinh t Fulbright

Lp MPP3 hc k Thu 2010

Cc ton t trong Stata


K hiu

ngha

S hc
+

Cng

Tr

Nhn

Chia

Lu tha

>

Ln hn

<

Nh hn

>=

Ln hn hoc bng

<=

Nh hn hoc bng

==

Bng

~=

Khng bng (khc)

!=

Khng bng (khc)

Khng

Hoc

&

Quan h

Lgc

Ch :
Trong biu thc du == c dng cho vic kim nh biu thc so snh, thng c dng sau lnh if. Cn du
= c dng cho php gn, v d trong lnh to bin mi

Kiu d liu (Data Types)


Dng

Hnh thc

Din gii

float

S thc

-1.7x1038 n 1.7x1036

double

S thc

-8.9x10307 n 8.9x10307

byte

S nguyn

-127 ~ 100

int

S nguyn

-32767 ~ 32740

long

S nguyn

-2,147,483,647 ~ 2,147,483,620

str#

Chui (dng text)

str1 n str244

Nguyn Khnh Duy, email: khanhduy@ueh.edu.vn

Ghi ch bi ging

52

You might also like