Data Entry and Processing
Course: Economic Research Methods, Faculty of Development Economics, University of Economics Ho Chi Minh City
Research report
Decision making
Determining the measurement scale of variables
Motorbike Names

Name                Frequency   Percent   Valid Percent   Cumulative Percent
Honda AirBlade             10      10.0            10.0                 10.0
Honda Future Neo            8       8.0             8.0                 18.0
Yamaha Sirius               7       7.0             7.0                 25.0
Yamaha Jupiter             13      13.0            13.0                 38.0
Honda Wave                 24      24.0            24.0                 62.0
Yamaha Cygnus               4       4.0             4.0                 66.0
SYM Attila                 11      11.0            11.0                 77.0
Honda Dream                 6       6.0             6.0                 83.0
Honda @                     7       7.0             7.0                 90.0
Others                     10      10.0            10.0                100.0
Total                     100     100.0           100.0

[Figures: charts of Motorbike Names showing the percentages above]
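As a rough illustration of how the Percent and Cumulative Percent columns of a frequency table are derived, the sketch below recomputes them from the counts of the motorbike brands. It uses only the Python standard library; the variable names are illustrative and not part of the original material.

```python
from itertools import accumulate

# Counts copied from the frequency table of motorbike brands (100 respondents).
counts = {
    "Honda AirBlade": 10, "Honda Future Neo": 8, "Yamaha Sirius": 7,
    "Yamaha Jupiter": 13, "Honda Wave": 24, "Yamaha Cygnus": 4,
    "SYM Attila": 11, "Honda Dream": 6, "Honda @": 7, "Others": 10,
}

total = sum(counts.values())
percents = [100 * c / total for c in counts.values()]
cumulative = list(accumulate(percents))  # running total of the percentages

for name, pct, cum in zip(counts, percents, cumulative):
    print(f"{name:<18}{counts[name]:>4}{pct:>7.1f}{cum:>7.1f}")
```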
A histogram is a conventional way to display ratio or interval data. It groups the values of a variable into intervals and is drawn as a series of bars that represent the data values.
A histogram is especially useful for: (1) displaying all the intervals in a distribution, and (2) examining the shape of the distribution, such as its skewness and kurtosis.
Note: a histogram cannot be used for nominal variables.
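The grouping of values into intervals can be sketched in a few lines. The ages and the interval width of 10 below are invented for illustration only; they are not the survey data.

```python
from collections import Counter

# Hypothetical ages, invented for illustration only.
ages = [18, 22, 25, 31, 34, 38, 41, 42, 47, 53, 58, 64, 76]
bin_width = 10  # assumed interval width

# Map each value to the lower edge of its interval, e.g. 34 -> 30.
bins = Counter((age // bin_width) * bin_width for age in ages)

# Print one text bar per interval.
for lower in sorted(bins):
    print(f"{lower}-{lower + bin_width - 1}: {'#' * bins[lower]}")
```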
Each row of a stem-and-leaf plot is called a stem, and each value shown on a stem is called a leaf. When a stem-and-leaf plot is rotated 90° to the left, its shape resembles a histogram.
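A minimal sketch of a stem-and-leaf display, assuming two-digit values so that the tens digit is the stem and the units digit is the leaf. The ages are hypothetical.

```python
from collections import defaultdict

# Hypothetical ages, invented for illustration only.
ages = [18, 22, 25, 31, 34, 38, 41, 42, 47, 53, 58, 64, 76]

stems = defaultdict(list)
for age in sorted(ages):
    stem, leaf = divmod(age, 10)  # tens digit is the stem, units digit the leaf
    stems[stem].append(leaf)

for stem in sorted(stems):
    print(f"{stem} | {''.join(str(leaf) for leaf in stems[stem])}")
```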
A box plot, also called a box-and-whisker plot, gives another visual picture of the location, spread, shape, tail length, and outliers of a distribution. It summarizes five statistics of a distribution: the median, the upper and lower quartiles, and the largest and smallest observed values.
The main components of a box plot are:
- The rectangular box contains 50% of the data values.
- The line in the middle of the box is the median.
- The two edges of the box are the first and third quartiles (the 25th and 75th percentiles of the data).
- The whiskers extend from the upper and lower edges of the box to the largest and smallest values that lie within 1.5 times the interquartile range of the box edges.
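The components listed above can be computed directly. The sketch below uses hypothetical data and the "inclusive" quartile method of Python's statistics module; SPSS may use a different quartile definition, so its box edges can differ slightly.

```python
import statistics

# Hypothetical number of days of use, invented for illustration only.
days = [2, 18, 19, 20, 21, 22, 23, 24, 40]

q1, median, q3 = statistics.quantiles(days, n=4, method="inclusive")
iqr = q3 - q1

# Fences: values beyond 1.5 * IQR from the box edges are treated as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [d for d in days if lower_fence <= d <= upper_fence]
outliers = [d for d in days if d < lower_fence or d > upper_fence]

# Whiskers end at the most extreme values that are still inside the fences.
whisker_low, whisker_high = min(inside), max(inside)
print(q1, median, q3, whisker_low, whisker_high, outliers)
```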
Figure 5.4 Box plots of the variables Age of motorbike user and Number of used days in a month (N = 100 for each variable)
Using Excel: the Descriptive Statistics tool in Data Analysis. Using SPSS: the Frequencies, Descriptives, and Explore tools under Descriptive Statistics.
Descriptive statistics describe the central tendency, the variability, and the shape of the distribution of the data.
Measures of Variability
- Variance (σ²) is the mean of the squared deviations of the observations from the mean.
- Standard deviation (SD; σ) measures the dispersion of the data around the mean.
- Standard error of the mean (s.e.) measures the range within which the population mean (μ) is likely to lie, at a given probability, based on the sample mean.
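The three measures can be sketched as follows. The sample is hypothetical, and the sample variance (dividing by n - 1) is assumed here, as statistical packages usually report it. The last lines cross-check the standard error against the age statistics reported further down (Std. Deviation 14.42 with N = 100).

```python
import math
import statistics

# Hypothetical sample, invented for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(sample)
variance = statistics.variance(sample)  # sample variance, divides by n - 1
sd = math.sqrt(variance)
se = sd / math.sqrt(len(sample))        # standard error of the mean

# Cross-check with the reported age statistics: SD = 14.42, N = 100.
reported_se = 14.42 / math.sqrt(100)
print(mean, round(variance, 3), round(sd, 3), round(se, 3), round(reported_se, 2))
```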
Figure 5.11 Left-skewed and right-skewed distributions compared with the normal distribution
Kurtosis measures how peaked or flat a distribution is relative to the normal distribution (which has a kurtosis of 0). A distribution is peaked when kurtosis is positive and flat when kurtosis is negative. For a normal distribution, both skewness and kurtosis equal 0. The ratio of the skewness or kurtosis value to its standard error can be used to judge whether the distribution is normal: when this ratio is less than -2 or greater than +2, the distribution is not normal.
Descriptive Statistics

                  Statistic   Std. Error
Minimum                  18
Maximum                  76
Mean                  39.01         1.44
Std. Deviation        14.42
Variance            207.909
Skewness               .242         .241
Kurtosis              -.948         .478
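The rule of thumb for normality (ratio of the statistic to its standard error within -2 to +2) can be applied to the skewness and kurtosis values just reported. The function name below is illustrative.

```python
# Values copied from the descriptive statistics reported above.
skewness, se_skewness = 0.242, 0.241
kurtosis, se_kurtosis = -0.948, 0.478


def within_normal_range(statistic: float, std_error: float, limit: float = 2.0) -> bool:
    """Rule of thumb from the text: the ratio should lie between -limit and +limit."""
    return abs(statistic / std_error) <= limit


skew_ratio = skewness / se_skewness
kurt_ratio = kurtosis / se_kurtosis
print(round(skew_ratio, 2), within_normal_range(skewness, se_skewness))
print(round(kurt_ratio, 2), within_normal_range(kurtosis, se_kurtosis))
```

Both ratios fall inside the interval, so by this rule the distribution is not flagged as non-normal, although the kurtosis ratio is close to the lower limit.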
Table 5.7 Descriptive statistics of Age of motorbike user and Number of used days in a month, by user gender

                                 Age of motorbike user     Number of used days in a month
                                 Statistic   Std. Error    Statistic   Std. Error
User gender: male
Mean                             39.39       1.97          19.76       1.01
95% CI for Mean, Lower Bound     35.45                     17.74
95% CI for Mean, Upper Bound     43.33                     21.79
5% Trimmed Mean                  38.87                     19.90
Median                           42.00                     21.00
Variance                         228.173                   60.460
Std. Deviation                   15.11                     7.78
Minimum                          18                        5
Maximum                          76                        32
Range                            58                        27
Interquartile Range              28.00                     15.00
Skewness                         .292        .311          -.175       .311
Kurtosis                         -.932       .613          -1.271      .613
User gender: female
Mean                             38.46       2.11          20.71       1.07
95% CI for Mean, Lower Bound     34.19                     18.54
95% CI for Mean, Upper Bound     42.74                     22.88
5% Trimmed Mean                  38.13                     20.95
Median                           41.00                     22.00
Variance                         183.205                   47.212
Std. Deviation                   13.54                     6.87
Minimum                          19                        7
Maximum                          65                        30
Range                            46                        23
Interquartile Range              23.00                     11.00
Skewness                         .118        .369          -.513       .369
Kurtosis                         -1.089      .724          -.838       .724
Motorbike Names by User gender (crosstabulation)

                                   Honda      Honda        Yamaha
                                   AirBlade   Future Neo   Sirius    ...   Others    Total
female  Count                      3          4            3               4         41
        Expected Count             4.1        3.3          2.9             4.1       41.0
        % within User gender       7.3%       9.8%         7.3%            9.8%      100.0%
        % within Motorbike Names   30.0%      50.0%        42.9%           40.0%     41.0%
        % of Total                 3.0%       4.0%         3.0%            4.0%      41.0%
male    Count                      7          4            4               6         59
        Expected Count             5.9        4.7          4.1             5.9       59.0
        % within User gender       11.9%      6.8%         6.8%            10.2%     100.0%
        % within Motorbike Names   70.0%      50.0%        57.1%           60.0%     59.0%
        % of Total                 7.0%       4.0%         4.0%            6.0%      59.0%
Total   Count                      10         8            7               10        100
        Expected Count             10.0       8.0          7.0             10.0      100.0
        % within User gender       10.0%      8.0%         7.0%            10.0%     100.0%
        % within Motorbike Names   100.0%     100.0%       100.0%          100.0%    100.0%
        % of Total                 10.0%      8.0%         7.0%            10.0%     100.0%
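The Expected Count rows of the crosstab follow from the marginal totals: under independence, the expected count of a cell is its row total times its column total divided by N. The sketch below reproduces the expected counts for the brand columns shown in the crosstab; the dictionary names are illustrative.

```python
# Marginal totals copied from the crosstab of user gender by motorbike brand.
row_totals = {"female": 41, "male": 59}
col_totals = {"Honda AirBlade": 10, "Honda Future Neo": 8,
              "Yamaha Sirius": 7, "Others": 10}
n = 100

# Expected count under independence = row total * column total / N.
expected = {
    (gender, brand): row_totals[gender] * col_totals[brand] / n
    for gender in row_totals
    for brand in col_totals
}

for (gender, brand), value in expected.items():
    print(f"{gender:<7}{brand:<18}{value:.1f}")
```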
Specific purpose
Compare groups
Degree of association, related variables
Association
Associational statistics (e.g., correlation, regression)
Summarize data
Description
Type of statistic
H0: μ_male = μ_female
H1: μ_male ≠ μ_female
H0: GM = 0
H1: GM ≠ 0
Most statistical software reports results together with probability values (p values). The p value is the probability of obtaining a result at least as extreme as the one actually observed, given that the hypothesis H0 is true.
The p value is compared with the significance level (α), and the hypothesis is rejected or not rejected on that basis. If the p value is smaller than the significance level, the hypothesis is rejected (p value < α, reject H0). If the p value is equal to or larger than the significance level, the hypothesis is not rejected (p value ≥ α, do not reject H0).
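The decision rule can be written as a one-line function. This is only a sketch; the function name and the default α = 0.05 are assumptions for illustration.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


print(decide(0.946))  # a large p value
print(decide(0.010))  # a small p value
print(decide(0.050))  # p equal to alpha: H0 is not rejected
```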
There are two types of tests: parametric and nonparametric. Parametric tests are powerful tools because they work with scale data (interval, ratio). Nonparametric tests work with nominal and ordinal data.
k-Samples Tests
Related Samples - Cochran Q; Independent Samples - χ² for k samples
Nominal
Ordinal
- T-test - Z test
- Repeated-measures ANOVA
a. Excel: the Correlation, Anova, and Regression tools in Data Analysis
b. SPSS: the Compare Means and Nonparametric Tests tools
5.7 Some specific applications
1. One-Sample T Test
Example 1 (Parametric test)
Data are available on the sales growth rates of 9 firms. The benchmark growth rate is 6.5% per year. Hypothesis: the mean sales growth rate of the 9 firms does not differ from the benchmark rate (6.5% per year).
Analyze > Compare Means > One-Sample T Test (WHY?)
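The calculation behind a one-sample t test can be sketched with the standard library. The nine growth rates below are hypothetical, since the firms' data are not listed here; only the test value of 6.5 comes from the example. The critical value 2.306 is the two-tailed 5% point of the t distribution with 8 degrees of freedom.

```python
import math
import statistics

# Hypothetical growth rates (% per year) for 9 firms, invented for illustration.
growth = [5.0, 6.0, 7.0, 8.0, 6.5, 7.5, 5.5, 9.0, 8.5]
benchmark = 6.5     # test value from the example
t_critical = 2.306  # two-tailed 5% critical value, t distribution, df = 8

n = len(growth)
mean = statistics.mean(growth)
se = statistics.stdev(growth) / math.sqrt(n)  # standard error of the mean
t_stat = (mean - benchmark) / se
reject_h0 = abs(t_stat) > t_critical

print(round(mean, 2), round(se, 3), round(t_stat, 3), reject_h0)
```

With these invented numbers the t statistic stays below the critical value, so H0 would not be rejected; real data may of course lead to a different conclusion.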
2. One-Sample Chi-Square Test
Example 2 (Nonparametric test)
Motorbike use survey data. Hypothesis H0: all motorbike brands have an equal chance of being chosen by users.
Analyze > Nonparametric Tests > Chi-Square
There are 100 observations and 10 motorbike brands. Each brand has a 10% chance of being chosen, so the expected count is 10 motorbikes per brand. However, the differences between the observed N and the expected N for individual brands are large.
With a p value < 0.05, we reject H0 and conclude that users do not choose the motorbike brands equally.
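The chi-square statistic for this example can be recomputed from the brand frequencies reported earlier, with an expected count of 10 per brand. The sketch compares the statistic with 16.919, the 5% critical value of the chi-square distribution with 9 degrees of freedom, instead of computing an exact p value.

```python
# Observed counts per brand, copied from the frequency table of motorbike brands.
observed = [10, 8, 7, 13, 24, 4, 11, 6, 7, 10]
expected = sum(observed) / len(observed)  # 100 / 10 = 10 per brand under H0

chi2 = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1
chi2_critical = 16.919  # 5% critical value, chi-square distribution, df = 9
reject_h0 = chi2 > chi2_critical

print(round(chi2, 2), df, reject_h0)
```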
3. Two-Sample T Test
Independent Samples Test: Age of motorbike user

                                             Equal variances   Equal variances
                                             assumed           not assumed
Levene's Test for Equality of Variances
  F                                          1.239
  Sig.                                       .268
t-test for Equality of Means
  t                                          -.315             -.321
  df                                         98                91.785
  Mean Difference                            -.93              -.93
  Std. Error Difference                      2.95              2.89
  95% Confidence Interval of the Difference
    Lower                                    -6.77             -6.66
    Upper                                    4.92              4.81
The p values (Sig. (2-tailed)) are far above α = 0.05. We do not reject the hypothesis and conclude that there is no difference in mean age between male and female motorbike users.
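The t statistics, standard errors, and Welch degrees of freedom in the SPSS output can be approximately reproduced from the summary statistics by gender reported earlier (means and variances of age), taking the group sizes 59 (male) and 41 (female) from the crosstab totals. Small differences from the SPSS figures are expected because the inputs are rounded.

```python
import math

# Summary statistics for age, copied from the descriptive statistics by gender.
n_m, mean_m, var_m = 59, 39.39, 228.173  # male
n_f, mean_f, var_f = 41, 38.46, 183.205  # female
diff = mean_f - mean_m

# Equal variances assumed: pooled variance with n_m + n_f - 2 degrees of freedom.
pooled_var = ((n_m - 1) * var_m + (n_f - 1) * var_f) / (n_m + n_f - 2)
se_pooled = math.sqrt(pooled_var * (1 / n_m + 1 / n_f))
t_pooled = diff / se_pooled

# Equal variances not assumed (Welch), with Welch-Satterthwaite degrees of freedom.
a, b = var_m / n_m, var_f / n_f
se_welch = math.sqrt(a + b)
t_welch = diff / se_welch
df_welch = (a + b) ** 2 / (a ** 2 / (n_m - 1) + b ** 2 / (n_f - 1))

print(round(se_pooled, 2), round(t_pooled, 3))
print(round(se_welch, 2), round(t_welch, 3), round(df_welch, 2))
```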
4. Two-Sample Nonparametric Test
Mann-Whitney Test
Test Statistics (grouping variable: User gender)
                          Motorbike Names
Mann-Whitney U            1200.000
Wilcoxon W                2970.000
Z                         -.067
Asymp. Sig. (2-tailed)    .946
Conclusion: we do not reject the hypothesis and conclude that male and female users choose motorbike brands in the same way.
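As a sketch of what the Mann-Whitney U statistic measures, the function below counts, for two small hypothetical samples, the pairs in which a value from one group exceeds a value from the other (ties count one half). The second part is an arithmetic consistency check on the reported output, assuming group sizes of 41 and 59 from the crosstab totals and a normal approximation without tie correction.

```python
import math


def mann_whitney_u(x, y):
    """U for sample x: number of (x, y) pairs with x > y, ties counted as 0.5."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


# Hypothetical ordinal scores, invented for illustration only.
group_a = [1, 3, 5, 5, 8]
group_b = [2, 4, 5, 9]
u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)  # u_a + u_b equals len(a) * len(b)

# Consistency check on the reported output (U = 1200, W = 2970, Z = -.067).
n_small, n_large, u_reported = 41, 59, 1200.0
w = u_reported + n_large * (n_large + 1) / 2
sigma = math.sqrt(n_small * n_large * (n_small + n_large + 1) / 12)
z = (u_reported - n_small * n_large / 2) / sigma

print(u_a, u_b, w, round(z, 3))
```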
5. One-Way ANOVA (Parametric Test)
Analysis of variance (ANOVA) is the statistical method for testing the hypothesis that the means of several populations are equal. One-way ANOVA uses a single-factor, fixed-effects model to compare the effect of a treatment or a factor on a continuous dependent variable.
Example 5. Motorbike use survey data. Hypothesis: there is no difference in the mean number of days of use per month among motorbike users in different age groups.
Analyze > Compare Means > One-Way ANOVA
ANOVA: Number of used days in a month

                  Sum of Squares   df   Mean Square   F       Sig.
Between Groups    1428.944          5   285.789       6.737   .000
Within Groups     3987.806         94   42.423
Total             5416.750         99
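The Mean Square and F columns of the ANOVA output follow directly from the sums of squares and degrees of freedom. A short check, using only the reported figures:

```python
# Sums of squares and degrees of freedom copied from the ANOVA output.
ss_between, df_between = 1428.944, 5
ss_within, df_within = 3987.806, 94

ms_between = ss_between / df_between  # mean square between groups
ms_within = ss_within / df_within     # mean square within groups
f_stat = ms_between / ms_within

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
```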
P value < 0.05. Conclusion: we reject the hypothesis and conclude that the mean number of days of use per month differs among motorbike users in different age groups.

Number of used days in a month: homogeneous subsets (subset for alpha = .05)

Tukey HSD (a,b)
Age groups       N    1       2       3
under 60         19   14.47
under 50         25   17.96   17.96
under 20          6   18.33   18.33
under 30         26           22.62   22.62
under 40         17           24.12   24.12
older than 60     7                   26.14
Sig.                  .695    .198    .769

Duncan (a,b)
Age groups       N    1       2       3
under 60         19   14.47
under 50         25   17.96   17.96
under 20          6   18.33   18.33
under 30         26           22.62   22.62
under 40         17                   24.12
older than 60     7                   26.14
Sig.                  .175    .101    .215

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 12.013.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
Figure. Mean number of days of motorbike use per month by age of user
Value:    14.5  17.9  18.3  22.6  24.1  26.1
Grouping: a     ab    ab    abc   abc   abc
6. Nonparametric Test for k-Independent Samples
Example 6. Motorbike use survey data. Hypothesis: there is no difference in motorbike brand among users in different age groups.
Analyze > Nonparametric Tests > k Independent Samples
Kruskal-Wallis Test
Ranks

Age groups       N     Mean Rank
under 20          6    46.25
under 30         26    49.40
under 40         17    50.62
under 50         25    55.66
under 60         19    45.87
older than 60     7    52.07
Total           100
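The Kruskal-Wallis H statistic can be approximated from the group sizes and mean ranks in the Ranks output. The sketch below omits the tie correction that statistical packages normally apply, so a tie-corrected value would differ somewhat. It compares H with 11.070, the 5% critical value of the chi-square distribution with 5 degrees of freedom.

```python
# Group sizes and mean ranks copied from the Ranks output.
groups = {
    "under 20": (6, 46.25), "under 30": (26, 49.40), "under 40": (17, 50.62),
    "under 50": (25, 55.66), "under 60": (19, 45.87), "older than 60": (7, 52.07),
}

n_total = sum(n for n, _ in groups.values())
grand_mean_rank = (n_total + 1) / 2

# Sanity check: the rank sums should add up to N(N + 1)/2, up to rounding.
rank_sum = sum(n * mean_rank for n, mean_rank in groups.values())

# H without tie correction (an assumption made for this sketch).
h_stat = 12 / (n_total * (n_total + 1)) * sum(
    n * (mean_rank - grand_mean_rank) ** 2 for n, mean_rank in groups.values()
)
chi2_critical = 11.070  # 5% critical value, chi-square distribution, df = 5
reject_h0 = h_stat > chi2_critical

print(round(rank_sum, 2), round(h_stat, 2), reject_h0)
```

On these figures H is far below the critical value, which points toward not rejecting the hypothesis that brand choice is the same across age groups.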