ECO3021S 2013
Weeks 1 and 2
Econometrics
Wooldridge, Appendix B and C
Fundamentals of Probability & Mathematical Statistics
Katherine Eyal
2
Changes
Unlike previous years, we won't be doing the law of iterated expectations, or the proof thereof, so those slides have been taken out of the originals posted on Vula
3
http://www.flickr.com/photos/doug88888/
4
Some Questions of the Week
What is an estimator?
How do we measure bias in an estimator?
What is the difference between a cdf and a pdf?
Does sample size matter for an estimator?
What is a consistent estimator?
What are the formulae for E(X), Cov(X,Y), Var(X)?
5
What is an experiment?
It can be repeated infinitely,
It has well defined outcomes,
Think coin toss, throw of the die, gender of a baby at birth, outcomes of a randomised controlled drug trial, etc.
Each time we perform an experiment (i.e. perform another trial of the experiment), we may get a different outcome
6
What is a Random Variable?
It takes on numerical values,
Its outcome is determined by an experiment
E.G. We toss a coin ten times and count the number of heads we obtain: this is our RV X
X can take on values in {0,1,2,3,4,5,6,7,8,9,10}
Each outcome is the result of a trial of the experiment
7
But you said numerical values; how is a coin toss with heads and tails numeric?
We must assign values to each of our outcomes, like 1 for heads, 0 for tails
Note
Capitals X, Y, Z are used to denote random variables
Smalls denote particular outcomes, x = 2, y = 4
RVs can be discrete or continuous
8
A very common RV is the Bernoulli
Called a binary RV, X ~ Bernoulli(θ)
P(X = 1) = θ
P(X = 0) = 1 - θ
Takes on only the values 0 and 1.
E.G.
P(an airline customer arrives) = 0.80
P(customer doesn't arrive) = 0.20
Is it important to know θ? How can we estimate it?
9
Discrete RV:
Finite or countably infinite number of values
X has k possible values {x_1, x_2, …, x_k}
With Probability(X = x_i) = p_i for each x_i, i.e. probabilities {p_1, p_2, …, p_k}
NB: Each 0 < p_i < 1, Σ p_i = 1
Is a Bernoulli RV discrete?
Do I need to tell you P(X = 0) for a Bernoulli RV?
10
Probability Density Function of X:
f(x_j) = P(X = x_j) = p_j, j = 1, 2, …, k
f(x) is the probability that X takes on the value x.
For RVs X and Y, we have f_X and f_Y
e.g. X is the number of free throws made by a basketball player out of 2 attempts
You're given the pdf:
f(0) = 0.20, f(1) = 0.44, f(2) = 0.36
11
12
A continuous RV is an RV which takes on any real value with zero probability
i.e. a continuous RV can take on so many possible values that the probability that it takes on any one of them is zero.
For a continuous RV, P(X = x) = 0
Is that useful?
Rather talk about the cdf, F(x) = P(X ≤ x)
Cumulative Distribution Function
13
14
The CDF means we can talk about a range of values, and the probability that X lies in them.
P(a ≤ X ≤ b) = F(b) - F(a) = ∫_a^b f(x) dx
= area under the curve of f between a and b
The cdf is the integral of the pdf (continuous RV)
What else do we know?
0 ≤ F(x) ≤ 1
Area under the entire curve of the pdf f is 1.
For X a discrete RV,
CDF = P(X ≤ x_j) = Σ p_i = Σ pdf, over all x_i ≤ x_j
15
E.G.
X is the number of free throws made by a basketball player out of 2 attempts. X can take on {0,1,2}
Going back to the pdf on slide 11,
What is P(X ≥ 1)?
i.e. what is the probability that a player makes at least one free throw?
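A minimal sketch of this calculation, using the free-throw pdf given earlier in the slides. The variable names are illustrative only. The discrete CDF is built as a running sum of the pdf, and P(X ≥ 1) is then 1 - F(0).

```python
from itertools import accumulate

# pdf of X = number of free throws made out of 2 attempts (values from the slides)
pdf = {0: 0.20, 1: 0.44, 2: 0.36}

# The discrete CDF is the running sum of the pdf: F(x_j) = sum of f(x) over x <= x_j
values = sorted(pdf)
cdf = dict(zip(values, accumulate(pdf[v] for v in values)))

# P(X >= 1) = 1 - P(X = 0) = 1 - F(0)
p_at_least_one = 1 - cdf[0]
print(cdf)
print(round(p_at_least_one, 2))
```

So the player makes at least one free throw with probability 0.8.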
16
Properties of CDFs
For any number c, P(X > c) = 1 - F(c) = 1 - P(X < c)
For any numbers a < b,
P(a < X ≤ b) = F(b) - F(a) = P(X < b) - P(X < a)
We use cdfs only for continuous RVs, hence we don't differentiate between < and ≤
17
Joint Distribution
for X, Y, 2 discrete RVs:
f_{X,Y}(x,y) = P(X = x, Y = y)
2 RVs are independent iff
f_{X,Y}(x,y) = f_X(x) f_Y(y)
f_X(x) is called the marginal pdf
18
Independence
If X, Y are independent ⇒ so are G(X) and H(Y)
X_1, X_2, …, X_n discrete RVs
⇒ f(x_1, x_2, …, x_n) = P(X_1 = x_1, X_2 = x_2, …, X_n = x_n)
= f(x_1) f(x_2) … f(x_n)
iff the Xs are independent
In a sequence of Bernoulli trials, are the coin flips independent?
19
Binomial Distribution
Given X = Y_1 + Y_2 + … + Y_n,
where each Y_i has probability of success θ, and the Y_i are independent, then X ~ binomial(n,θ).
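As an illustration, the probabilities for a binomial RV can be sketched with the standard formula P(X = x) = C(n,x) θ^x (1-θ)^(n-x). This formula is assumed here and does not appear in the surviving slide text. The function name and the choice n = 10, θ = 0.5 (the ten-coin-toss example) are for illustration only.

```python
from math import comb

def binomial_pmf(x, n, theta):
    """P(X = x) when X counts successes in n independent Bernoulli(theta) trials."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

n, theta = 10, 0.5  # ten tosses of a fair coin, X = number of heads
pmf = [binomial_pmf(x, n, theta) for x in range(n + 1)]

print(round(sum(pmf), 6))   # the probabilities sum to 1
print(pmf[5])               # P(exactly 5 heads)
```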
20
Conditional Distributions
Bayes Rule
P(Y = y | X = x) = f_{Y|X}(y|x) = f_{X,Y}(x,y) / f_X(x)
i.e. Conditional Y|X = Joint X,Y / marginal X
21
What if X, Y are independent? What can we say about f_{Y|X}(y|x) or f_{X|Y}(x|y)?
P(Y = y | X = x) = f_{Y|X}(y|x)
= f_{X,Y}(x,y) / f_X(x)
= f_X(x) f_Y(y) / f_X(x)
= f_Y(y)
Does this hold for all types of RVs?
22
Expected Value
E(X) = weighted average over outcomes of X
X a discrete RV: E(X) = Σ x_j f(x_j)
X a continuous RV: E(X) = ∫ x f(x) dx
Why is E(X) = θ when X ~ Bernoulli(θ)?
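A short worked sketch of the Bernoulli question, assuming the usual discrete definition of the expected value as a probability-weighted sum over outcomes. The second line applies the same sum to the free-throw pdf from the earlier slides.

```latex
X \sim \text{Bernoulli}(\theta): \quad
E(X) = 0 \cdot P(X=0) + 1 \cdot P(X=1) = 0 \cdot (1-\theta) + 1 \cdot \theta = \theta

\text{Free throws: } \quad
E(X) = 0(0.20) + 1(0.44) + 2(0.36) = 1.16
```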
23
Note
We can get an answer for the expected value which is not one of the outcomes. Is that odd?
24
Properties of Expected Values
g(X) some function of X
X a discrete RV: E[g(X)] = Σ g(x_j) f(x_j)
X a continuous RV: E[g(X)] = ∫ g(x) f(x) dx
25
Properties of Expected Values
X, Y discrete RVs, then g(X,Y) is also an RV
If
Then
26
E(c) = c
Var(X) = 0 iff there is a const c, s.t. P(X = c) = 1,
and if so, then E(X) = c.
Corr(X,Y) = 0 iff Cov(X,Y) = 0
42
COR1: -1 ≤ Corr(X,Y) ≤ 1.
So Corr(X,Y) = 1 ⇒ a perfect positive linear relationship between X and Y, and we can write Y = a + bX with b > 0
Corr(X,X) = Cov(X,X)/σ_X² = Var(X)/Var(X) = 1
COR2:
Correlation is invariant to units of measurement
Corr(a_1 X + b_1, a_2 Y + b_2) = Corr(X,Y) if a_1 a_2 > 0
Corr(a_1 X + b_1, a_2 Y + b_2) = -Corr(X,Y) if a_1 a_2 < 0
43
Properties
Cov(X,Y) measures linear dependence
Corr(X,Y) = 1 ⇒ X, Y perfectly linearly related
i.e. Y = a + bX for some constants a and b > 0
Corr(X,Y) = 0 ⇒ no linear relationship
Remember, there may still be some other, non-linear type of relationship.
44
Variance of Sums
VAR3:
Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X,Y)
If Cov(X,Y) = 0, then
Var(X + Y) = Var(X) + Var(Y)
Var(X - Y) = Var(X) + Var(Y)
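The variance-of-sums formula can be checked numerically on a small example. The sketch below uses a hypothetical joint pdf and hypothetical constants a and b, all made up for illustration. It computes both sides of Var(aX + bY) = a²Var(X) + b²Var(Y) + 2abCov(X,Y) exactly from the joint pdf.

```python
# Hypothetical joint pdf of two discrete RVs X and Y (values chosen for illustration only)
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}

def expect(g):
    """E[g(X, Y)] as a probability-weighted sum over the joint pdf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def var(g):
    """Var[g(X, Y)] = E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

a, b = 2.0, -3.0  # hypothetical constants
cov_xy = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

lhs = var(lambda x, y: a * x + b * y)
rhs = a**2 * var(lambda x, y: x) + b**2 * var(lambda x, y: y) + 2 * a * b * cov_xy
print(round(lhs, 6), round(rhs, 6))
```

Here X and Y are correlated (Cov(X,Y) = -0.1), so the covariance term cannot be dropped.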
45
Variance of Sums
VAR4: Pairwise uncorrelated variables
Given {X_1, X_2, ..., X_n}, with Cov(X_i, X_j) = 0 for all i ≠ j
⇒ Var(a_1 X_1 + ... + a_n X_n) = a_1² Var(X_1) + ... + a_n² Var(X_n)
The variance of the sum is equal to the sum of the variances if all the a_i = 1.
46
For X ~ Binomial(n,θ),
i.e. X = Y_1 + Y_2 + … + Y_n,
where each Y_i is an independent Bernoulli(θ) RV.
What is Var(X)?
You should be able to calculate the above. The only missing piece is that Var(Y_i) = θ - θ² for each Y_i
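One way to work this out, as a sketch. Independence of the Bernoulli terms implies they are pairwise uncorrelated, so the variance of the sum is the sum of the variances.

```latex
\operatorname{Var}(Y_i) = E(Y_i^2) - [E(Y_i)]^2 = \theta - \theta^2 = \theta(1-\theta)

\operatorname{Var}(X) = \operatorname{Var}(Y_1 + \dots + Y_n)
  = \sum_{i=1}^{n} \operatorname{Var}(Y_i) = n\,\theta(1-\theta)
```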
47
Conditional Expectation
E[Y|X=x]
e.g. What is expected income, for those with education = 12 years?
We still do a weighted sum, but the weights are now different: they are the conditional pdf.
We can show E[Y|X] graphically or in a table.
48
Conditional Expectation
Rather use a function to describe it for each level of education.
E.g. E(WAGE|EDUC) = 1.05 + .45 EDUC
Conditional expectations can take on any sort of functional form.
49
50
51
Properties of CE
CE1:
E[c(X)|X] = c(X)
i.e. if we know X, we also know c(X). Functions of X behave as constants when we condition on X.
CE2:
E[a(X)Y + b(X)|X] = a(X)E[Y|X] + b(X)
CE3:
X, Y are independent ⇒ E[Y|X] = E[Y]
E.g. (using CE2) E[XY + 2X²|X] = X E[Y|X] + 2X².
52
If U, X independent, & E(U) = 0,
E(U|X) = E(U) = 0 (by independence, CE3)
(Leave out CE.4-CE.6)
Conditional Variance
Var(Y|X=x) = E[Y²|X=x] - (E[Y|X=x])²
E.g. Var(Savings|Income) = 400 + 0.25 Income
CV1:
If X and Y are independent ⇒ Var(Y|X) = Var(Y)
53
Normal and other distributions...
We rely heavily on normal (Gaussian) distributions
- to simplify probability calculations, by assuming our Xs are normally distributed
- to conduct inference
X ~ Normal(μ,σ²)
Will be given on a formula sheet.
54
55
The normal distribution is symmetric, so median(X) = E(X)
Birth weights, test scores, unemployment rates tend to follow a normal distribution.
Income, price variables tend not to.
We can transform variables to be closer to normal, e.g. by using log(X) instead of X. If log(X) is normally distributed, then X is log normal.
56
Standard Normal: Z ~ Normal(0,1)
Has pdf φ(z) = (1/√(2π)) exp(-z²/2)
CDF denoted Φ(z), where Φ(z) = P(Z ≤ z)
Φ(z): have table of values, no formula
Φ(z): table G1 in text
What is Φ(z) for z ≤ -3.1 or z ≥ 3.1?
57
58
Properties
Φ(z) = P(Z ≤ z)
P(Z > z) = 1 - Φ(z)
P(Z < -z) = P(Z > z)
P(a < Z < b) = Φ(b) - Φ(a)
P(|Z| > c) = P(Z < -c) + P(Z > c)
= 2 P(Z > c)
= 2[1 - Φ(c)]
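These properties can be checked numerically instead of with the printed table. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8+). The points z, a, b and c are hypothetical values picked for illustration.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal
phi = Z.cdf                        # Phi(z) = P(Z <= z)

z, a, b, c = 1.0, -0.5, 1.5, 1.96  # hypothetical points chosen for illustration

p_upper = 1 - phi(z)               # P(Z > z)
p_lower = phi(-z)                  # P(Z < -z), equal to P(Z > z) by symmetry
p_between = phi(b) - phi(a)        # P(a < Z < b)
p_two_tail = 2 * (1 - phi(c))      # P(|Z| > c)

print(round(p_upper, 4), round(p_lower, 4))
print(round(p_between, 4), round(p_two_tail, 4))
```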
59
60
PN1:
If X ~ Normal(μ,σ²), then (X - μ)/σ ~ Normal(0,1)
So if X ~ Normal(3,4), what is P(X ≤ 1)?
P(X ≤ 1)
= P((X - 3)/2 ≤ (1 - 3)/2)
= P(Z ≤ -1)
= P(Z ≥ 1) (by symmetry)
= 1 - P(Z ≤ 1)
= 1 - 0.8413 = 0.1587
Look at ex B.6 on P 740.
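As a cross-check on the table lookup, the same probability can be computed with the standard library's `statistics.NormalDist`. This is a sketch, not part of the course material. Note that Normal(3,4) has variance 4, so the standard deviation passed in is 2.

```python
from statistics import NormalDist

X = NormalDist(mu=3, sigma=2)   # Normal(3, 4): variance 4, so sigma = 2
Z = NormalDist()                # standard normal

direct = X.cdf(1)                    # P(X <= 1) computed directly
standardised = Z.cdf((1 - 3) / 2)    # P(Z <= -1) after standardising

print(round(direct, 4), round(standardised, 4))
```

Both routes give 0.1587, matching the table value.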
61
PN2: X ~ Normal(μ,σ²) ⇒ aX + b ~ Normal(aμ + b, a²σ²).
PN3: Given X and Y jointly normally distributed, X, Y independent iff Cov(X,Y) = 0.
PN4: Any linear combination of independent, identically distributed normal RVs ~ Normal
NB: Y_1, Y_2, ..., Y_n independent RVs,
each Y_i ~ Normal(μ,σ²),
⇒ Y bar = (Y_1 + Y_2 + ... + Y_n)/n ~ Normal(μ, σ²/n).
62
If X ~ Normal(1,9), what is the distribution of Y = 2X + 3? P740.
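One way to check an answer to this exercise is a sketch with `statistics.NormalDist`, which supports multiplying by and adding constants. The linear-transformation rule gives mean 2(1) + 3 and variance 2²(9).

```python
from statistics import NormalDist

X = NormalDist(mu=1, sigma=3)   # Normal(1, 9): variance 9, so sigma = 3
Y = X * 2 + 3                   # NormalDist supports scaling and shifting by constants

print(Y.mean, Y.variance)
```

So Y ~ Normal(5, 36).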
63
Other Distributions
Chi Squared:
Let Z_i, i = 1, 2, ..., n, be independent RVs, each distributed as standard normal, and let X = Z_1² + Z_2² + ... + Z_n²
We write X ~ Chi Squared(n) or X ~ χ²_n
Chi Squared is always non-negative
Not symmetric
Given: E(X) = n, Var(X) = 2n
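A small simulation sketch of these facts, assuming the standard construction of a chi-squared RV as a sum of n squared independent standard normals. The degrees of freedom, number of draws and seed are hypothetical choices for illustration.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(0)      # fixed seed so the run is reproducible
n, draws = 5, 50_000        # hypothetical degrees of freedom and number of simulated draws

# Each draw of X is the sum of n squared independent standard normals
xs = [sum(rng.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(draws)]

# Sample mean and variance should be close to n and 2n respectively
print(round(fmean(xs), 2), round(pvariance(xs), 2))
print(min(xs) >= 0)         # every draw is non-negative
```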
64
65
t distribution
Let Z ~ Normal(0,1), X ~ χ²_n, with Z, X independent, and let T = Z / √(X/n)
T ~ t_n
T is similar to the normal distn, but more spread out; as n gets larger, it approaches the std normal.
Given: E(T) = 0 for n > 1
Var(T) = n/(n-2) for n > 2
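A simulation sketch of the t distribution, assuming the standard construction T = Z / sqrt(X/n) with Z standard normal and X an independent chi-squared(n). The degrees of freedom, number of draws and seed are hypothetical choices for illustration.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(1)       # fixed seed so the run is reproducible
n, draws = 10, 50_000        # hypothetical degrees of freedom and number of simulated draws

def draw_t(n):
    """One draw of T = Z / sqrt(X/n): Z standard normal, X an independent chi-squared(n)."""
    z = rng.gauss(0, 1)
    x = sum(rng.gauss(0, 1) ** 2 for _ in range(n))
    return z / (x / n) ** 0.5

ts = [draw_t(n) for _ in range(draws)]

# Compare the sample mean and variance with 0 and n/(n-2) = 1.25
print(round(fmean(ts), 2), round(pvariance(ts), 2))
```

The sample variance comes out above 1, which is the "more spread out than the normal" property in numbers.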
66
67
F distribution
X_1 ~ χ²_{k1}, X_2 ~ χ²_{k2}, assume X_1, X_2 independent, and let F = (X_1/k1) / (X_2/k2)
F ~ F_{k1,k2}
Order of the degrees of freedom is important (numerator is first).
Note: You have the tools to calc the exp value and variance of the t and chi sq distributions
68
69
Appendix C: Populations, Parameters & Random Sampling
What's a population? And learning? Statistical inference? Sample?
Why only a sample?
How could you take samples? What's a random sample?
What are point and interval estimates?
Do they come out the same no matter which sample is drawn?
Does ADT presence reduce crime? What would your hypotheses be? How sh/would you test them? How could you test them (i.e. how shouldn't you)?
70
Appendix C: Populations, Parameters & Random Sampling
Statistical Inference = learning something about a population given the availability of a sample from the population
Population = well defined group of subjects
Learning = estimation & hypothesis testing
E.g. return to education, program evaluation
71
Statistical Inference
1. Identify population of interest
2. Specify a model for relationship of interest
3. Models involve probability distributions which depend on unknown parameters of the model
Parameters determine direction & strength of relationships, e.g. return to education, effect of neighbourhood watch programs on crime
72
Sampling
Prob(Y_1 = y) = Prob(Y_2 = y) = … = Prob(Y_n = y) = f(y,θ),
If Y_1, Y_2, …, Y_n are independent random variables with common pdf f(y,θ), then {Y_1, Y_2, …, Y_n} is a random sample from f(y,θ)
E.G. Family income from n = 100 families
{720000, 28000, 540000, 1.5m, 20m, 5000, …, 1000}
73
Sampling
P(Y_i = 1) = θ, P(Y_i = 0) = 1 - θ
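A minimal sketch of drawing a random sample from a Bernoulli(θ) population and using the sample mean as a point estimate of θ. The sample size and seed are hypothetical, and θ = 0.8 is borrowed from the airline example. In practice θ is unknown and only the sample is observed.

```python
import random

rng = random.Random(42)        # fixed seed so the run is reproducible
theta = 0.8                    # "true" parameter, here taken from the airline example
n = 10_000                     # hypothetical sample size

# Random sample {Y_1, ..., Y_n}: independent draws with P(Y_i = 1) = theta
sample = [1 if rng.random() < theta else 0 for _ in range(n)]

theta_hat = sum(sample) / n    # sample mean, used here as a point estimate of theta
print(theta_hat)
```

A different seed gives a different sample and therefore a slightly different estimate, which is why the estimate itself is a random variable.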