
M320
Cross Section and Panel Data Econometrics
Topic 2: Generalized Method of Moments
Part II: The Linear Model
Dr. Melvyn Weeks
Faculty of Economics and Clare College
University of Cambridge

Outline

IV and OLS Estimators: Revision
  Small Sample Issues

IV and 2SLS Estimators
  Just Identified
  Over Identified: The Generalised IV Estimator

GMM Estimators for the Linear Model
  Moment-Based Estimation
  Types of GMM Estimators

Summary
  GMM Estimators in the Linear Model

Readings

Cameron, A. C., and P. K. Trivedi (2005). Microeconometric Methods and Applications. Cambridge University Press. Chapter 6.

Cameron, A. C., and P. K. Trivedi (2009). Microeconometrics Using Stata. Stata Press. URL http://www.stata.com/bookstore/mus.html

Hayashi, F. (2000). Econometrics. Princeton University Press, Princeton.

Wooldridge, J. M. (2001). Applications of Generalized Method of Moments Estimation. Journal of Economic Perspectives, 15(4), 87-100.

Part II: GMM and Linear Models

1. The Linear Model
   IV: Finite and Large Sample Properties (Review)
   IV as MOM
   2SLS [Generalised IV (GIVE)] as GMM

Road Map

In Generalized Method of Moments, Part II: The Linear Model, we consider gmm as a canonical estimator.
In doing this we show how ols, iv and 2sls are special cases of the more general gmm estimator.
We show how just-identified and over-identified models may be represented as moment estimators. In the just-identified case we also explicitly show that the gmm criterion function may be set to zero, whereas in the over-identified case we minimise its distance from zero.
We also provide some background material on the small-sample properties of the iv and 2sls estimators.

IV and OLS Estimators: Revision

We first provide a brief overview of the ols and iv estimators.
We consider the unbiasedness of the ols estimator under certain conditions and the small-sample bias of the iv estimator.
We show why it is difficult to obtain the small-sample expectation of the iv estimator.
This leads us to the potential small (and large) sample bias that might be induced by weak instruments.

OLS

Proposition
If $E[X'\varepsilon] = 0$, ols is an unbiased estimator of $\beta$.

Proof.
$$\hat{\beta} = (X'X)^{-1}X'y$$
$$E[\hat{\beta}] = \beta + E[(X'X)^{-1}X'\varepsilon] = \beta + (X'X)^{-1}E[X'\varepsilon] = \beta \tag{1}$$

Proposition
If $E[X'\varepsilon] \neq 0$, ols is a biased and inconsistent estimator of $\beta$.

Proof.
$$\hat{\beta} = (X'X)^{-1}X'y$$
$$E[\hat{\beta}] = \beta + E[(X'X)^{-1}X'\varepsilon] = \beta + E[(X'X)^{-1}X'\eta(X)] \neq \beta$$
for $E[\varepsilon|X] = \eta(X)$.

Question: why can $E[(X'X)^{-1}X'\eta(X)]$ not be factored, as in (1)?
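The failure to factor can be seen numerically. Below is a minimal simulation sketch (ours, not from the notes; all names and parameter values are illustrative): the regressor and the error share a common shock $u$, so $E[\varepsilon|X] \neq 0$ and the ols slope is centred away from the true value even when averaged over many replications.

```python
# Minimal sketch: OLS bias under endogeneity (illustrative values).
import numpy as np

rng = np.random.default_rng(0)
beta = 1.0            # hypothetical true slope
n, reps = 200, 2000

estimates = []
for _ in range(reps):
    u = rng.normal(size=n)               # common shock driving endogeneity
    x = u + rng.normal(size=n)           # regressor correlated with the error
    eps = u + rng.normal(size=n)         # error term: E[eps | x] != 0
    y = beta * x + eps
    estimates.append((x @ y) / (x @ x))  # OLS slope (no intercept)

print(np.mean(estimates))                # ~1.5 here, not 1.0: OLS is biased
```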

The IV Estimator

$$\hat{\beta}_{IV} = (Z'X)^{-1}Z'(X\beta + \varepsilon) = (Z'X)^{-1}Z'X\beta + (Z'X)^{-1}Z'\varepsilon$$

Proposition
The IV estimator is biased in small samples.

Can we utilise the same sort of proof as used for the unbiasedness of the ols estimator?
$$\begin{aligned}
E[\hat{\beta}_{IV}] &= E[(Z'X)^{-1}Z'X\beta + (Z'X)^{-1}Z'\varepsilon] \\
&= \beta + E_{X,Z,\varepsilon}[(Z'X)^{-1}Z'\varepsilon] && (2) \\
&= \beta + E_{X,Z}\left[(Z'X)^{-1}Z'\,E[\varepsilon|Z,X]\right] && (3) \\
&\neq \beta + (Z'X)^{-1}E[Z'\varepsilon]
\end{aligned}$$

Note that the unconditional expectation wrt $E_{X,Z,\varepsilon}[\cdot]$ in (2) is obtained by first taking expectations wrt $\varepsilon$ given $Z,X$ in (3).
What if we imposed $E[\varepsilon|Z,X] = 0$?
But this is no use since it implies $E[\varepsilon|X] = 0$, thereby negating a requirement for an instrument in the first place.
What if we exploit
$$\hat{\beta}_{IV} = \beta + \left(N^{-1}Z'X\right)^{-1}N^{-1}Z'\varepsilon\;?$$

In what way might large sample arguments help us here?

Let's consider the following simple case, where we make the assumption that $E[\varepsilon_i|X_i] \neq 0$.
A single instrument $Z_i$ is available.
$$Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$$
$$\hat{\beta}_2^{IV} = \frac{\sum_{i=1}^N (Z_i - \bar{Z})(Y_i - \bar{Y})}{\sum_{i=1}^N (Z_i - \bar{Z})(X_i - \bar{X})}$$

Proposition
$\hat{\beta}_2^{IV}$ is consistent provided that $\sigma_{ZX}$ is nonzero.

$$\begin{aligned}
\hat{\beta}_2^{IV} &= \frac{\sum_{i=1}^N (Z_i - \bar{Z})(Y_i - \bar{Y})}{\sum_{i=1}^N (Z_i - \bar{Z})(X_i - \bar{X})} \\
&= \frac{\sum_{i=1}^N (Z_i - \bar{Z})\left([\beta_1 + \beta_2 X_i + \varepsilon_i] - [\beta_1 + \beta_2\bar{X} + \bar{\varepsilon}]\right)}{\sum_{i=1}^N (Z_i - \bar{Z})(X_i - \bar{X})} \\
&= \frac{\sum_{i=1}^N \left(\beta_2(Z_i - \bar{Z})(X_i - \bar{X}) + (Z_i - \bar{Z})(\varepsilon_i - \bar{\varepsilon})\right)}{\sum_{i=1}^N (Z_i - \bar{Z})(X_i - \bar{X})} \\
&= \beta_2 + \frac{\sum_{i=1}^N (Z_i - \bar{Z})(\varepsilon_i - \bar{\varepsilon})}{\sum_{i=1}^N (Z_i - \bar{Z})(X_i - \bar{X})}
\end{aligned}$$

Nothing much can be said about the distribution of $\hat{\beta}_2^{IV}$ in small samples.
However, we observe that the iv estimator is equal to the true value plus an error, which under certain conditions will vanish as $N$ becomes large.
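To make the large-$N$ argument concrete, here is a small simulation sketch (our construction; the instrument design and parameter values are illustrative): computing the ratio above at increasing sample sizes shows the error term shrinking toward zero.

```python
# Minimal sketch: IV consistency as N grows (illustrative design).
import numpy as np

rng = np.random.default_rng(1)
beta1, beta2 = 0.5, 1.0           # hypothetical true intercept and slope

def iv_slope(n):
    u = rng.normal(size=n)        # common shock: source of endogeneity
    z = rng.normal(size=n)        # instrument: related to x, unrelated to eps
    x = z + u + rng.normal(size=n)
    eps = u + rng.normal(size=n)
    y = beta1 + beta2 * x + eps
    zc, yc, xc = z - z.mean(), y - y.mean(), x - x.mean()
    return (zc @ yc) / (zc @ xc)  # sample analogue of the ratio above

for n in (100, 10_000, 1_000_000):
    print(n, iv_slope(n))         # drifts toward beta2 = 1.0 as n grows
```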

Divide the numerator and denominator by $N$ so that they both have limits; then we can take plims:
$$\text{plim}\,\hat{\beta}_2^{IV} = \beta_2 + \frac{\text{plim}\,\frac{1}{N}\sum_{i=1}^N (Z_i - \bar{Z})(\varepsilon_i - \bar{\varepsilon})}{\text{plim}\,\frac{1}{N}\sum_{i=1}^N (Z_i - \bar{Z})(X_i - \bar{X})} = \beta_2 + \frac{\sigma_{Z\varepsilon}}{\sigma_{ZX}}$$

Slutsky's theorem allows us (as opposed to the situation when taking expectations) to split the problem. We can then evaluate the limit of the numerator and denominator separately.

Small Sample Issues

The iv estimator requires $\sigma_{Z\varepsilon} = 0$. In small samples $\varepsilon$ and $Z$ may be weakly correlated.
In this instance the iv estimator can have a large (asymptotic) bias even if the correlation is moderate.
To see this we write the probability limit of the iv estimator as
$$\begin{aligned}
\text{plim}\,\hat{\beta}_{IV,2} &= \beta_2 + \frac{\text{Cov}(Z,\varepsilon)}{\text{Cov}(Z,X)} && (4) \\
&= \beta_2 + \frac{\text{Corr}(Z,\varepsilon)}{\text{Corr}(Z,X)}\cdot\frac{\sigma_\varepsilon}{\sigma_X} && (5) \\
&= \beta_2 + \frac{0}{\text{Cov}(Z,X)} \quad \text{if } \text{Cov}(Z,\varepsilon) = 0 && (6)
\end{aligned}$$

Comments

1. In small samples it is not possible to say much about the distribution of $\hat{\beta}_{IV,2}$.
2. From (6) we observe that if $Z$ is distributed independent of $\varepsilon$ then $\hat{\beta}_{IV,2}$ is consistent for $\beta_2$.
3. In small (or even large) samples $\varepsilon$ and $Z$ may be weakly correlated. In this instance the iv estimator can have a large (asymptotic) bias even if the correlation is moderate. This will obviously depend upon the correlation between $Z$ and $X$; this is the problem of weak instruments (see the sketch below).
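This sketch (ours; the contamination coefficient 0.02 and the first-stage strengths $\pi$ are illustrative) shows the mechanism in (4)-(6): a fixed, small correlation between $Z$ and $\varepsilon$ does little harm when the first stage is strong, but the bias grows rapidly as $\pi \to 0$.

```python
# Minimal sketch: weak instruments amplify a small Corr(Z, eps).
import numpy as np

rng = np.random.default_rng(2)
n, beta2 = 100_000, 1.0

for pi in (1.0, 0.1, 0.01):              # first-stage strength
    u = rng.normal(size=n)               # eps = u below
    z = rng.normal(size=n) + 0.02 * u    # instrument slightly contaminated
    x = pi * z + u + rng.normal(size=n)  # first stage: x = pi*z + ...
    y = beta2 * x + u
    zc, yc, xc = z - z.mean(), y - y.mean(), x - x.mean()
    print(pi, (zc @ yc) / (zc @ xc))     # bias roughly 0.02/(pi + 0.02)
```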

Comments - cont.

4. Even in the context of large sample results, it is not unequivocally better to use iv rather than ols. To see this, compare (5) with the plim of the ols estimator $\hat{\beta}_{OLS,2}$, which we write as
$$\text{plim}\,\hat{\beta}_{OLS,2} = \beta_2 + \text{Corr}(X,\varepsilon)\,\frac{\sigma_\varepsilon}{\sigma_X} \tag{7}$$
By finding a relationship between $\text{plim}\,\hat{\beta}_{OLS,2}$ and $\text{plim}\,\hat{\beta}_{IV,2}$, under what circumstances is iv preferred to ols on asymptotic grounds?

IV and 2SLS Estimators

Below we briefly review the iv approach to identification for exactly identified and over-identified models.
In the following section we show that the gmm estimator for the linear model is canonical in that it nests the iv (mom) and 2sls estimators as special cases.
We also show that in the case of an exactly identified model, the gmm criterion function can be set exactly to zero, just as the sum of residuals for the ols estimator is exactly zero.
In this instance the sample realisation of the criterion function is not a random variable and it is not possible to test the exogeneity of instruments.

Consider a first stage regression based on a linear combination of instruments:
$$X = Z\Pi + u, \qquad \hat{\Pi} = (Z'Z)^{-1}Z'X$$
$$\hat{X} = P_Z X = Z(Z'Z)^{-1}Z'X \tag{8}$$
Using $\hat{X}$ as instruments:
$$\hat{\beta}_{IV} = (X'P_Z X)^{-1}X'P_Z y = \left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}X'Z(Z'Z)^{-1}Z'y \tag{9}$$
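As a computational aside, (8)-(9) translate directly into matrix code. A minimal sketch, assuming numpy arrays y of shape (n,), X of shape (n, k) and Z of shape (n, M) with M >= k; the function name and the simulated example are ours:

```python
# Minimal sketch of (8)-(9): 2SLS via the first-stage projection P_Z X.
import numpy as np

def two_sls(y, X, Z):
    # First stage (8): Xhat = P_Z X = Z (Z'Z)^{-1} Z'X
    Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    # Second stage (9): (X'P_Z X)^{-1} X'P_Z y, using Xhat'X = X'P_Z X
    return np.linalg.solve(Xhat.T @ X, Xhat.T @ y)

# Illustrative over-identified example: two instruments, one regressor.
rng = np.random.default_rng(3)
n = 5_000
u = rng.normal(size=n)
Zraw = rng.normal(size=(n, 2))
x = Zraw @ np.array([1.0, 0.5]) + u + rng.normal(size=n)
y = 1.0 * x + u                          # true slope 1.0, x endogenous via u
X = np.column_stack([np.ones(n), x])     # k = 2 (constant + x)
Z = np.column_stack([np.ones(n), Zraw])  # M = 3 > k
print(two_sls(y, X, Z))                  # roughly [0.0, 1.0]
```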

Just Identified

Just-identified: $M = k$
The estimator with $M = k$ ($M$ the number of columns of $Z$) is often called the Instrumental Variable Estimator.
$X'Z$ is a square matrix since $M = k$.
$\left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}$ can then be decomposed as
$$(Z'X)^{-1}(Z'Z)(X'Z)^{-1}, \tag{10}$$
such that the iv estimator may then be rewritten as
$$\hat{\beta}_{IV} = (Z'X)^{-1}(Z'Z)(X'Z)^{-1}X'Z(Z'Z)^{-1}Z'y = (Z'X)^{-1}Z'y.$$

In the just-identified case we observe that the weighting matrix $P_Z = Z(Z'Z)^{-1}Z'$ falls out.
In situations where parameters are exactly identified we have just enough moment conditions to estimate $\beta$.
Consequence: the minimum of a generalised distance measure (GMM) is exactly zero - all sample moments can be set to zero by an appropriate choice of $\hat{\beta}$.
There is therefore no need to weight the individual moments in order to minimise a weighted sum.

Over Identified: The Generalised IV Estimator

Over-identified: $M > k$
The estimator is called the 2sls or Generalised Instrumental Variable Estimator (give).
$X'Z$ is not a square matrix for $M > k$, so
$$\left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}$$
cannot be decomposed.
The iv estimator is then given by
$$\hat{\beta}_{IV} = \left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}X'Z(Z'Z)^{-1}Z'y$$

GMM Estimators for the Linear Model

Moment-Based Estimation

IV as Moment Estimators
We have motivated the iv estimator based on a transformed linear regression model of the form $Z'y = Z'X\beta + Z'\varepsilon$.
Alternative derivation: minimise a quadratic form in a vector of moments: functions of parameters and data.
The gmm estimator is obtained by minimising a quadratic form in the analogous sample moments: $\frac{1}{n}\left[Z'y - Z'X\beta\right]$.
Ignoring $\frac{1}{n}$, the gmm estimator is defined as
$$\hat{\beta} = \arg\min_{\beta}\,\left[(Z'y - Z'X\beta)'\,C_N\,(Z'y - Z'X\beta)\right] \tag{11}$$

Whether (11) is solved exactly or by minimising a weighted quadratic function depends on whether the system of equations is exactly or over-identified.
$C_N$ is an $M \times M$ positive definite symmetric weighting matrix; it tells us how much weight to attach to which (linear combinations) of the sample moments.
In general $C_N$ will depend upon the sample size $N$, because it may itself be an estimate.
Again we note the use of the "Generalised" prefix in gmm: given that we cannot set the individual sample moments equal to their population counterparts, we utilise a weighting matrix $C_N$, which weights each moment such that the sample moments are as close as possible to zero.
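Because the objective in (11) is quadratic in $\beta$, the minimiser has a closed form. A minimal sketch (the function name is ours), valid for any positive definite $C_N$:

```python
# Minimal sketch of (11): closed-form linear GMM,
# beta = (X'Z C Z'X)^{-1} X'Z C Z'y for a given weighting matrix C.
import numpy as np

def linear_gmm(y, X, Z, C):
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ C @ Z.T @ X, XZ @ C @ Z.T @ y)

# With C = inv(Z'Z) this reproduces 2SLS; when M = k, any p.d. C
# returns the same IV estimate (Z'X)^{-1} Z'y.
```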

Types of GMM Estimators

A class of GMM estimators will depend on what is assumed about the distribution of $\varepsilon_i$.

IV: $M = K$, errors homoscedastic: $\text{Var}(\varepsilon|Z) = \sigma^2 I_N$.
$$\hat{\beta}_{MOM} = (Z'X)^{-1}(X'Z)^{-1}X'Z\,Z'y = (Z'X)^{-1}Z'y$$
2SLS: $M > K$, errors homoscedastic: $\text{Var}(\varepsilon|Z) = \sigma^2 I_N$.
$$\hat{\beta}_{2SLS} = \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1}X'Z(Z'Z)^{-1}Z'y$$
GMM: $M > K$, errors not restricted.
$$\hat{\beta}_{GMM} = \left[X'Z\,C_N\,Z'X\right]^{-1}X'Z\,C_N\,Z'y$$

2SLS
If $\varepsilon \sim (0, \sigma^2 I)$ the covariance matrix of the moment conditions is
$$\text{Var}(Z'(y - X\beta)) = \text{Var}(Z'\varepsilon) = \sigma^2 Z'Z.$$
An optimal weighting matrix is then
$$C_N = \left(\frac{1}{N}\sum_{i=1}^N z_i z_i'\right)^{-1}$$
The resulting GMM estimator is
$$\hat{\beta}_{2SLS} = \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1}X'Z(Z'Z)^{-1}Z'y$$
This estimator is often referred to as the Generalised Instrumental Variables Estimator (give).

Below we consider the following two propositions:

Proposition I
If $\text{Var}(Z'\varepsilon) = \sigma^2 Z'Z$ then $\hat{\beta}_{GMM} = \hat{\beta}_{2SLS} = \left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}X'Z(Z'Z)^{-1}Z'y$.

Proposition II
If $M = K$ then $\hat{\beta}_{GMM} = \hat{\beta}_{IV}$ and $Q_C(\beta)$ can be set to 0.

Proposition
$$\hat{\beta}_{GMM} = \hat{\beta}_{2SLS} = \left(X'Z(Z'Z)^{-1}Z'X\right)^{-1}X'Z(Z'Z)^{-1}Z'y.$$
Proof.
$$\begin{aligned}
Q_C(\beta) &= (y - X\beta)'P_Z(y - X\beta) \\
&= (y - X\beta)'Z(Z'Z)^{-1}Z'(y - X\beta) && (12) \\
&= [Z'(y - X\beta)]'(Z'Z)^{-1}[Z'(y - X\beta)] && (13) \\
&= [Z'(y - X\beta)]'\,C_N\,[Z'(y - X\beta)] && (14) \\
&= y'P_Z y + \beta'X'P_Z X\beta - 2\beta'X'P_Z y && (15)
\end{aligned}$$
$$\frac{\partial Q_C(\beta)}{\partial\beta} = \frac{\partial\left[\beta'X'P_Z X\beta - 2\beta'X'P_Z y\right]}{\partial\beta}$$
$$= 2X'P_Z X\beta - 2X'P_Z y = 0 \tag{16}$$
$$= 2X'Z\,C_N\,Z'X\beta - 2X'Z\,C_N\,Z'y = 0. \tag{17}$$

(16) and (17) are derived, respectively, from (13) and (14).
Both (16) and (17) represent systems of $k$ equations in $k$ unknowns: the $M$ moment conditions are combined through $X'Z$, which is $k \times M$.
The solution to (16) is given by
$$\hat{\beta}_{GMM} = (X'P_Z X)^{-1}X'P_Z y = \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1}X'Z(Z'Z)^{-1}Z'y = \hat{\beta}_{2SLS}$$

Just Identified
If $M = k$ we may solve (11) exactly:
$$\hat{\beta}_{GMM} = \hat{\beta}_{IV} = \hat{\beta}_{MOM} = (Z'X)^{-1}Z'y$$

Proposition
$\hat{\beta}_{GMM} = \hat{\beta}_{IV}$ and $Q_C(\beta)$ can be set to 0.
Proof.
Substituting $\hat{\beta}_{IV}$ into (15),
$$\begin{aligned}
Q_C(\beta) &= y'P_Z y + \hat{\beta}_{IV}'X'P_Z X\hat{\beta}_{IV} - 2\hat{\beta}_{IV}'X'P_Z y \\
&= y'Z(Z'Z)^{-1}Z'y \\
&\quad + y'Z(X'Z)^{-1}X'Z(Z'Z)^{-1}Z'X(Z'X)^{-1}Z'y \\
&\quad - 2y'Z(X'Z)^{-1}X'Z(Z'Z)^{-1}Z'y \\
&= y'Z(Z'Z)^{-1}Z'y + y'Z(Z'Z)^{-1}Z'y - 2y'Z(Z'Z)^{-1}Z'y = 0
\end{aligned}$$
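A quick numerical illustration of the proposition (a sketch with simulated data; the design and values are ours): with $M = k$ the sample moments at $\hat{\beta}_{IV}$ are zero by construction, so $Q_C$ is zero up to floating-point rounding.

```python
# Minimal sketch: with M = k, Q_C at the IV solution is numerically zero.
import numpy as np

rng = np.random.default_rng(4)
n = 500
u = rng.normal(size=n)
z = rng.normal(size=n)
x = z + u + rng.normal(size=n)
y = 2.0 * x + u                        # illustrative true slope 2.0
Z = np.column_stack([np.ones(n), z])   # M = 2
X = np.column_stack([np.ones(n), x])   # k = 2: just identified
b = np.linalg.solve(Z.T @ X, Z.T @ y)  # IV estimator (Z'X)^{-1} Z'y
g = Z.T @ (y - X @ b)                  # sample moments: zero by construction
Q = g @ np.linalg.solve(Z.T @ Z, g)    # criterion with C_N = (Z'Z)^{-1}
print(Q)                               # ~1e-25: zero up to rounding error
```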

Two Step GMM

The most efficient (feasible) GMM estimator based upon $E(z_i'\varepsilon_i) = 0$ uses the weight matrix $\hat{V}^{-1}$.
$\hat{V}$ is constructed using fitted residuals, obtained from a first step using a consistent estimator of $\beta$, which is often the 2SLS estimator:
$$\hat{V} = \frac{1}{N}\sum_{i=1}^N z_i(y_i - x_i'\hat{\beta}_{2sls})(y_i - x_i'\hat{\beta}_{2sls})'z_i'$$
where $V$ is given by
$$V = \text{plim}_{N\to\infty}\,\frac{1}{N}\sum_{i=1}^N z_i\,\hat{\varepsilon}_i\hat{\varepsilon}_i'\,z_i'$$
This gives the two-step GMM estimator
$$\hat{\beta}_{2sGMM} = \left[X'Z\,C_N\,Z'X\right]^{-1}X'Z\,C_N\,Z'y$$
where $C_N = \hat{V}^{-1}$.
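Putting the pieces together, here is a self-contained sketch of the two-step estimator (function and variable names ours): first-step 2SLS residuals feed the robust $\hat{V}$ above, whose inverse then re-weights the moments.

```python
# Minimal sketch: two-step (feasible efficient) GMM for the linear model.
import numpy as np

def two_step_gmm(y, X, Z):
    # Step 1: 2SLS as the consistent first-step estimator
    Xhat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    b1 = np.linalg.solve(Xhat.T @ X, Xhat.T @ y)
    e = y - X @ b1                             # fitted residuals
    # Vhat = (1/N) sum_i e_i^2 z_i z_i'  (robust moment covariance)
    V = (Z * (e**2)[:, None]).T @ Z / len(y)
    # Step 2: re-minimise the quadratic form with C_N = Vhat^{-1}
    XZ, C = X.T @ Z, np.linalg.inv(V)
    return np.linalg.solve(XZ @ C @ Z.T @ X, XZ @ C @ Z.T @ y)
```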

The efficiency gains of the GMM estimator relative to the traditional IV/2SLS estimator derive from the overidentifying restrictions of the model, the use of the optimal weighting matrix, and the relaxation of the i.i.d. assumption.
For an exactly-identified model, the efficient GMM and traditional IV/2SLS estimators coincide.
Under the assumptions of conditional homoskedasticity and independence, the efficient GMM estimator is the traditional IV/2SLS estimator.

Summary

Summary: GMM Estimators in the Linear IV Model

GMM:        $\hat{\beta}_{GMM} = \left[X'Z\,C_N\,Z'X\right]^{-1}X'Z\,C_N\,Z'y$
2SLS, GIVE: $\hat{\beta}_{2SLS} = \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1}X'Z(Z'Z)^{-1}Z'y$
IV:         $\hat{\beta}_{IV} = [Z'X]^{-1}Z'y$