
Simultaneous Equations

The goals of today's lecture

To introduce simultaneity
To discuss why the standard OLS model
doesn't work in the presence of simultaneity
To introduce identification issues
To give empirical examples

Introduction - 1
Heteroscedasticity and Autocorrelation
OLS coefficients are still unbiased though no longer
most efficient
A number of cases for which OLS returns biased
estimates
Measurement Error in the Regressors
Omitted Variables
Lagged Endogenous Variables with Autocorrelation
Simultaneity
Introduction - 2
Most economic models are simultaneous i.e.
At least two relationships between the
variables in the regression.
Good to think of cause and effect.
Macro example:
Micro example: supply and demand
Lesson: simultaneity can appear anywhere
OLS will mix up the two relationships
Macro Example
1. Consumption, c, is a function of income, y.
$c = \beta_1 + \beta_2 y$
c is endogenous; $\beta_2$ is the MPC

2. y = consumption + investment.
y is endogenous

3. Investment is assumed independent of income.
i is exogenous

$c = \beta_1 + \beta_2 y$
$y = c + i$
The Structural Form
of the Statistical Model
$c_t = \beta_1 + \beta_2 y_t + e_t$    ($e_t$ is a random disturbance term)
$y_t = c_t + i_t$    (identity)
The model is simultaneous because we
cannot determine C or Y without knowing
the other
Jargon: C and Y are :
endogenous
jointly determined
jointly endogenous
But I (investment) is exogenous
We rely on economic intuition to tell us
whether a variable is endogenous or
exogenous -- not really a statistical issue
Single vs. Simultaneous Equations
[Figure: two diagrams. Single equation: $c_t$, $y_t$, $e_t$. Simultaneous equations: $c_t$, $y_t$, $i_t$, $e_t$.]
Reduced Form
For use later, useful to re-write the system
of equations in their reduced form
Solve the model
Reduced form: each equation has only one
endogenous variable on the left
method: substitute one equation into the other
Easy for this simple Macro example, more
difficult in real world cases
Note the conceptual difference between
structural and reduced forms

$c_t = \beta_1 + \beta_2 y_t + e_t$
$y_t = c_t + i_t$
$c_t = \beta_1 + \beta_2 (c_t + i_t) + e_t$
$(1 - \beta_2) c_t = \beta_1 + \beta_2 i_t + e_t$
$c_t = \dfrac{\beta_1}{1-\beta_2} + \dfrac{\beta_2}{1-\beta_2} i_t + \dfrac{e_t}{1-\beta_2}$
$c_t = \pi_{11} + \pi_{21} i_t + v_t$
We can do the same for the equation in Y
We get the reduced form of the system
$c_t = \pi_{11} + \pi_{21} i_t + v_t$
$y_t = \pi_{12} + \pi_{22} i_t + v_t$
Note the conceptual difference
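As a quick numerical check of the algebra, the sketch below maps the structural parameters to the reduced-form coefficients and confirms that values generated from the reduced form satisfy both structural equations. The function name `reduced_form` and all numerical values are illustrative assumptions, not part of the lecture.

```python
def reduced_form(beta1, beta2):
    """Map structural parameters (beta1, beta2) to reduced-form coefficients.

    Assumes beta2 != 1, otherwise the system cannot be solved.
    """
    if beta2 == 1:
        raise ValueError("beta2 must differ from 1")
    scale = 1.0 / (1.0 - beta2)
    return {
        "pi11": beta1 * scale,   # intercept of the c equation
        "pi21": beta2 * scale,   # coefficient on i in the c equation
        "pi12": beta1 * scale,   # intercept of the y equation
        "pi22": scale,           # coefficient on i in the y equation
    }

# Hypothetical values chosen only for illustration.
beta1, beta2 = 10.0, 0.75
i_t, e_t = 20.0, 1.0

pi = reduced_form(beta1, beta2)
v_t = e_t / (1.0 - beta2)                      # reduced-form disturbance
c_t = pi["pi11"] + pi["pi21"] * i_t + v_t
y_t = pi["pi12"] + pi["pi22"] * i_t + v_t

# Both structural equations should hold for the generated values.
structural_gap = c_t - (beta1 + beta2 * y_t + e_t)
identity_gap = y_t - (c_t + i_t)
print(pi, structural_gap, identity_gap)
```

Both gaps should be zero up to rounding, which is what "solving the model" means here: the reduced form is just the structural system rearranged.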
Failure of OLS

OLS picks best fit --- a mixture of both
relationships
Will not yield correct estimate of MPC
OLS is biased and inconsistent because the
right hand side variable (y) is correlated
with the disturbance term.
1. Any change in e leads to a change in C via the
consumption equation
2. Change in consumption leads to a change in
income via the identity
3. This change in income will feed back into a
change in consumption via the consumption
equation
Thus any time there is a change in e there is
a simultaneous change in Y
$c_t = \beta_1 + \beta_2 y_t + e_t$
$y_t = c_t + i_t$
[Diagram: the labels 1., 2. and 3. mark the three steps above on these two equations.]
Fundamental Problem of OLS
OLS will give credit to Y for changes in e
i.e. the estimated effect of Y on C will
include also the effect of e on C
OLS will act as if a change in consumption
brought about by some random effect (e),
was due to a change in income
OLS will overstate the effect of income on
consumption i.e. the MPC
OLS will be biased and inconsistent
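A small simulation can make the overstatement visible. The sketch below generates data from the macro model and regresses consumption on income by OLS. The parameter values, the uniform distribution for investment, and the helper names `simulate_keynesian` and `ols_slope` are all assumptions made for illustration.

```python
import random

def simulate_keynesian(n, beta1, beta2, sigma_e, seed):
    """Draw (c, y, i) from c = beta1 + beta2*y + e and y = c + i."""
    rng = random.Random(seed)
    c, y, inv = [], [], []
    for _ in range(n):
        i_t = rng.uniform(10.0, 20.0)      # exogenous investment (assumed range)
        e_t = rng.gauss(0.0, sigma_e)      # random consumption shock
        # Solve the two equations jointly for consumption, then income.
        c_t = (beta1 + beta2 * i_t + e_t) / (1.0 - beta2)
        c.append(c_t)
        y.append(c_t + i_t)
        inv.append(i_t)
    return c, y, inv

def ols_slope(x, z):
    """Slope from regressing z on x with an intercept."""
    mx = sum(x) / len(x)
    mz = sum(z) / len(z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxz / sxx

true_mpc = 0.6
c, y, inv = simulate_keynesian(n=20000, beta1=5.0, beta2=true_mpc, sigma_e=2.0, seed=0)
mpc_ols = ols_slope(y, c)
print(f"true MPC = {true_mpc:.2f}, OLS estimate = {mpc_ols:.3f}")
```

With a large sample the OLS estimate should settle noticeably above the true MPC rather than converge to it, which is the inconsistency the slide describes.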
The Failure of Least
Squares
The least squares estimator of the
parameters in a structural simultaneous
equation is biased and inconsistent
because of the correlation between the
random error and the endogenous variables
on the right-hand side of the equation.
Formal Proof
We can see this explicitly if we rewrite the
system of equations in their reduced form
and use the formula for the OLS estimator
Reminder: formula for OLS
$\hat{\beta}_{OLS} = \dfrac{\sum (y_t - \bar{y})(c_t - \bar{c})}{\sum (y_t - \bar{y})^2} = \beta_2 + \sum w_t e_t$, where $w_t = \dfrac{y_t - \bar{y}}{\sum (y_t - \bar{y})^2}$
First step: show that the covariance of Y
and e is not zero, i.e. calculate how they
move together.
$Cov(y_t, e_t) = E\{(\pi_{12} + \pi_{22} i_t + v_t)\, e_t\}$
$= E(\pi_{12} e_t + \pi_{22} i_t e_t + v_t e_t)$
$= E(\pi_{12} e_t) + E(\pi_{22} i_t e_t) + E(v_t e_t)$
$= \pi_{12} E(e_t) + \pi_{22} E(i_t e_t) + E(v_t e_t)$
$= E\left\{ \left( \dfrac{e_t}{1-\beta_2} \right) e_t \right\}$
$Cov(y_t, e_t) = \dfrac{\sigma^2}{1-\beta_2}$
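OLS formula is
$\hat{\beta}_{OLS} = \beta_2 + \sum w_t e_t$
Taking expectations we get
$E(\hat{\beta}_{OLS}) = E(\beta_2 + \sum w_t e_t)$
$= \beta_2 + E(\sum w_t e_t)$
$= \beta_2 + \sum E(w_t e_t)$
$E(w_t e_t) = E\left( \dfrac{y_t e_t - \bar{y} e_t}{\sum (y_t - \bar{y})^2} \right) \neq 0$, so $E(\hat{\beta}_{OLS}) \neq \beta_2$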
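The size and direction of the inconsistency can be worked out from the covariance result $Cov(y_t, e_t) = \sigma^2/(1-\beta_2)$. The derivation below is a sketch under one extra assumption that the slides do not state explicitly: $i_t$ is uncorrelated with $e_t$ and has variance $\sigma_i^2$ (a symbol introduced here only for illustration).

```latex
\begin{aligned}
y_t &= \frac{\beta_1 + i_t + e_t}{1-\beta_2}
\quad\Longrightarrow\quad
\operatorname{Var}(y_t) = \frac{\sigma_i^2 + \sigma^2}{(1-\beta_2)^2} \\[4pt]
\operatorname{plim}\hat{\beta}_{OLS}
  &= \beta_2 + \frac{\operatorname{Cov}(y_t, e_t)}{\operatorname{Var}(y_t)}
   = \beta_2 + \frac{\sigma^2/(1-\beta_2)}{(\sigma_i^2+\sigma^2)/(1-\beta_2)^2} \\[4pt]
  &= \beta_2 + (1-\beta_2)\,\frac{\sigma^2}{\sigma_i^2+\sigma^2}
\end{aligned}
```

For $0 < \beta_2 < 1$ the second term is positive, so under these assumptions OLS overstates the MPC even in very large samples, and the overstatement shrinks as the variation in investment grows relative to the variance of the disturbance.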
Examples (from Wooldridge)
Murder Rates and the Size of the Police
Force
$murdpc = \alpha_1\, polpc + \beta_{10} + \beta_{11}\, incpc + u_1$
$polpc = \alpha_2\, murdpc + \beta_{20} + \text{other factors}$

Y = F(E)
E = f(Y)

Y = a + bE + U1
E = c + dY + eSLA + U2
Examples (from Wooldridge)
Romer (1993) Inflation and Openness
$inf = \beta_{10} + \alpha_1\, open + \beta_{11} \log(pcinc) + u_1$
$open = \beta_{20} + \alpha_2\, inf + \beta_{21} \log(pcinc) + \beta_{22} \log(land) + u_2$
Identification
Biggest issue in simultaneous equations,
biggest issue in econometrics
OLS cannot distinguish between effect of Y
and effect of e
Problem is to separate these two effects or
literally identify the effect of Y on C
Micro Example
Book uses a micro economic example i.e.
Supply and Demand model
Structural model:
Demand: $q = \alpha_1 P + \alpha_2 y + \varepsilon^d$
Supply: $q = \beta_1 P + \varepsilon^s$
Price and quantity are endogenous (jointly
determined) and income is exogenous
The model is simultaneous because:
q is a function of p (demand curve)
p is a function of q (supply curve)
OLS estimation of the demand equation will
be biased and inconsistent
The OLS estimate of $\alpha_1$ will pick up the
effect of the supply curve also
$Cov(p, \varepsilon^d)$ is not equal to zero
Problem of identification is to separate the
effect of the supply curve from that of the
demand curve
Have to do this to have any hope of estimating
the structural parameters
Illustrating the Identification Problem
Suppose we observe the following data
[Figure: scatter of observed points, price P against quantity q]
Is this a supply curve or a demand curve?
It looks like a supply curve
It could be a supply curve, i.e. the data are
generated by movements of the demand
curve along a supply curve, so the points trace out
the supply curve
[Figure: p against q, with curves labelled S and D]
Or it could be movement in both
[Figure: p against q, with curves labelled S, D and S]
It turns out that we can estimate $\beta_1$
consistently, but cannot estimate the
demand curve
The reason for this is that income, y, is in the
demand curve but excluded from the supply
curve
As income changes we know the demand
curve will shift but the supply curve will be
fixed
Therefore if we can concentrate on those
changes in p and q that are caused by
changes in income, we can trace out the
supply curve
Exclusion Restrictions
We can identify (trace out) the supply curve
only because y is in the demand curve
equation but not in the supply curve
It is because y is excluded from the supply
curve that we can be sure that changes in y
move the demand curve only
If y was in the supply curve we could not do
this


We cannot identify (trace out) the demand
curve, because there is no variable in the
supply curve that is not in the demand curve
Conditions of this kind are called exclusion restrictions
General Condition for Identification of an
equation

An equation containing M endogenous
variables must exclude at least M - 1 of the
exogenous variables in the system
in order for the parameters of
that equation to be identified and to be
consistently estimated.
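The counting rule can be written as a small check. The sketch below is only an illustration of the rule as stated: the function name `order_condition` and its return labels are assumptions, and passing the check is a necessary condition only, not a guarantee of identification.

```python
def order_condition(endogenous_in_equation, exogenous_in_system, exogenous_in_equation):
    """Apply the counting rule: an equation with M endogenous variables must
    exclude at least M - 1 of the system's exogenous variables.

    This is a necessary condition only; it does not check the rank condition.
    """
    required = len(set(endogenous_in_equation)) - 1
    excluded = set(exogenous_in_system) - set(exogenous_in_equation)
    if len(excluded) < required:
        return "order condition fails"
    if len(excluded) == required:
        return "just meets order condition"
    return "exceeds order condition"

# Supply-and-demand example: q and p are endogenous, income y is exogenous.
supply = order_condition({"q", "p"}, {"y"}, set())     # y excluded from supply
demand = order_condition({"q", "p"}, {"y"}, {"y"})     # y appears in demand
print("supply:", supply)
print("demand:", demand)
```

Applied to the micro example, the supply equation excludes one exogenous variable and meets the rule, while the demand equation excludes none and fails it.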
Importance of Identification
Must check identification before trying to
estimate
If equation is unidentified, will not be able
to get consistent estimates of the structural
parameters
Always try to design models so that the
equations are identified
Note necessary vs. sufficient condition
Beware of Artificial Restrictions
Must justify exclusion restrictions using economic
intuition.
For example: is it reasonable that income affects
demand but not supply?
Most cases are not so obvious.
If a restriction is wrong -- no hope of getting
correct answers.
Most arguments in applied economic papers are
over the validity of these restrictions.
Indirect Least Squares
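One way to estimate is to do OLS on the
reduced form
$c_t = \pi_{11} + \pi_{21} i_t + v_t$
$y_t = \pi_{12} + \pi_{22} i_t + v_t$
This works because there is no endogenous variable
on the right-hand side, i.e. the reduced-form estimates are unbiased and
consistent
We can then use the formulae that link the
parameters of the reduced and structural
forms to calculate the estimates of $\beta$
$\pi_{11} = \pi_{12} = \dfrac{\beta_1}{1-\beta_2}$
$\pi_{22} = \dfrac{1}{1-\beta_2} = 1 + \pi_{21}$
$\hat{\beta}_{1,ILS} = \dfrac{\hat{\pi}_{11}}{\hat{\pi}_{22}}$, $\hat{\beta}_{2,ILS} = 1 - \dfrac{1}{\hat{\pi}_{22}}$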
In practice, this method is not used because
usually the link between the reduced form
and structural form is very complicated in
more realistic models
Several different structural forms may have
the same reduced form.
Difficult to get standard errors on $\beta$
Indirect Least Squares linked to the notion
of Exact Identification
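For the simple macro model the indirect route can be carried out by hand. The sketch below simulates the model, runs OLS on the two reduced-form equations, and then inverts the links $\pi_{11} = \beta_1/(1-\beta_2)$ and $\pi_{22} = 1/(1-\beta_2)$ to recover the structural parameters. The parameter values, the distributions and the helper name `ols_line` are assumptions made for illustration.

```python
import random

def ols_line(x, z):
    """Intercept and slope from regressing z on x."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxz / sxx
    return mz - slope * mx, slope

# Simulate the macro model with hypothetical parameter values.
rng = random.Random(1)
beta1, beta2 = 5.0, 0.6
inv, cons, inc = [], [], []
for _ in range(20000):
    i_t = rng.uniform(10.0, 20.0)          # exogenous investment
    e_t = rng.gauss(0.0, 2.0)              # consumption shock
    c_t = (beta1 + beta2 * i_t + e_t) / (1.0 - beta2)
    inv.append(i_t)
    cons.append(c_t)
    inc.append(c_t + i_t)

# Step 1: OLS on each reduced-form equation (only i_t on the right).
pi11_hat, pi21_hat = ols_line(inv, cons)
pi12_hat, pi22_hat = ols_line(inv, inc)

# Step 2: invert the link between reduced-form and structural parameters.
beta2_ils = 1.0 - 1.0 / pi22_hat
beta1_ils = pi11_hat / pi22_hat
print(f"ILS: beta1 = {beta1_ils:.3f}, beta2 = {beta2_ils:.3f}")
```

The inversion step is easy here only because the model is tiny and exactly identified; with more equations the mapping back to the structural parameters becomes much harder to write down.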
Estimation- 2SLS
Two stage least squares
1. Estimate the reduced form using
OLS.
$p_t = \pi_1 y_t + v_{1t}$
$q_t = \pi_2 y_t + v_{2t}$
2. Do OLS on the structural form with
the right-hand-side endogenous variables replaced by the fitted
values from the first stage
Why this works for the supply equation
The fitted values from the first stage are by
definition the part of the variation in p and q
that is due to changes in income
Therefore we are sure that the fitted values lie
along the supply curve --- so we just do OLS on
these values
More formally: the fitted value of p is
uncorrelated with $\varepsilon$ because it is a function
solely of y, which is uncorrelated with $\varepsilon$ (i.e.
exogenous)
$q_t = \beta_1 \hat{P}_t + \varepsilon^s_t$
$\hat{P}_t = \hat{\pi}_1 y_t$
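The two stages for the supply equation can be sketched on simulated data. The example assumes hypothetical parameter values and treats all variables as deviations from their means, so that the regressions have no intercept, as in the lecture's equations. The helper name `slope_through_origin` is an assumption as well.

```python
import random

def slope_through_origin(x, z):
    """OLS slope of z on x without an intercept."""
    return sum(a * b for a, b in zip(x, z)) / sum(a * a for a in x)

# Hypothetical structural parameters, for illustration only.
alpha1, alpha2, beta1 = -1.0, 0.5, 1.0
rng = random.Random(2)
y, p, q = [], [], []
for _ in range(20000):
    y_t = rng.gauss(0.0, 3.0)               # exogenous income (deviation from mean)
    eps_d = rng.gauss(0.0, 1.0)             # demand shock
    eps_s = rng.gauss(0.0, 1.0)             # supply shock
    # Market clearing: beta1*p + eps_s = alpha1*p + alpha2*y + eps_d.
    p_t = (alpha2 * y_t + eps_d - eps_s) / (beta1 - alpha1)
    y.append(y_t)
    p.append(p_t)
    q.append(beta1 * p_t + eps_s)

# Stage 1: reduced form for price, then fitted values.
pi1_hat = slope_through_origin(y, p)
p_hat = [pi1_hat * y_t for y_t in y]

# Stage 2: regress quantity on fitted price.
beta1_2sls = slope_through_origin(p_hat, q)
beta1_ols = slope_through_origin(p, q)      # direct OLS, for comparison
print(f"2SLS = {beta1_2sls:.3f}, OLS = {beta1_ols:.3f}, true = {beta1}")
```

In this setup the 2SLS estimate should sit close to the true supply slope, while direct OLS of quantity on price is pulled towards the demand curve.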
Why does it not work on the demand
equation?
Computer will generate an error at second stage
estimation of demand equation because
effectively the income variable will appear
twice (perfect collinearity)
$q_t = \alpha_1 \hat{P}_t + \alpha_2 y_t + \varepsilon^d_t$
$\hat{P}_t = \hat{\pi}_1 y_t$
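The collinearity can be seen directly: because the fitted price is an exact multiple of income, the cross-product matrix of the second-stage regressors is singular. The sketch below uses a hypothetical reduced form for price and made-up numbers purely to illustrate this.

```python
import random

rng = random.Random(3)
y = [rng.gauss(0.0, 3.0) for _ in range(1000)]           # exogenous income
p = [0.25 * y_t + rng.gauss(0.0, 0.7) for y_t in y]      # hypothetical reduced form for price

# Stage 1: the fitted price is an exact multiple of income.
pi1_hat = sum(a * b for a, b in zip(y, p)) / sum(a * a for a in y)
p_hat = [pi1_hat * y_t for y_t in y]

# Stage 2 regressors for the demand equation would be (p_hat, y).
# Their cross-product matrix X'X is singular, so the normal equations
# have no unique solution.
s_pp = sum(a * a for a in p_hat)
s_py = sum(a * b for a, b in zip(p_hat, y))
s_yy = sum(b * b for b in y)
determinant = s_pp * s_yy - s_py ** 2
relative_determinant = determinant / (s_pp * s_yy)
print(f"relative determinant of X'X = {relative_determinant:.2e}")
```

The determinant is zero up to rounding error, which is why regression software either stops with an error or drops one of the two regressors.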
General 2SLS Procedure
The 2SLS procedure can be used for a
system of any degree of complication
M equations
M endogenous variables ($y_1, \dots, y_M$)
K exogenous variables ($x_1, \dots, x_K$)
Remember: can only estimate those
equations that pass the identification
condition
Suppose one of the equations you want to
estimate is:
$y_1 = \alpha_2 y_2 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon_1$
First check that it is identified, i.e. are
enough x variables excluded from the
equation?
Estimate the reduced form for the entire
model
$y_1 = \pi_{11} x_1 + \dots + \pi_{1K} x_K + v_1$
$y_2 = \pi_{21} x_1 + \dots + \pi_{2K} x_K + v_2$
$\vdots$
$y_M = \pi_{M1} x_1 + \dots + \pi_{MK} x_K + v_M$
Replace the endogenous variables in the
structural equations with their fitted values
and do OLS
$y_1 = \alpha_2 \hat{y}_2 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon_1$
Note: Possible problem with standard errors
in some computer programs
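The general procedure can be sketched for an equation of the form $y_1 = \alpha_2 y_2 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon_1$. The code below is a minimal illustration using only the Python standard library. The second structural equation, the excluded variable `x3`, all parameter values and all function names are hypothetical and exist only to exercise the function.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def ols(columns, target):
    """OLS coefficients of target on a list of regressor columns (no intercept)."""
    xtx = [[sum(u * w for u, w in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * t for u, t in zip(ci, target)) for ci in columns]
    return solve(xtx, xty)

def fitted(columns, coefs):
    """Fitted values implied by regressor columns and coefficients."""
    return [sum(b * col[t] for b, col in zip(coefs, columns)) for t in range(len(columns[0]))]

def two_stage_least_squares(y, endog_rhs, exog_included, exog_all):
    """2SLS for one structural equation.

    endog_rhs: right-hand-side endogenous columns
    exog_included: exogenous columns appearing in this equation
    exog_all: every exogenous column in the system (reduced-form regressors)
    """
    # Stage 1: one reduced-form regression per right-hand-side endogenous variable.
    endog_hat = [fitted(exog_all, ols(exog_all, col)) for col in endog_rhs]
    # Stage 2: OLS with the endogenous regressors replaced by fitted values.
    return ols(endog_hat + exog_included, y)

# Hypothetical two-equation system (zero-mean variables, no intercepts):
#   y1 = 0.5*y2 + 1.0*x1 - 1.0*x2 + e1
#   y2 = 0.5*y1 + 1.0*x3 + e2        (x3 is excluded from the first equation)
rng = random.Random(4)
x1, x2, x3, y1, y2 = [], [], [], [], []
for _ in range(5000):
    a, b, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
    y1_t = (0.5 * (c + e2) + a - b + e1) / (1.0 - 0.25)   # both equations solved jointly
    y2_t = 0.5 * y1_t + c + e2
    x1.append(a)
    x2.append(b)
    x3.append(c)
    y1.append(y1_t)
    y2.append(y2_t)

alpha2_hat, beta1_hat, beta2_hat = two_stage_least_squares(y1, [y2], [x1, x2], [x1, x2, x3])
alpha2_ols = ols([y2, x1, x2], y1)[0]
print(f"2SLS alpha2 = {alpha2_hat:.3f}, OLS alpha2 = {alpha2_ols:.3f}, true = 0.5")
```

The first stage always uses every exogenous variable in the system, including those that appear in the equation being estimated; the excluded variable (here `x3`) is what makes the second stage possible. In practice one would rely on an econometrics package, which also reports correct standard errors.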
Properties of 2SLS
Estimates are consistent
Estimates are biased in finite samples
Estimates are asymptotically normal
Standard errors do not use the same formula as
OLS; the correct formula is usually built into software
Also known as Instrumental Variables (IV)
Beware of false restrictions
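The point about standard errors can be illustrated with a one-regressor case. When the second stage is run as a plain OLS regression, the reported residual variance is based on the fitted values of the endogenous regressor, whereas the 2SLS formula uses residuals computed with the actual values. The sketch below assumes a hypothetical supply-and-demand setup with zero-mean variables; none of the numbers come from the lecture.

```python
import math
import random

def slope_through_origin(x, z):
    """OLS slope of z on x without an intercept."""
    return sum(a * b for a, b in zip(x, z)) / sum(a * a for a in x)

# Hypothetical supply-and-demand setup, variables in deviations from means.
alpha1, alpha2, beta1 = -1.0, 0.5, 1.0
rng = random.Random(5)
y, p, q = [], [], []
for _ in range(20000):
    y_t = rng.gauss(0.0, 3.0)
    eps_d, eps_s = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    p_t = (alpha2 * y_t + eps_d - eps_s) / (beta1 - alpha1)
    y.append(y_t)
    p.append(p_t)
    q.append(beta1 * p_t + eps_s)

pi1_hat = slope_through_origin(y, p)
p_hat = [pi1_hat * y_t for y_t in y]
b = slope_through_origin(p_hat, q)          # 2SLS estimate of the supply slope
n = len(q)

# Residual variance from the structural equation with ACTUAL prices.
sigma2_2sls = sum((q_t - b * p_t) ** 2 for q_t, p_t in zip(q, p)) / (n - 1)
# Residual variance a plain second-stage OLS would report (FITTED prices).
sigma2_naive = sum((q_t - b * ph) ** 2 for q_t, ph in zip(q, p_hat)) / (n - 1)

s_hat = sum(ph * ph for ph in p_hat)
se_2sls = math.sqrt(sigma2_2sls / s_hat)
se_naive = math.sqrt(sigma2_naive / s_hat)
print(f"2SLS se = {se_2sls:.4f}, naive second-stage se = {se_naive:.4f}")
```

In this particular setup the naive calculation understates the standard error; in other models it can err in either direction, which is why a dedicated 2SLS routine is preferable to running the two stages by hand.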
Example: Market for Truffles
Structural model:
Demand: $q_t = \alpha_1 + \alpha_2 P_t + \alpha_3 ps_t + \alpha_4 di_t + \varepsilon^d_t$
Supply: $q_t = \beta_1 + \beta_2 P_t + \beta_3 pf_t + \varepsilon^s_t$
ps = price of substitute
pf = rent of pig (i.e. cost of production)
di = per capita disposable income
Identification
P and Q are endogenous
pf, ps and di are exogenous ? Plausible?
Is supply identified? Why?
Is demand identified? Why?
Are the restrictions plausible? ---- very
important
Can we use 2SLS?
N.B: two subjective judgements
reasonable to say variable is exogenous
reasonable to exclude it


Stage 1: Estimate Reduced Form
Endogenous on left, all exogenous on right
$q_t = \pi_{11} + \pi_{21} ps_t + \pi_{31} di_t + \pi_{41} pf_t + v^q_t$
$p_t = \pi_{12} + \pi_{22} ps_t + \pi_{32} di_t + \pi_{42} pf_t + v^p_t$
See the results: note exogenous variables
are significant, $R^2$ is high
this is close to being the sufficient condition,
a.k.a. the rank condition
what happens if insignificant?
Stage 2: Estimate Structural Form
Calculate the fitted values for p and q
Do OLS on
$q_t = \alpha_1 + \alpha_2 \hat{P}_t + \alpha_3 ps_t + \alpha_4 di_t + \varepsilon^d_t$
$q_t = \beta_1 + \beta_2 \hat{P}_t + \beta_3 pf_t + \varepsilon^s_t$
Note the signs and significance of the coefficients
Macro Example: Klein Model


A simple macro model
$cons_t = \alpha_0 + \alpha_1 wage_t + \alpha_2 profits_t + \alpha_3 profits_{t-1} + \varepsilon^c_t$
$Inv_t = \beta_0 + \beta_1 profits_t + \beta_2 profits_{t-1} + \beta_3 Capital_{t-1} + \varepsilon^I_t$
$Wage_t = \gamma_0 + \gamma_1 GDP_t + \gamma_2 GDP_{t-1} + \gamma_3 t + \varepsilon^w_t$
What is endogenous?
What is exogenous?
What are the valid instruments?
Exogenous variables: gov, tax, time
predetermined variables: lagged wages, profits,
GDP
reasonable?
What equations are identified?
Are exclusions reasonable?
NB: two subjective judgements
reasonable to say variable is exogenous
reasonable to exclude it
