All of the assumptions listed in (i) to (iii) are required to show that the OLS
estimator has the desirable properties of consistency, unbiasedness and
efficiency. However, it is not necessary to assume normality (iv) to derive
the above results for the coefficient estimates. This assumption is only
required in order to construct test statistics that follow the standard
statistical distributions; in other words, it is only required for hypothesis
testing and not for coefficient estimation.
(2)
Suppose that a researcher is interested in conducting White's
heteroscedasticity test using the residuals from an estimation of (2). What
would be the most appropriate form for the auxiliary regression?
(i)
(ii)
(iii)
(iv)
The first thing to think about is what should be the dependent variable for
the auxiliary regression. Two possibilities are given in the question: u_t^2
and u_t. Recall that the formula for the variance of any random
variable u_t is
var(u_t) = E[(u_t - E(u_t))^2]
and that E(u_t) is zero, by the first assumption of the classical linear
regression model. Therefore, the variance of the random variable
simplifies to E[u_t^2]. Thus, our proxy for the variance of the disturbances at
each point in time t becomes the squared residual. Thus, answers c and d,
which contain u_t rather than u_t^2, are both incorrect. The next issue
is to determine what should be the explanatory variables in the auxiliary
regression. Since, in order to be homoscedastic, the disturbances should
have constant variance with respect to all variables, we could put any
variables we wished in the equation. However, White's test employs the
original explanatory variables, their squares, and their pairwise cross-products.
A regression containing a lagged value of u as an explanatory
variable would be appropriate for testing for autocorrelation (i.e. whether
u is related to its lagged values) but not for heteroscedasticity. Thus (ii) is
correct.
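The construction of White's auxiliary regressor set described above (levels, squares, and pairwise cross-products) can be sketched in Python. This is an illustrative helper under my own naming and data layout, not a full implementation of the test:

```python
from itertools import combinations

def white_aux_regressors(X):
    """Given lists of explanatory-variable observations (excluding the
    constant), build the regressor set for White's auxiliary regression:
    the original variables, their squares, and pairwise cross-products.
    The dependent variable would be the squared OLS residuals."""
    regressors = [list(x) for x in X]                  # levels
    regressors += [[v * v for v in x] for x in X]      # squares
    for xa, xb in combinations(X, 2):                  # pairwise cross-products
        regressors.append([a * b for a, b in zip(xa, xb)])
    return regressors

# With two explanatory variables there are 2 + 2 + 1 = 5 auxiliary regressors
x2 = [1.0, 2.0, 3.0]
x3 = [4.0, 5.0, 6.0]
print(len(white_aux_regressors([x2, x3])))  # 5
```

With two explanatory variables this yields five regressors (plus the constant) in the auxiliary regression, which is where the degrees of freedom for the chi-squared version of the test come from.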
(2)
Suppose that model (2) is estimated using 100 quarterly observations, and
that a test of the type described in question 4 is conducted. What would
be the appropriate χ² critical value with which to compare the test
statistic, assuming a 10% size of test?
2.71
118.50
11.07
9.24
It will be biased
It will be inconsistent
It will be inefficient
All of (a), (b) and (c) will be true
Under heteroscedasticity, provided that all of the other assumptions of the
classical linear regression model are adhered to, the coefficient estimates
will still be consistent and unbiased, but they will be inefficient. Thus c is
correct. The upshot is that whilst this would not result in wrong coefficient
estimates, our measure of the sampling variability of the coefficients, the
standard errors, would probably be wrong. The stronger the degree of
heteroscedasticity (i.e. the more the variance of the errors changed over
the sample), the more inefficient the OLS estimator would be.
White's test
The Durbin-Watson test is one for detecting residual autocorrelation, but it
is designed to pick up first order autocorrelation (that is, a statistically
significant relationship between a residual and the residual one period
ago). As such, the test would not detect third order autocorrelation (that
is, a statistically significant relationship between a residual and the
residual three periods ago). The Breusch-Godfrey test is also a test for
autocorrelation, but it takes a more general auxiliary regression approach,
and therefore it can be used to test for autocorrelation of an order higher
than one. White's test and the RESET test are not autocorrelation tests,
but rather are tests for heteroscedasticity and appropriate functional form
respectively.
Close to zero
Recall that the formula relating the value of the Durbin-Watson (DW)
statistic and the coefficient of first order autocorrelation, p, is:
DW ≈ 2(1 - p)
Thus, if DW is close to zero, the first order autocorrelation coefficient, p,
must be close to +1. A value of p close to -1 would suggest negative
autocorrelation, while a value close to either +1 or -1 would suggest that we
thought there was strong autocorrelation but we didn't know whether it
was positive or negative! Such a situation would not happen in practice
because DW can distinguish between the two, and positive and negative
autocorrelation would result in completely different values of the DW
statistic.
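The mapping DW ≈ 2(1 - p) can be checked numerically; the helper below (my own naming) reproduces the three benchmark cases:

```python
def dw_from_rho(rho):
    """Approximate Durbin-Watson statistic implied by a first-order
    autocorrelation coefficient rho, using DW ~= 2(1 - rho)."""
    return 2.0 * (1.0 - rho)

# rho = +1 (strong positive autocorrelation) -> DW near 0
# rho =  0 (no autocorrelation)              -> DW near 2
# rho = -1 (strong negative autocorrelation) -> DW near 4
print(dw_from_rho(1.0), dw_from_rho(0.0), dw_from_rho(-1.0))  # 0.0 2.0 4.0
```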
The value of the test statistic is given as 1.53, so all that remains to be
done is to find the critical values. Recall that the DW statistic has two
critical values: a lower and an upper one. If there are 2 explanatory
variables plus a constant in the regression, this would imply that, using my
notation, k = 3 and k' = k - 1 = 2. Thus, we would look in the k' = 2 column
for the lower and upper values, which would be in the row corresponding
to n = 50 data points. The relevant critical values would be 1.40 and 1.63.
Therefore, since the test statistic falls between the lower and upper critical
values, the result is in the inconclusive region. We therefore cannot say
from the result of this test whether first order serial correlation is present
or not.
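The DW decision rule applied above, including its inconclusive regions, can be written out as a small sketch (the function name and return strings are my own):

```python
def dw_decision(dw, d_lower, d_upper):
    """Classify a Durbin-Watson statistic against lower/upper critical
    values, covering the mirrored region for negative autocorrelation."""
    if dw < d_lower:
        return "reject H0: positive autocorrelation"
    if dw <= d_upper:
        return "inconclusive"
    if dw < 4.0 - d_upper:
        return "do not reject H0"
    if dw <= 4.0 - d_lower:
        return "inconclusive"
    return "reject H0: negative autocorrelation"

# Test statistic 1.53 with critical values 1.40 and 1.63 (n = 50, k' = 2)
print(dw_decision(1.53, 1.40, 1.63))  # inconclusive
```

Since 1.40 < 1.53 < 1.63, the statistic falls between the lower and upper critical values and the rule returns the inconclusive verdict, matching the text.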
(i)
(ii)
(iii)
(iv)
16. Including relevant lagged values of the dependent variable on the right
hand side of a regression equation could lead to which one of the
following?
Two or more explanatory variables are perfectly correlated with one another
The explanatory variables are highly correlated with the error term
The explanatory variables are highly correlated with the dependent variable
Two or more explanatory variables are highly correlated with one another
18. Which one of the following is NOT a plausible remedy for near
multicollinearity?
In fact, in the presence of near multicollinearity, the OLS estimator will still
be consistent, unbiased and efficient. This is the case since none of the
four (Gauss-Markov) assumptions of the CLRM have been violated. You
may have thought that, since the standard errors are usually wide in the
presence of multicollinearity, the OLS estimator must be inefficient. But
this is not true: the multicollinearity will simply mean that it is hard to
obtain small standard errors due to insufficient separate information
between the collinear variables, not that the standard errors are wrong.
Test statistics concerning the parameters will not follow their assumed
distributions.
Only assumptions labelled 1-4 in the lecture material are required to show
the consistency, unbiasedness and efficiency of the OLS estimator, and not
the assumption that the disturbances are normally distributed. The latter
assumption is only required for hypothesis testing and not for optimally
determining the parameter estimates. Therefore, the only problem that
may arise if the residuals from a small-sample regression are not normally
distributed is that the test statistics may not follow the required
distribution. You may recall that the normality assumption was in fact
required to show that, when the variance of the disturbances is unknown
and has to be estimated, the t-statistics follow a t-distribution.
Has fatter tails and a smaller mean than a normal distribution with the
same mean and variance
Has fatter tails and is more peaked at the mean than a normal distribution
with the same mean and variance
Has thinner tails and is more peaked at the mean than a normal
distribution with the same mean and variance
Add lags of the variables on the right hand side of the regression model
It is quite often the case that one or two observations that do not fit into
the pattern of all of the others cause residual non-normality. Such
observations, often termed 'outliers', will be a long way away from the
line and will therefore have large residuals (either positive or negative).
When these residuals are used as inputs to the skewness and kurtosis
calculation, they will be raised to the third and fourth powers respectively
in the numerators of the two formulae. The result is that the tails of the
distribution can be made much bigger than they otherwise would have
been by the presence of a small number of big outliers. Thus an
appropriate response may be to remove these outliers from the sample
altogether, either by physically deleting them from the data set or by
using a dummy variables approach to knock them out one at a time. Of
course, many econometricians would question this approach and would
argue that it is better to leave the residuals as non-normal than to remove
information from the sample.
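The sensitivity of the skewness and kurtosis numerators to outliers, via the third and fourth powers, is easy to demonstrate numerically; the sketch below uses my own helper on made-up residuals:

```python
def skewness_kurtosis(res):
    """Sample skewness and kurtosis of a list of residuals; the third and
    fourth powers in the numerators make both very sensitive to outliers."""
    n = len(res)
    mean = sum(res) / n
    var = sum((r - mean) ** 2 for r in res) / n
    skew = sum((r - mean) ** 3 for r in res) / n / var ** 1.5
    kurt = sum((r - mean) ** 4 for r in res) / n / var ** 2
    return skew, kurt

clean = [-2.0, -1.0, 0.0, 1.0, 2.0] * 10   # symmetric, well-behaved residuals
with_outlier = clean + [15.0]              # one large outlier added
print(skewness_kurtosis(clean)[1])         # modest kurtosis
print(skewness_kurtosis(with_outlier)[1])  # much larger kurtosis
```

A single extreme residual among fifty ordinary ones is enough to inflate the kurtosis substantially, which is exactly why the Bera-Jarque-type statistics react so strongly to outliers.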
(3)
The total sample of 200 observations is split exactly in half for the sub-
sample regressions. Which would be the unrestricted residual sum of
squares?
The sum of the RSS for the first and second sub-samples
The unrestricted regression is always the one where the restriction has not
been imposed. The relevant restriction that is being tested in this case is
that the parameter values are the same in both of the sub-samples. Thus
the unrestricted regression would be the one where the parameter
estimates were allowed to be freely determined in each of the sub-
samples so that in general the two sets of parameters for the sub-samples
would be different from one another. Therefore the unrestricted RSS would
be the one where the coefficients were allowed to vary across the sub-
samples, which would be where the two regressions were estimated
separately and the RSS summed. The restricted RSS would be the one that
resulted from the regression imposing the restriction, which would be
where the coefficients were forced to be equal for the two sub-samples,
i.e. where only one regression is conducted on the whole sample together.
26. Suppose that the residual sum of squares for the three
regressions corresponding to the Chow test described in question
25 are 156.4, 76.2 and 61.9. What is the value of the Chow F-test statistic?
4.3
7.6
5.3
8.6
Recall that the formula for calculating a Chow test is
F = ((RSS - (RSS1 + RSS2)) / (RSS1 + RSS2)) × (T - 2k) / k
Plugging the relevant numbers into the formula would give
((156.4 - (76.2 + 61.9)) / (76.2 + 61.9)) × (200 - 6) / 3
= 8.57.
Thus d is correct to one decimal place. In this case, the test is one of how
big the increase in the RSS is when the data are forced to be generated by
a single regression equation. T is the number of observations in the whole
sample, while k usually denotes the number of regressors in the
unrestricted regression including a constant (or the number of parameters
to be estimated in the unrestricted regression) in the standard F-test
formula. Since the unrestricted regression now comes in two parts, the
total number of parameters to be estimated in the unrestricted regression
is 2k. The number of restrictions will be k, since the restriction is that each
of the parameter values are equal across the two sub-samples.
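The arithmetic above can be checked directly; the function below (my own naming) is just the Chow formula with the question's figures plugged in:

```python
def chow_f_stat(rss_pooled, rss1, rss2, T, k):
    """Chow test statistic:
    F = ((RSS - (RSS1 + RSS2)) / (RSS1 + RSS2)) * (T - 2k) / k,
    where T is the total sample size and k the number of parameters
    estimated in each sub-sample regression."""
    rss_unrestricted = rss1 + rss2
    return ((rss_pooled - rss_unrestricted) / rss_unrestricted) * (T - 2 * k) / k

# Figures from the question: pooled RSS 156.4, sub-sample RSS 76.2 and 61.9,
# T = 200 observations, k = 3 parameters per sub-sample regression
print(round(chow_f_stat(156.4, 76.2, 61.9, 200, 3), 2))  # 8.57
```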
27. What would be the appropriate 5% critical value for the test
described in questions 25 and 26?
2.6
8.5
1.3
9.2
The degrees of freedom for a standard F-test are given by (m, T-k). In this
case, the total number of parameters to be estimated in the unrestricted
regression is 2k, and the number of restrictions is k, so that the
appropriate critical value would be one from an F(k, T-2k) distribution. T =
200, k = 3, so this would be an F(3,194). The closest value to this in the
table will be an F(3,200) with critical value 2.6.
29. If the two RSS for the test described in question 28 are 156.4
and 128.5, what is the value of the test statistic?
13.8
14.3
8.3
8.6
Is a well-specified model
Is a mis-specified model
By definition, a parsimonious model is one that uses as few variables as
possible to explain the important features of the data. A parsimonious
model could be either well-specified or mis-specified, while a model
including too many variables may also be termed a profligate or an
over-parameterised model.
Is a well-specified model
Is a mis-specified model
(iii) The assumption that the explanatory variables are non-stochastic will
be violated
(i) only
(i) and (ii) only
(i), (ii), and (iii) only
(iv) only
When there is measurement error in an explanatory variable, all of (i) to
(iii) could occur. So parameter estimation may be inconsistent (thus
parameter estimates will not converge upon their true values even as the
sample size tends to infinity), the parameter estimates will be biased
towards zero, and obviously measurement error implies noise in the
explanatory variables, so that they will be stochastic.
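The bias towards zero (attenuation) from measurement error in an explanatory variable can be illustrated with a small simulation. This is a sketch under assumed parameter values (true slope 2, unit-variance regressor and measurement error), with my own helper for the bivariate OLS slope:

```python
import random

def ols_slope(x, y):
    """Slope from a simple bivariate OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(0)
n = 10000
true_x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 * x + random.gauss(0.0, 0.5) for x in true_x]  # true slope = 2
noisy_x = [x + random.gauss(0.0, 1.0) for x in true_x]  # measurement error

print(ols_slope(true_x, y))   # close to the true slope of 2
print(ols_slope(noisy_x, y))  # attenuated towards zero
```

With equal variances for the true regressor and its measurement error, the classical errors-in-variables result predicts the estimated slope shrinks by roughly the factor var(x) / (var(x) + var(error)) = 1/2, which the simulation reproduces.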
(i) only
(iv) only
In the case where the explained variable has measurement error, there
will be no serious consequences: the standard regression framework is
designed to allow for this, as an error term also influences the value of the
explained variable. This is in stark contrast to the situation where there is
measurement error in the explanatory variables, which is a potentially
serious problem because they are assumed to be non-stochastic.