Lecture 10 Serial Correlation

In this lecture, you will learn the following:
1. What is the nature of autocorrelation?
2. What are the theoretical and practical consequences of autocorrelation?
3. Since the assumption of nonautocorrelation relates to the unobservable disturbance $\varepsilon_t$, how does one know that there is autocorrelation in any given situation?
4. What are the remedies for the problem of autocorrelation?


Nature of Autocorrelation

Below are some concepts concerning independence and serial correlation (or autocorrelation) in a time series.

Serial independence: the error terms $\varepsilon_t$ and $\varepsilon_s$, for different observations $t$ and $s$, are independently distributed. When one deals with time series data, this assumption is frequently violated. Error terms for time periods not too far apart may be correlated.

Serial correlation (or autocorrelation): the error terms $\varepsilon_t$ and $\varepsilon_s$, for $t \neq s$, are correlated. This property is frequently observed in time series data.



In addition, three factors can lead to serially correlated errors: (1) omitted variables, (2) ignored nonlinearities, and (3) measurement errors.

For example, suppose a dependent variable $Y_t$ is related to the independent variables $X_{t1}$ and $X_{t2}$, but the investigator does not include $X_{t2}$ in the model. The effect of this variable will be captured by the error term $\varepsilon_t$. Because many time series exhibit trends over time, $X_{t2}$ is likely to depend on $X_{t-1,2}, X_{t-2,2}, \ldots$. This will translate into apparent correlation between $\varepsilon_t$ and $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots$, thereby violating the serial independence assumption. Thus, growth in omitted variables could cause autocorrelation in errors.

Serial correlation can also be caused by misspecification of the functional form. Suppose, for example, that the relationship between $Y$ and $X$ is quadratic but we assume a straight line. Then the error term $\varepsilon_t$ will depend on $X^2$. If $X$ has been growing over time, $\varepsilon_t$ will also exhibit such growth, indicating autocorrelation.

Systematic errors in measurement can also cause autocorrelation. For example, suppose a firm is updating its inventory in a given period. If there is a systematic error in the way it was measured, cumulative inventory stock will reflect accumulated measurement errors. This will show up as serial correlation.

Example: Consider the consumption of electricity during different hours of the day. Because the temperature patterns are similar between successive time periods, we can expect consumption patterns to be correlated between neighboring periods. If the model is not properly specified, this effect may show up as high correlation among errors from nearby periods.

Example: Consider stock market data. The price of a particular security or a stock market index at the close of successive days or during successive hours is likely to be serially correlated.


Serial Correlation of the First Order

If serial correlation is present, then $\mathrm{Cov}(\varepsilon_t, \varepsilon_s) \neq 0$ for $t \neq s$; that is, the error for period $t$ is correlated with the error for period $s$. There are many forms of process that capture serial correlation. Two basic processes are the autoregressive (AR) process and the moving average (MA) process. Given that $u_t$ is white noise, $u_t \sim (0, \sigma_u^2)$, the $p$th-order AR process and the $q$th-order MA process are defined as follows:

AR(p): $\varepsilon_t = \rho_1 \varepsilon_{t-1} + \rho_2 \varepsilon_{t-2} + \cdots + \rho_p \varepsilon_{t-p} + u_t$

MA(q): $\varepsilon_t = u_t + \theta_1 u_{t-1} + \cdots + \theta_q u_{t-q}$

In this chapter, we only discuss serial correlation in the autoregressive form. Specify the regression model with serially correlated errors as:

$$y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$$

$$\varepsilon_t = \rho \varepsilon_{t-1} + u_t, \qquad -1 < \rho < 1$$

Here $\rho$ is called the first-order autocorrelation coefficient. The error term described above follows a first-order autoregressive process [AR(1)]. The white noise $u_t$ is assumed to satisfy the following conditions.

DEFINITION OF WHITE NOISE with ZERO MEAN: $\{u_t, t = 1, 2, \ldots, T\}$ are independently and identically distributed with zero mean and constant variance, so that $E(u_t) = 0$, $E(u_t^2) = \sigma_u^2 < \infty$, and $E(u_t u_{t-s}) = 0$ for $s \neq 0$.

REMARKS: With $u_t$ a white noise series with zero mean, $\varepsilon_t$ is correlated with all past errors. (Reason: $\varepsilon_t$ depends on $\varepsilon_{t-1}$, so they are correlated. Though $\varepsilon_t$ does not depend directly on $\varepsilon_{t-2}$, it does so indirectly through $\varepsilon_{t-1}$, because $\varepsilon_{t-1}$ depends on $\varepsilon_{t-2}$.) The covariance is

$$\mathrm{Cov}(\varepsilon_t, \varepsilon_{t-s}) = \sigma_\varepsilon^2 \rho^s, \quad \text{for } s \geq 0, \qquad \text{where } \sigma_\varepsilon^2 = \mathrm{Var}(\varepsilon_t) = \frac{\sigma_u^2}{1 - \rho^2}.$$

Positive autocorrelation: when the covariance is positive.
Negative autocorrelation: when the covariance is negative.

© Yin-Feng Gau 2002 ECONOMETRICS
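The geometric decay of the AR(1) autocorrelation can be checked numerically. The following is a minimal sketch, assuming Gaussian white noise and illustrative values ($\rho = 0.7$, $\sigma_u = 1$, $T = 20000$); the function names are hypothetical and chosen only for this example.

```python
import random

def simulate_ar1(rho, sigma_u, T, seed=0):
    """Generate eps_t = rho * eps_{t-1} + u_t with Gaussian white noise u_t."""
    rng = random.Random(seed)
    # Start from the stationary distribution so early draws are not atypical.
    eps = [rng.gauss(0.0, sigma_u / (1.0 - rho ** 2) ** 0.5)]
    for _ in range(T - 1):
        eps.append(rho * eps[-1] + rng.gauss(0.0, sigma_u))
    return eps

def sample_autocorr(series, lag):
    """Sample autocorrelation of a series at the given lag (mean-adjusted)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    num = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    den = sum(d * d for d in dev)
    return num / den

def theoretical_autocov(rho, sigma_u, s):
    """Cov(eps_t, eps_{t-s}) = rho**s * sigma_u**2 / (1 - rho**2) for s >= 0."""
    return rho ** s * sigma_u ** 2 / (1.0 - rho ** 2)

eps = simulate_ar1(rho=0.7, sigma_u=1.0, T=20000)
for s in (1, 2, 3):
    # Sample autocorrelation next to the theoretical value rho**s.
    print(s, round(sample_autocorr(eps, s), 3), round(0.7 ** s, 3))
```

With a positive $\rho$ every lag shows positive correlation; with a negative $\rho$ the sign alternates between odd and even lags.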


Consequences of Ignoring Serial Correlation

If we ignore serial correlation in the errors, the impacts on the OLS estimates are as follows.

OLS estimates (and forecasts based on them) are unbiased and consistent even if the error terms are serially correlated. The problem is with the efficiency of the estimates. In the proof of the Gauss-Markov Theorem that established efficiency, one of the steps involved minimizing the variance of the linear combination $\sum_t a_t \varepsilon_t$:

$$\mathrm{Var}\Big(\sum_t a_t \varepsilon_t\Big) = \sigma^2 \sum_t a_t^2 + \sum_{t \neq s} a_t a_s \mathrm{Cov}(\varepsilon_t, \varepsilon_s)$$

where $\sigma^2 = \mathrm{Var}(\varepsilon_t)$ and the second summation is over all $t$ and $s$ that are different. If $\mathrm{Cov}(\varepsilon_t, \varepsilon_s) \neq 0$, the second term on the right-hand side will not vanish. Therefore, the best linear unbiased estimator (BLUE) that minimizes $\mathrm{Var}(\sum_t a_t \varepsilon_t)$ will not be the same as the OLS estimator. That is, OLS estimates are not BLUE and are hence inefficient. Thus the consequences of ignoring autocorrelation are the same as those of ignoring heteroskedasticity: the OLS estimates and forecasts are unbiased and consistent, but inefficient.

If the serial correlation in $\varepsilon_t$ is positive and the independent variable $X_t$ grows over time, then the estimated residual variance ($\hat\sigma^2$) will be an underestimate and the value of $R^2$ will be an overestimate. In other words, the goodness of fit will be exaggerated and the estimated standard errors will be smaller than the true standard errors. In the general case, the estimated variances of the OLS estimates of the regression coefficients will be biased.


Effects on Tests of Hypotheses

In the case in which the serial correlation in $\varepsilon_t$ is positive and the independent variable $x_t$ grows over time, the estimated standard errors will be smaller than the true standard errors. Therefore, the t-statistics will be overestimated, and a regression coefficient that appears to be significant may not really be so.

Effects: The estimated variances of the parameters will be biased and inconsistent. Thus the t- and F-tests are no longer valid.




Effect on Forecasting

Forecasts based on OLS estimates will be unbiased, but they are inefficient, with larger variances. By explicitly taking into account the serial correlation among residuals, it is possible to generate better forecasts than those generated by the OLS procedure.

Suppose we ignore the AR(1) serial correlation and obtain the OLS estimates $\hat\beta_0$ and $\hat\beta_1$. The OLS prediction would be $\hat y_t = \hat\beta_0 + \hat\beta_1 x_t$. However, in the case of first-order serial correlation, $\varepsilon_t$ is predictable from $\varepsilon_t = \rho \varepsilon_{t-1} + u_t$, provided $\rho$ can be estimated (call the estimate $\hat\rho$). The predicted error is then $\hat\rho e_{t-1}$, where the residual for the previous period ($e_{t-1}$) is known at time $t$. Therefore, the AR(1) prediction will be

$$\hat y_t = \hat\beta_0 + \hat\beta_1 x_t + \hat\rho e_{t-1} = \hat\beta_0 + \hat\beta_1 x_t + \hat\rho (y_{t-1} - \hat\beta_0 - \hat\beta_1 x_{t-1})$$

by making use of the fact that $e_{t-1} = y_{t-1} - \hat\beta_0 - \hat\beta_1 x_{t-1}$. Thus $\hat y_t$ will be more efficient than the prediction obtained by the OLS procedure. The procedure for estimating $\rho$ is described below.
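The AR(1) forecast described above can be sketched in a few lines. The coefficient values and observations below are hypothetical numbers chosen for illustration, and the function names are not from the lecture.

```python
def ols_forecast(b0, b1, x_t):
    """Forecast that ignores serial correlation: b0 + b1 * x_t."""
    return b0 + b1 * x_t

def ar1_forecast(b0, b1, rho_hat, x_t, y_prev, x_prev):
    """Forecast that adds rho_hat times the previous period's residual."""
    e_prev = y_prev - b0 - b1 * x_prev  # residual known at time t
    return b0 + b1 * x_t + rho_hat * e_prev

# Hypothetical estimates and data (assumptions for this example only).
b0, b1, rho_hat = 1.0, 2.0, 0.5
print(ols_forecast(b0, b1, x_t=3.0))
print(ar1_forecast(b0, b1, rho_hat, x_t=3.0, y_prev=6.0, x_prev=2.0))
```

In this example the previous residual is $6 - 1 - 2 \cdot 2 = 1$, so the AR(1) forecast shifts the OLS forecast upward by $\hat\rho \cdot 1$.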

PROPERTY: If serial correlation among the stochastic disturbance terms in a regression model is ignored and the OLS procedure is used to estimate the parameters, the following properties hold:
1. The estimates and forecasts based on them will still be unbiased and consistent. The consistency property does not hold, however, if lagged dependent variables are included as explanatory variables.
2. The OLS estimates are no longer BLUE and will be inefficient. Forecasts will also be inefficient.
3. The estimated variances of the regression coefficients will be biased, and hence tests of hypotheses are invalid. If the serial correlation is positive and the independent variable $X_t$ is growing over time, then the standard errors will underestimate the true values. In that case the computed $R^2$ will also be an overestimate, indicating a better fit than actually exists, and the t-statistics will tend to appear more significant than they actually are.


Testing for First-Order Serial Correlation

The Residual Plot

Residual plot: a graph of the estimated residuals $e_t$ against time ($t$). If successive residuals tend to cluster on one side of the zero line or the other, it is a graphical indication of the presence of serial correlation. As the first step toward identifying the presence of serial correlation, it is good practice to plot $e_t$ against $t$ and look for the clustering effect.


The Durbin-Watson Test

Durbin and Watson (1950, 1951): For the multiple regression model with AR(1) errors,

$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + \varepsilon_t$$

$$\varepsilon_t = \rho \varepsilon_{t-1} + u_t, \qquad -1 < \rho < 1$$

the Durbin-Watson statistic is calculated in the following steps.

STEP 1: Estimate the model by OLS and compute the residuals $e_t = y_t - \hat\beta_0 - \hat\beta_1 x_{t1} - \cdots - \hat\beta_k x_{tk}$.

STEP 2: Compute the Durbin-Watson statistic:

$$d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$$

It is shown later that $0 \leq d \leq 4$. The exact distribution of $d$ depends on $\rho$, which is unknown, as well as on the observations on the $x$'s. Durbin and Watson (1950) showed that the distribution of $d$ is bounded by two limiting distributions. See Savin and White (1977) for the critical values for the limiting distributions of $d$, namely $d_U$ and $d_L$, for different sample sizes $T$ and numbers of coefficients $k$, not counting the constant term. These are used to construct critical regions for the Durbin-Watson test.

STEP 3a: To test $H_0: \rho = 0$ against $H_1: \rho > 0$ (one-tailed test), first find the critical values for the Durbin-Watson statistic, $d_L$ and $d_U$. Reject $H_0$ if $d \leq d_L$. If $d \geq d_U$, we cannot reject $H_0$. If $d_L < d < d_U$, the test is inconclusive.

STEP 3b: To test for negative serial correlation (that is, for $H_1: \rho < 0$), use $4 - d$. This is done when $d$ is greater than 2. If $4 - d \leq d_L$, we conclude that there is significant negative autocorrelation. If $4 - d \geq d_U$, we conclude that there is no negative autocorrelation. The test is inconclusive if $d_L < 4 - d < d_U$.
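Steps 2, 3a and 3b can be sketched as follows. This is a minimal illustration, not a replacement for the published tables: the bounds passed as `d_lower` and `d_upper` below are hypothetical placeholders, and in practice they must be looked up for the given $T$ and $k$.

```python
def durbin_watson(resid):
    """d = sum_{t=2}^T (e_t - e_{t-1})^2 / sum_{t=1}^T e_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

def dw_decision(d, d_lower, d_upper):
    """Apply Step 3a when d <= 2 and Step 3b (using 4 - d) when d > 2."""
    if d <= 2.0:
        stat, kind = d, "positive"
    else:
        stat, kind = 4.0 - d, "negative"
    if stat <= d_lower:
        return f"reject H0: {kind} autocorrelation"
    if stat >= d_upper:
        return "do not reject H0"
    return "inconclusive"

# Residuals that alternate in sign give a large d (negative autocorrelation).
d = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(d, dw_decision(d, d_lower=1.2, d_upper=1.4))  # bounds are placeholders
```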

REMARKS: The inconclusiveness of the DW test arises from the fact that there is no exact small-sample distribution for the DW statistic $d$. When the test is inconclusive, one might try the Lagrange multiplier test described next.

EXPLANATION: From the estimated residuals we can obtain an estimate of the first-order serial correlation coefficient as

$$\hat\rho = \frac{\sum_{t=2}^{T} e_t e_{t-1}}{\sum_{t=1}^{T} e_t^2}$$

This estimate is approximately equal to the one obtained by regressing $e_t$ against $e_{t-1}$ without a constant term. It can be shown that the DW statistic $d$ is approximately equal to $2(1 - \hat\rho)$:

$$d \approx 2(1 - \hat\rho)$$

Because $\hat\rho$ can range from $-1$ to $+1$, the range for $d$ is 0 to 4. When $\hat\rho$ is 0, $d$ is 2. Thus, a DW statistic of nearly 2 means there is no first-order serial correlation. A strong positive autocorrelation means $\hat\rho$ is close to $+1$. This indicates low values of $d$. Similarly, values of $d$ close to 4 indicate a strong negative correlation; that is, $\hat\rho$ is close to $-1$.

The DW test is invalid if the right-hand side of the regression equation includes lagged dependent variables: $y_{t-1}, y_{t-2}, \ldots$.
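The approximation $d \approx 2(1 - \hat\rho)$ can be made concrete. Expanding the numerator of $d$ gives the exact identity $d = 2(1 - \hat\rho) - (e_1^2 + e_T^2)/\sum_t e_t^2$, so the approximation error comes only from the first and last residuals. The sketch below checks this on a small set of hypothetical residuals.

```python
def rho_hat(resid):
    """First-order estimate: sum_{t=2}^T e_t e_{t-1} / sum_{t=1}^T e_t^2."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

def durbin_watson(resid):
    """d = sum_{t=2}^T (e_t - e_{t-1})^2 / sum_{t=1}^T e_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

# Hypothetical residuals, chosen only to illustrate the identity.
resid = [0.8, 0.5, 0.9, 0.2, -0.3, -0.6, -0.1, 0.4]
r = rho_hat(resid)
d = durbin_watson(resid)
end_term = (resid[0] ** 2 + resid[-1] ** 2) / sum(e * e for e in resid)

print("d                        =", d)
print("2(1 - rho_hat)           =", 2 * (1 - r))
print("2(1 - rho_hat) - endterm =", 2 * (1 - r) - end_term)
```

For long series the end term shrinks toward zero, which is why the approximation is usually close.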




The Lagrange Multiplier Test

The LM statistic is useful in identifying serial correlation not only of the first order but of higher orders as well. Here we confine ourselves to the first-order case. The general case of AR(p) is discussed later. Substituting the AR(1) error into the regression model gives

$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + \rho \varepsilon_{t-1} + u_t$$

The test for $\rho = 0$ can be treated as the LM test for the addition of the variable $\varepsilon_{t-1}$ (which is unknown, and hence one would use $e_{t-1}$ instead).

Steps for Carrying Out the LM Test:

STEP 1: Estimate the regression model by OLS and compute its estimated residuals, $e_t$.

STEP 2: Regress $e_t$ against a constant, $x_{t1}, \ldots, x_{tk}$, and $e_{t-1}$, using the $T - 1$ observations 2 through $T$. Then the LM statistic can be calculated as $(T - 1)R_e^2$, where $R_e^2$ is the R-squared from the auxiliary regression. $T - 1$ is used because the effective number of observations is $T - 1$.

STEP 3: Reject the null hypothesis of zero autocorrelation in favor of the alternative that $\rho \neq 0$ if $(T - 1)R_e^2 > \chi^2_{1,1-\alpha}$, the value of $\chi^2_1$ in the chi-square distribution with 1 d.f. such that the area to the right of it is $\alpha$, where $\alpha$ is the significance level.

REMARKS: If there were serial correlation in the residuals, we would expect $e_t$ to be related to $e_{t-1}$. This is the motivation behind the auxiliary regression in which $e_{t-1}$ is included along with all the independent variables in the model. The LM test does not have the inconclusiveness of the DW test. However, the LM test is a large-sample test and would need at least 30 d.f. to be meaningful.
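The three steps of the first-order LM test can be sketched as below. This is an illustration under stated assumptions: a single regressor, simulated data with $\rho = 0.7$, and a small hand-written OLS helper (`ols`, solving the normal equations) so that the example needs only the standard library. The value 3.841 is the 5% critical value of the chi-square distribution with 1 d.f.

```python
import random

def ols(X, y):
    """OLS via the normal equations (X'X)b = X'y, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    return beta, resid

def r_squared(y, resid):
    """R-squared = 1 - SSR / SST, with SST taken about the mean of y."""
    mean = sum(y) / len(y)
    sst = sum((v - mean) ** 2 for v in y)
    return 1.0 - sum(e * e for e in resid) / sst

def lm_test_ar1(x, y, crit=3.841):
    """Return ((T - 1) * R_e^2, reject?) for a model with one regressor x."""
    T = len(y)
    _, e = ols([[1.0, xi] for xi in x], y)                  # Step 1
    X_aux = [[1.0, x[t], e[t - 1]] for t in range(1, T)]    # Step 2
    e_dep = e[1:]
    _, v = ols(X_aux, e_dep)
    lm = (T - 1) * r_squared(e_dep, v)
    return lm, lm > crit                                    # Step 3

# Simulated data with AR(1) errors (all values are assumptions for this example).
rng = random.Random(1)
T, rho = 500, 0.7
x = [rng.gauss(0.0, 1.0) for _ in range(T)]
eps = [0.0]
for _ in range(T - 1):
    eps.append(rho * eps[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, eps)]

lm, reject = lm_test_ar1(x, y)
print(round(lm, 2), reject)
```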




Treatment of Serial Correlation

Model Formulation in First Differences

Granger and Newbold (1974 and 1976) have cautioned against spurious regressions that might arise when a regression is based on levels of trending variables, especially when the DW statistic is significant. A common way to get around this problem is to formulate models in terms of first differences, the difference between the value at time $t$ and the value at time $t - 1$. That is, we estimate

$$\Delta y_t = \beta_1 \Delta x_t + u_t$$

where $\Delta y_t = y_t - y_{t-1}$ and $\Delta x_t = x_t - x_{t-1}$. However, the solution of using first differences might not always be appropriate. The first difference model can be rewritten as

$$y_t = y_{t-1} + \beta_1 x_t - \beta_1 x_{t-1} + u_t$$

so first differencing imposes a coefficient of exactly 1 on $y_{t-1}$, which corresponds to assuming $\rho = 1$ in the AR(1) error model rather than estimating it.


Estimation Procedures

When modified functional forms do not eliminate autocorrelation, several estimation procedures are available that will produce more efficient estimates than those obtained by the OLS procedure. These methods need to be applied only for time series data.

REMARKS: The DW test is meaningless for cross-section data because one can rearrange the observations in any manner and get a DW statistic that is acceptable. Because time series data cannot be rearranged, one needs to be concerned about possible serial correlation.

Some common procedures for estimating models with AR(1) serial correlation are listed below.

Cochrane-Orcutt (CORC) Iterative Procedure

Cochrane and Orcutt (1949): This procedure requires the transformation of the regression model to a form in which the OLS procedure is applicable.




Quasi-differencing or generalized differencing transformation: generate the variables $y^*$ and $x^*$. Rewriting the model for period $t - 1$, we get

$$y_{t-1} = \beta_0 + \beta_1 x_{t-1,1} + \beta_2 x_{t-1,2} + \cdots + \beta_k x_{t-1,k} + \varepsilon_{t-1}$$

Multiplying by $\rho$ and subtracting from the original equation, we obtain

$$y_t - \rho y_{t-1} = \beta_0 (1 - \rho) + \beta_1 [x_{t1} - \rho x_{t-1,1}] + \beta_2 [x_{t2} - \rho x_{t-1,2}] + \cdots + \beta_k [x_{tk} - \rho x_{t-1,k}] + u_t$$

where we have used the fact that $\varepsilon_t = \rho \varepsilon_{t-1} + u_t$. Rewrite this equation again as

$$y_t^* = \beta_0^* + \beta_1 x_{t1}^* + \beta_2 x_{t2}^* + \cdots + \beta_k x_{tk}^* + u_t \qquad (10.1)$$

where

$$y_t^* = y_t - \rho y_{t-1}, \qquad \beta_0^* = \beta_0 (1 - \rho), \qquad x_{ti}^* = x_{ti} - \rho x_{t-1,i},$$

for $t = 2, 3, \ldots, T$ and $i = 1, \ldots, k$. Note that the error term satisfies all the properties needed for applying the OLS procedure. If $\rho$ were known, we could apply OLS to the transformed $y^*$ and $x^*$ and obtain estimates that are BLUE. However, $\rho$ is unknown and has to be estimated from the sample.

Steps for Carrying Out the CORC Procedure:

STEP 1: Estimate the original equation by OLS and compute its residuals $e_t$.

STEP 2: Estimate the first-order serial correlation coefficient ($\hat\rho$) by regressing $e_t$ against $e_{t-1}$.

STEP 3: Transform the variables as follows: $y_t^* = y_t - \hat\rho y_{t-1}$, $x_{t1}^* = x_{t1} - \hat\rho x_{t-1,1}$, and so on.

STEP 4: Regress $y_t^*$ against a constant, $x_{t1}^*, x_{t2}^*, \ldots, x_{tk}^*$ and get the OLS estimates of $\beta_0^*$ and $\beta_j$, $j = 1, \ldots, k$.

STEP 5: Derive the estimate of $\beta_0$ as $\hat\beta_0 = \hat\beta_0^* / (1 - \hat\rho)$. Plug $\hat\beta_0$ and $\hat\beta_j$, $j = 1, \ldots, k$, into the original regression, and then obtain a new set of estimates for $\varepsilon_t$. Then go back and repeat Step 2 with these new values until the following stopping rule applies.






STEP 6: This iterative procedure can be stopped when the estimates of $\rho$ from two successive iterations differ by no more than some preselected value, such as 0.001. The final $\hat\rho$ is then used to get the CORC estimates for the transformed regression.

Hildreth-Lu (HILU) Search Procedure

Steps of the Hildreth and Lu (1960) Procedure:

STEP 1: Choose a value of $\rho$ (say $\rho_1$). Using this value, transform the variables and estimate the transformed regression by OLS.

STEP 2: From these estimates, derive $\hat u_t$ from Equation (10.1) and the error sum of squares associated with it. Call it $\mathrm{SSR}_u(\rho_1)$. Next choose a different $\rho$ ($\rho_2$) and repeat Steps 1 and 2.

STEP 3: By varying $\rho$ from $-1$ to $+1$ in some systematic way (say, at steps of length 0.05 or 0.01), we can get a series of values of $\mathrm{SSR}_u(\rho)$. Choose the $\rho$ for which $\mathrm{SSR}_u$ is a minimum. This is the final $\rho$ that globally minimizes the error sum of squares of the transformed model. Equation (10.1) is then estimated with the final $\rho$ as the optimum solution.

A Comparison of the Two Procedures

The HILU procedure basically searches for the value of $\rho$ between $-1$ and $+1$ that minimizes the sum of squares of the residuals of the transformed equation. If the step intervals are small, the procedure involves running a large number of regressions; hence, compared to the CORC procedure, the HILU method is computer intensive. The CORC procedure iterates to a local minimum of $\mathrm{SSR}(\rho)$ and might miss the global minimum if there is more than one local minimum.
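Both procedures can be sketched for the one-regressor case. This is a minimal illustration under stated assumptions, not a definitive implementation: the data are simulated with $\rho = 0.6$, $\beta_0 = 1$, $\beta_1 = 2$; the function names are hypothetical; and the HILU grid stops at $\pm 0.99$ because $\beta_0 = \beta_0^*/(1 - \rho)$ is undefined at $\rho = 1$.

```python
import random

def ols_simple(x, y):
    """Closed-form OLS of y on a constant and one regressor x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b1 * mx, b1

def quasi_diff(z, rho):
    """z*_t = z_t - rho * z_{t-1} for t = 2, ..., T."""
    return [z[t] - rho * z[t - 1] for t in range(1, len(z))]

def fit_transformed(x, y, rho):
    """OLS on Equation (10.1); returns beta0, beta1 and the transformed SSR."""
    xs, ys = quasi_diff(x, rho), quasi_diff(y, rho)
    b0_star, b1 = ols_simple(xs, ys)
    ssr = sum((yi - b0_star - b1 * xi) ** 2 for xi, yi in zip(xs, ys))
    return b0_star / (1.0 - rho), b1, ssr

def cochrane_orcutt(x, y, tol=0.001, max_iter=100):
    """Iterate Steps 2-5 until successive rho estimates differ by at most tol."""
    b0, b1 = ols_simple(x, y)                                # Step 1
    rho = 0.0
    for _ in range(max_iter):
        e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        new_rho = (sum(e[t] * e[t - 1] for t in range(1, len(e)))
                   / sum(v * v for v in e[:-1]))             # Step 2
        b0, b1, _ = fit_transformed(x, y, new_rho)           # Steps 3-5
        if abs(new_rho - rho) <= tol:                        # Step 6
            return b0, b1, new_rho
        rho = new_rho
    return b0, b1, rho

def hildreth_lu(x, y, step=0.01):
    """Grid search over rho for the smallest SSR of the transformed model."""
    n = round(0.99 / step)
    grid = [i * step for i in range(-n, n + 1)]
    best = min(grid, key=lambda r: fit_transformed(x, y, r)[2])
    b0, b1, _ = fit_transformed(x, y, best)
    return b0, b1, best

# Simulated data (assumed values, for illustration only).
rng = random.Random(42)
T, rho_true = 2000, 0.6
x = [rng.gauss(0.0, 1.0) for _ in range(T)]
eps = [0.0]
for _ in range(T - 1):
    eps.append(rho_true * eps[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, eps)]

print("CORC:", cochrane_orcutt(x, y))
print("HILU:", hildreth_lu(x, y))
```

The two searches differ in cost exactly as described above: CORC runs a handful of regressions, while HILU runs one regression per grid point.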


High-Order Serial Correlation

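$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + \varepsilon_t$$

$$\varepsilon_t = \rho_1 \varepsilon_{t-1} + \rho_2 \varepsilon_{t-2} + \rho_3 \varepsilon_{t-3} + \cdots + \rho_p \varepsilon_{t-p} + u_t$$

AR(p): $p$th-order autoregressive process of the error terms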







Lagrange Multiplier Test of Higher-Order Autocorrelation

Combining the above two equations, we obtain

$$y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + \rho_1 \varepsilon_{t-1} + \rho_2 \varepsilon_{t-2} + \cdots + \rho_p \varepsilon_{t-p} + u_t$$

The null hypothesis is that each of the $\rho$'s is zero (that is, $\rho_1 = \rho_2 = \cdots = \rho_p = 0$), against the alternative that at least one of them is not zero. The null hypothesis is very similar to the one for testing the addition of new variables. In this case, the new variables are $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-p}$, which can be estimated by $e_{t-1}, e_{t-2}, \ldots, e_{t-p}$.

Steps of the LM test for higher-order AR(p) serial correlation:

STEP 1: Estimate the original regression by OLS and obtain the residuals $e_t$.

STEP 2: Regress $e_t$ against a constant, all the independent variables $x_1, \ldots, x_k$, plus $e_{t-1}, e_{t-2}, \ldots, e_{t-p}$. The effective number of observations used in the auxiliary regression is $T - p$ because $e_{t-p}$ is defined only for periods $p + 1$ to $T$.

STEP 3: Compute $(T - p)R_e^2$ from the auxiliary regression run in Step 2. If it exceeds $\chi^2_{p,1-\alpha}$, the value of the chi-square distribution with $p$ d.f. such that the area to the right is $\alpha$, then reject $H_0: \rho_1 = \rho_2 = \cdots = \rho_p = 0$ in favor of $H_1$: at least one of the $\rho$'s is significantly different from zero.
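The higher-order LM test follows the same pattern as the first-order case, with $p$ lagged residuals in the auxiliary regression. The sketch below is an illustration only: the AR(2) coefficients (0.5 and 0.3), the single regressor, and the helper names are assumptions, and 5.991 is the 5% critical value of the chi-square distribution with 2 d.f.

```python
import random

def ols_resid(X, y):
    """Residuals from OLS of y on the columns of X (normal equations, Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]

def lm_test_arp(x_rows, y, p):
    """Return (T - p) * R_e^2 from the auxiliary regression with p lagged residuals."""
    T = len(y)
    e = ols_resid([[1.0] + list(row) for row in x_rows], y)          # Step 1
    X_aux = [[1.0] + list(x_rows[t]) + [e[t - j] for j in range(1, p + 1)]
             for t in range(p, T)]                                   # Step 2
    dep = e[p:]
    v = ols_resid(X_aux, dep)
    mean = sum(dep) / len(dep)
    r2 = 1.0 - sum(w * w for w in v) / sum((d - mean) ** 2 for d in dep)
    return (T - p) * r2                                              # Step 3

# Simulated regression with AR(2) errors (assumed values).
rng = random.Random(7)
T = 500
x_rows = [[rng.gauss(0.0, 1.0)] for _ in range(T)]
eps = [0.0, 0.0]
for _ in range(T - 2):
    eps.append(0.5 * eps[-1] + 0.3 * eps[-2] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * row[0] + e_t for row, e_t in zip(x_rows, eps)]

lm = lm_test_arp(x_rows, y, p=2)
print(round(lm, 2), lm > 5.991)
```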


Estimating a Model with General Order Autoregressive Errors

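If the LM test rejects the null hypothesis of no serial correlation, we must efficiently estimate the parameters of the equation below with $p$th-order autoregressive errors:

$$y_t - \rho_1 y_{t-1} - \rho_2 y_{t-2} - \rho_3 y_{t-3} - \cdots - \rho_p y_{t-p} = \beta_0 (1 - \rho_1 - \rho_2 - \cdots - \rho_p) + \beta_1 [x_{t1} - \rho_1 x_{t-1,1} - \cdots - \rho_p x_{t-p,1}] + \cdots + \beta_k [x_{tk} - \rho_1 x_{t-1,k} - \cdots - \rho_p x_{t-p,k}] + u_t$$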




STEP 1: Estimate the original regression model by OLS and obtain the residuals, $e_t$.

STEP 2: Regress $e_t$ against $e_{t-1}, e_{t-2}, \ldots, e_{t-p}$ (with no constant term) to obtain the estimates $\hat\rho_1$, $\hat\rho_2$, and so on, of the coefficients of $e_{t-j}$, $j = 1, 2, \ldots, p$. Here only $T - p$ observations are used.

STEP 3: Using these estimates $\hat\rho_j$, $j = 1, \ldots, p$, transform $y$ and the $x$'s to get the new transformed variables $y^*$ and $x^*$'s as $y_t^* = y_t - \hat\rho_1 y_{t-1} - \cdots - \hat\rho_p y_{t-p}$ and $x_{ti}^* = x_{ti} - \hat\rho_1 x_{t-1,i} - \cdots - \hat\rho_p x_{t-p,i}$, $i = 1, 2, \ldots, k$.

STEP 4: Estimate the transformed model by regressing $y_t^*$ against a constant and $x_{ti}^*$, $i = 1, 2, \ldots, k$. Using the estimate $\hat\beta_0^*$, we calculate $\hat\beta_0 = \hat\beta_0^* / (1 - \hat\rho_1 - \hat\rho_2 - \cdots - \hat\rho_p)$.

STEP 5: Then plug this $\hat\beta_0$ along with the estimates $\hat\beta_j$, $j = 1, 2, \ldots, k$, into the original regression to compute a revised estimate of the residuals $\hat\varepsilon_t$. Then go back to Step 2 and iterate until some criterion is satisfied.

REMARKS: The final $\hat\rho$'s obtained when the iterations stop can then be used to make one last transformation of the data to estimate the $\beta$'s.
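The transformation in Step 3 and the intercept recovery in Step 4 are the only parts that change relative to the AR(1) case. The sketch below shows them in isolation, using hypothetical function names and made-up numbers.

```python
def quasi_difference(series, rhos):
    """z*_t = z_t - rho_1 z_{t-1} - ... - rho_p z_{t-p}, for t = p+1, ..., T."""
    p = len(rhos)
    return [series[t] - sum(r * series[t - j] for j, r in enumerate(rhos, start=1))
            for t in range(p, len(series))]

def recover_intercept(b0_star, rhos):
    """beta_0 = beta_0* / (1 - rho_1 - ... - rho_p)."""
    return b0_star / (1.0 - sum(rhos))

# Made-up series and AR(2) coefficient estimates, for illustration only.
y = [1.0, 2.0, 3.0, 4.0]
rhos = [0.5, 0.25]
print(quasi_difference(y, rhos))     # T - p = 2 transformed observations
print(recover_intercept(0.5, rhos))
```

Each transformed series loses its first $p$ observations, which is why the regressions in Steps 2 and 4 use only $T - p$ observations.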


Forecasts and Goodness of Fit in AR Models

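The predicted $y_t$ is:

$$\hat y_t = \hat\beta_0 + \hat\beta_1 x_{t1} + \cdots + \hat\beta_k x_{tk} + \hat\rho_1 e_{t-1} + \hat\rho_2 e_{t-2} + \cdots + \hat\rho_p e_{t-p}$$

The forecast of $y_t$ obtained this way will be more efficient than the OLS prediction that ignores the $e$ terms.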


Engle's ARCH Test

The variance of prediction errors is often not constant but differs from period to period. For instance, free-floating exchange rates fluctuate a great deal, making their variances larger. Increased volatility of security prices is often an indicator that the variances are not constant over time.




Engle (1982) introduced a new approach to modelling heteroskedasticity in a time series context. Consider the regression model

$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + \varepsilon_t$$

where $\mathrm{Var}(\varepsilon_t \mid F_{t-1}) = \sigma_t^2$.

ARCH (autoregressive conditional heteroskedasticity) model: Specify the conditional variance process as follows:

$$\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \cdots + \alpha_p \varepsilon_{t-p}^2$$

This conditional variance process is denoted as the $p$th-order ARCH, or ARCH(p), process. The conditional variance at time $t$ depends on the squared errors in previous periods, hence the term conditional heteroskedasticity.

ARCH test: this is a test of the null hypothesis $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_p = 0$.

STEP 1: Estimate the $\beta$'s in the original equation by OLS.

STEP 2: Compute the residual $e_t = y_t - \hat\beta_0 - \hat\beta_1 x_{t1} - \hat\beta_2 x_{t2} - \cdots - \hat\beta_k x_{tk}$, square it to obtain $e_t^2$, and generate $e_{t-1}^2, e_{t-2}^2, \ldots, e_{t-p}^2$.

STEP 3: Regress $e_t^2$ against a constant, $e_{t-1}^2, e_{t-2}^2, \ldots$, and $e_{t-p}^2$. This is the auxiliary regression, which uses $T - p$ observations and has its own error term $v_t$.

STEP 4: Using $R_{e^2}^2$, the R-squared of the auxiliary regression, compute $\mathrm{LM} = (T - p)R_{e^2}^2$. Under the null hypothesis $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_p = 0$, $(T - p)R_{e^2}^2$ has the chi-square distribution with $p$ d.f. Reject $H_0$ if $(T - p)R_{e^2}^2 > \chi^2_{p,1-\alpha}$, the point on $\chi^2_p$ with an area $\alpha$ to the right of it.
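Steps 3 and 4 of the ARCH test can be sketched as follows. To keep the example short, it simulates ARCH(1) errors directly and treats them as the residuals $e_t$, skipping the OLS fit of Steps 1 and 2; the parameter values ($\alpha_0 = 1$, $\alpha_1 = 0.5$), the helper names, and the Gaussian shocks are assumptions. The value 3.841 is the 5% critical value of the chi-square distribution with 1 d.f.

```python
import random

def ols_resid(X, y):
    """Residuals from OLS of y on the columns of X (normal equations, Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]

def arch_lm_test(resid, p):
    """Return (T - p) * R^2 from regressing e_t^2 on a constant and p lags of e_t^2."""
    e2 = [e * e for e in resid]
    T = len(e2)
    X = [[1.0] + [e2[t - j] for j in range(1, p + 1)] for t in range(p, T)]
    dep = e2[p:]
    v = ols_resid(X, dep)
    mean = sum(dep) / len(dep)
    r2 = 1.0 - sum(w * w for w in v) / sum((d - mean) ** 2 for d in dep)
    return (T - p) * r2

# Simulated ARCH(1) errors standing in for OLS residuals (assumed values).
rng = random.Random(3)
alpha0, alpha1, T = 1.0, 0.5, 2000
e = [rng.gauss(0.0, 1.0)]
for _ in range(T - 1):
    sigma2 = alpha0 + alpha1 * e[-1] ** 2   # conditional variance at time t
    e.append(rng.gauss(0.0, sigma2 ** 0.5))

lm = arch_lm_test(e, p=1)
print(round(lm, 2), lm > 3.841)
```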

