
FINANCIAL ECONOMETRICS AND EMPIRICAL FINANCE - MODULE 2

General Exam Solutions - July 2012


Time Allowed: 100 Minutes
Family Name (Surname)

First Name

Student Number (Matr.)

Please answer all the questions by choosing the most appropriate alternative(s) or by
writing your answers in the spaces provided. You always need to carefully justify and
show your work in the case of open questions. There might be more than one correct
answer for each of the multiple choice questions. Each selected alternative that is a
correct answer will be awarded one point. Wrong answers will be penalized with minus
0.5 point. Correct answers not selected and questions that have been left blank will receive
zero points. Only answers explicitly reported in the appropriate box will be considered.
In the multiple choice case, report your selection by writing one or more of the letters A,
B, C, D, ..., M, in BLOCK CAPITAL LETTERS. No other answers on the exam paper
or indication pointing to potential answers will be taken into consideration.

Section 1
Question 1.1
What does the following graph report?

[Figure: two time series, labelled Country 1 and Country 2, plotted from 1973 to 2011; the vertical axis runs from 0 to 0.16.]

(A) UK and US 3-year total, cumulative stock market returns.


(B) UK and German dividend-price ratios.
(C) The yields to maturity on Italian and UK 10-year Government Bonds.
(D) The yields to maturity on US and German 10-year Government Bonds.
(E) Two time series defined in nominal terms.


(F) The annual Consumer Price Index inflation rates in Germany and France.
Answer(s)
D, E
Question 1.2
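Consider the following model for two time series, $y_t$ and $x_t$:
$$y_t = 0.5\,y_{t-1} + x_t + \epsilon_{1,t}$$
$$x_t = 0.5\,x_{t-1} + \epsilon_{2,t}$$
$$\begin{bmatrix}\epsilon_{1,t}\\ \epsilon_{2,t}\end{bmatrix} \sim \text{IID } N\left(\begin{bmatrix}0\\ 0\end{bmatrix}, \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\right)$$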

Indicate which of the following statements is/are correct:


(A) The time series is weakly stationary.
(B) The unconditional variance of is larger than the conditional one-step ahead variance.
(C) The path of forecasts for + conditional on the information available at time will be oscillatory
but it will eventually converge to 0.1.
(D) If a sample of 1000 observations were simulated from this model for , then only at most five
values will be larger than 6.
(E) The upper bound for the 95 percent conditional 1-step ahead confidence interval for is 2.
(F) The series and share a common stochastic trend.
(G) The series and have the same unconditional mean.
(H) If = 1 = 1 then [+1 | ] = 1.

(I) If = 1 = 1 1 = 1 and 2 = 1 then [+1 | ] 6= 1.

(L) If = 0 = 08then [+1 | ] = 08

Answer(s)
A, B, G, H

Debriefing:
(A) True: the unconditional moments of the series do not depend on time.
(B) True, as the process is stationary with some persistence.
(C) False: the path of forecasts will converge monotonically to the unconditional mean, which is zero
since the intercept is zero.
(D) The answer relies on computing the 99.5% quantile (because 5/1000 = 0.5%): the unconditional mean plus the 0.5% upper-tail Gaussian critical value times the unconditional standard deviation gives
$$0 + 2.6\sqrt{\tfrac{10}{9}} \approx 2.74 < 6,$$
which establishes that even on average only 5 simulated observations will be larger than 2.74, not 6. This
implies that far fewer than 5 observations will exceed 6. Moreover, notice that in each single simulation
path (i.e., if one removes the "on average" clause used in the answer above) anything can happen, so that
in any event it is incorrect to state that at most five values will be larger than 6.
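(E) Incorrect: the bound is approximately the one-step-ahead conditional mean plus two conditional standard deviations, i.e., 0.5 times the current value of each of the two series, plus 2. This depends on the current values of the two series and is in general different from 2 unless both are equal to zero.
(F) Incorrect: because both series are stationary, they have no stochastic trend.
(G) Correct, the unconditional mean of both series is zero.
(H) Correct, as the one-step-ahead conditional mean is 0.5(1) + 0.5(1) = 1.
(I) Incorrect, because any information on the residuals is already included in the current values of the two series, so that the correct answer is already in (H).
(L) Incorrect: the one-step-ahead conditional mean is 0.5(0) + 0.5(0.8) = 0.4.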
Question 1.3
A researcher is using a Gaussian GARCH(1,1) model to forecast the variance of stock returns:
$$R_{t+1} = \sigma_{t+1} z_{t+1}, \qquad z_{t+1} \sim \text{IID } N(0,1)$$
$$\sigma^2_{t+1} = \omega + \alpha R_t^2 + \beta \sigma_t^2$$
Indicate which of the following is/are correct?
(A) Because under a GARCH(1,1) model the forecast of variance changes over time, then the distribution of stock returns will be non-stationary.
(B) In spite of the fact that under a GARCH(1,1) model the forecast of variance changes over time,
the distribution for stock returns may still be stationary if the condition $\alpha + \beta < 1$ is satisfied.
(C) Because under a GARCH(1,1) model the forecast of variance changes over time, in spite of the
fact that $z_{t+1} \sim$ IID $N(0,1)$, a GARCH model may generate pervasive non-stationarities in stock returns
independently of the values taken by the parameters $\alpha$ and $\beta$.
(D) In spite of the fact that under a GARCH(1,1) model the forecast of variance changes over time,
the distribution for stock returns may still be stationary if the condition $\alpha + \beta > 1$ is satisfied.

(E) Because under a GARCH(1,1) model the forecast of variance changes over time, in spite of the
fact that $z_{t+1} \sim$ IID $N(0,1)$, a GARCH model may generate pervasive non-normalities in stock returns.
(F) None of the above.
Answer(s)
B, E
Debriefing:
(A) This is incorrect: we know that time-varying GARCH variance need not imply a non-stationary
distribution for returns, unless of course either the mean or the variance process is itself non-stationary,
which is not implied or suggested by the question.
(B) Obviously correct, see lecture 1, second part of the course.
(C) This is the same as A, just stated in a more convoluted and complicated way; in any event, this
answer remains incorrect.
(D) Absurd, this answer actually simply states the opposite of what is true.
(E) That is correct, as discussed in the lectures: in spite of the fact that $z_{t+1} \sim$ IID $N(0,1)$, when $\sigma^2_{t+1}$
follows a GARCH process, its time variation induces non-normalities in stock returns as a result of the
fact that $\sigma_{t+1} z_{t+1}$, a product of two random variables, becomes itself time-varying.
(F) Because B and E are correct, F cannot be correct.
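The non-normality claimed in (E) can be illustrated with the standard closed-form result for the unconditional kurtosis of a Gaussian GARCH(1,1), $3[1-(\alpha+\beta)^2]/[1-(\alpha+\beta)^2-2\alpha^2]$, which exceeds 3 whenever $\alpha > 0$ and the fourth moment exists. The sketch below is only an illustration: the function name and the parameter values are hypothetical and are not taken from the exam.

```python
def garch11_kurtosis(alpha: float, beta: float) -> float:
    """Unconditional kurtosis of returns under a Gaussian GARCH(1,1)."""
    persistence = alpha + beta
    denom = 1.0 - persistence**2 - 2.0 * alpha**2
    if persistence >= 1.0 or denom <= 0.0:
        raise ValueError("fourth moment does not exist for these parameters")
    return 3.0 * (1.0 - persistence**2) / denom


# Hypothetical parameter values, chosen only for illustration.
kurt = garch11_kurtosis(alpha=0.05, beta=0.90)
kurt_no_arch = garch11_kurtosis(alpha=0.0, beta=0.90)  # constant variance case
print(f"GARCH kurtosis: {kurt:.3f}, homoskedastic kurtosis: {kurt_no_arch:.3f}")
```

With $\alpha = 0$ the variance is constant and the kurtosis collapses to the Gaussian value of 3, so the excess kurtosis comes entirely from the time variation in variance.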
Question 1.4
Consider the Cornish-Fisher approximation for the inverse CDF function in correspondence to a
critical value $p$:
$$CF_p^{-1} = \Phi_p^{-1} + \frac{\zeta_1}{6}\left[(\Phi_p^{-1})^2 - 1\right] + \frac{\zeta_2}{24}\left[(\Phi_p^{-1})^3 - 3\Phi_p^{-1}\right] - \frac{\zeta_1^2}{36}\left[2(\Phi_p^{-1})^3 - 5\Phi_p^{-1}\right],$$
where $\zeta_1$ indicates the sample skewness coefficient, $\zeta_2$ is the sample excess kurtosis, and $\Phi_p^{-1}$ is the
inverse Gaussian CDF in correspondence to a critical value $p$. With reference to a time series of S&P
500 returns for which at time $t$ the forecast of time $t+1$ volatility is $\sigma_{t+1} = 2.35\%$, skewness is -0.68,
and at time $t$ the forecast of the time $t+1$ mean is $\mu_{t+1} = 0.13\%$, a colleague of yours has stated that she
has just computed a Cornish-Fisher 1% VaR of 9.14%. However, your colleague has forgotten to state
what the kurtosis of S&P 500 returns is in her data. Note that $\Phi_{0.01}^{-1} = -2.326$.
Please indicate which of the following statements is/are correct.
(A) The Cornish-Fisher 1% VaR estimate as well as the information provided on the properties of
S&P 500 returns imply that their excess kurtosis is approximately -3.336.
(B) The Cornish-Fisher 1% VaR estimate as well as the information provided on the properties of
S&P 500 returns imply that their kurtosis is approximately 5.526.
(C) The Cornish-Fisher 1% VaR estimate as well as the information provided on the properties of
S&P 500 returns imply that their excess kurtosis is approximately 8.526.
(D) The Cornish-Fisher 1% VaR estimate as well as the information provided on the properties of
S&P 500 returns imply that their kurtosis is approximately 1.336.
(E) None of the above.
Answer
E
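Debriefing. The calculation is straightforward: if $VaR^{0.01}_{t+1} = 9.14\%$, $\sigma_{t+1} = 2.35\%$, $\mu_{t+1} = 0.13\%$
and $VaR^{0.01}_{t+1} = -\mu_{t+1} - \sigma_{t+1} CF^{-1}_{0.01}$, then
$$CF^{-1}_{0.01} = -\frac{VaR^{0.01}_{t+1} + \mu_{t+1}}{\sigma_{t+1}} = -\frac{9.14 + 0.13}{2.35} = -3.945$$
Therefore it must be that
$$-3.945 = \Phi_{0.01}^{-1} + \frac{-0.68}{6}\left[(\Phi_{0.01}^{-1})^2 - 1\right] + \frac{\zeta_2}{24}\left[(\Phi_{0.01}^{-1})^3 - 3\Phi_{0.01}^{-1}\right] - \frac{(-0.68)^2}{36}\left[2(\Phi_{0.01}^{-1})^3 - 5\Phi_{0.01}^{-1}\right]$$
$$= -2.326 - \frac{0.68}{6}\left[(-2.326)^2 - 1\right] + \frac{\zeta_2}{24}\left[(-2.326)^3 - 3(-2.326)\right] - \frac{(-0.68)^2}{36}\left[2(-2.326)^3 - 5(-2.326)\right]$$
$$= -2.326 - 0.5 - 0.234\,\zeta_2 + 0.174$$
which implies
$$-1.293 = -0.234\,\zeta_2 \implies \zeta_2 = \frac{1.293}{0.234} = 5.526$$
excess kurtosis or a total kurtosis of 8.526 (remember that $\zeta_2$ is the coefficient of excess kurtosis, not
of total kurtosis). Therefore A is incorrect because, as you should know, excess kurtosis can never be
below -3 (being a positive number minus 3). B is incorrect because it states that kurtosis ought to be
5.526, not excess kurtosis. C is incorrect because it states that excess kurtosis ought to be 8.526, not
kurtosis. Answer D is simply off and unrelated to the question. Therefore answer E has to be correct.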
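The inversion used in this debriefing can be checked numerically. The sketch below is a minimal illustration under the assumptions of the question (VaR reported as a positive loss and defined as minus the mean minus volatility times the Cornish-Fisher quantile); the function name and argument names are hypothetical.

```python
from statistics import NormalDist


def implied_excess_kurtosis(var_pct, mu_pct, sigma_pct, skew, p):
    """Back out the excess kurtosis consistent with a Cornish-Fisher VaR."""
    z = NormalDist().inv_cdf(p)                     # Gaussian quantile
    cf_target = -(var_pct + mu_pct) / sigma_pct     # required CF quantile
    skew_term = skew / 6.0 * (z**2 - 1.0)
    skew_sq_term = skew**2 / 36.0 * (2.0 * z**3 - 5.0 * z)
    kurt_coeff = (z**3 - 3.0 * z) / 24.0            # multiplies excess kurtosis
    # CF = z + skew_term + zeta2 * kurt_coeff - skew_sq_term, solved for zeta2
    return (cf_target - z - skew_term + skew_sq_term) / kurt_coeff


zeta2 = implied_excess_kurtosis(9.14, 0.13, 2.35, -0.68, 0.01)
print(f"excess kurtosis: {zeta2:.2f}, total kurtosis: {zeta2 + 3.0:.2f}")
```

The result is an excess kurtosis of roughly 5.5, i.e., a total kurtosis of roughly 8.5; small differences from the hand calculation come from rounding the Gaussian quantile to -2.326.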
Question 1.5
You have just given a presentation to the board of directors of your institution in which you have
presented results for a number of predicted, back-tested (i.e., recursively computed over time) VaR measures (with their level set to range between 1 and 20%) for the overall portfolio held by your institution,
over a 10-year period. Such recursive VaR measures have been computed under five alternative models:
(i) a naive, homoskedastic Gaussian IID model with zero mean returns; (ii) a plain-vanilla Gaussian
GARCH(1,1) model with zero mean returns; (iii) a plain-vanilla Gaussian GARCH(1,1) model with
estimated mean returns, $\hat{\mu}$; (iv) a t-Student GARCH(1,1) model with estimated mean returns, $\hat{\mu}$; (v) a
DCC-GARCH(1,1) model estimated for the vector of returns on the main asset classes in the portfolio
of your bank. Note that models (i)-(iv) are estimated directly on the series of portfolio returns, i.e.,
these are passive risk management models, while (v) may be implemented as an active model.
A colleague of yours has criticized your results and in particular he/she has pointed out three alleged
inconsistencies that would affect your simulations/backtesting exercises:
1. There are long time periods in which the Gaussian IID VaR [model (i)] exceeds the VaR yielded
by the other models [(ii)-(v)] and this cannot be, because (so he/she says) we all know that fancier and
more complex econometric models ought to make us always more prudent in risk management terms.
2. There are time periods in which the t-Student GARCH(1,1) [model (iv)] yields VaR measures
that fail to exceed the Gaussian GARCH(1,1) measures [model (iii)] and this cannot be, because (so
he/she says) we all know that modelling shocks as t-Student shocks cannot but increase VaR, thus
making us more prudent.
3. There are time periods in which the plain-vanilla GARCH(1,1) model with zero mean returns
[model (ii)] yields VaR measures exceeding the Gaussian GARCH(1,1) measures with estimated mean
[model (iii)] and this cannot be, because (so he/she says) we all know that imposing a zero mean will
always reduce VaR measures, thus making us less prudent.
Your colleague asks that you be fired: on which counts is he/she correct, if any?
(A) 1 only.
(B) 1 and 3.
(C) 2 and 3.
(D) 3 only.
(E) 1 and 2.
(F) None.
Answer(s)
F
Debriefing. You should not be fired; maybe your fussy colleague should (as a minimum he/she
should be sent to financial econometrics bootcamp on the first occasion, as he/she has made a fool of
himself/herself with a sequence of ridiculous and generally incorrect claims):
1. There are long time periods in which the Gaussian IID VaR [model (i)] exceeds the VaR yielded
by the other models [(ii)-(v)] and this cannot be, because (so he/she says) we all know that fancier and
more complex econometric modelling ought to make us always more prudent in risk management terms.
NO, as we have seen in lab 2 (second part of the course), this can of course happen when time-varying
GARCH predicted variance is below the unconditional, constant variance that model (i) would use.

2. There are time periods in which the t-Student GARCH(1,1) [model (iv)] yields VaR measures
that fail to exceed the Gaussian GARCH(1,1) measures [model (iii)] and this cannot be, because we all
know (so he/she says) that modelling shocks as t-Student shocks cannot but increase VaR measures thus
making us more prudent. NO: it all depends on the level $p$ of the percent VaR to be computed. As we
have seen in class, while it is typical to expect that for low VaR levels $p$ a t-Student assumption will
normally inflate the risk exposure assessment, as $p$ grows (and here we are even entertaining a $p$ as high
as 20%), this may not be the case. For instance, we had one such example in Review Set 2, question 4a.
3. There are time periods in which the plain-vanilla GARCH(1,1) model with zero mean returns
[model (ii)] yields VaR measures exceeding the Gaussian GARCH(1,1) measures with estimated mean
[model (iii)] and this cannot be, because we all know (so he/she says) that imposing a zero mean will
always reduce VaR measures, thus making us less prudent. NO, it obviously depends on the sign of $\hat{\mu}$:
if you incorrectly impose $\mu = 0$ when $\hat{\mu} > 0$, this will increase the VaR measures for whatever choice of $p$.
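Points 1 and 3 of the debriefing can be made concrete with a Gaussian VaR calculation. The sketch below is only an illustration: the volatility and mean figures are hypothetical, and VaR is reported as a positive loss, i.e., minus the sum of the mean and the volatility times the Gaussian quantile.

```python
from statistics import NormalDist


def gaussian_var(p, sigma, mu=0.0):
    """p-level Gaussian VaR, reported as a positive loss."""
    return -(mu + sigma * NormalDist().inv_cdf(p))


p = 0.01
sigma_uncond = 0.020   # constant volatility used by the IID model (hypothetical)
sigma_garch = 0.012    # GARCH volatility forecast in a calm period (hypothetical)
mu_hat = 0.001         # positive estimated mean return (hypothetical)

var_iid = gaussian_var(p, sigma_uncond)                     # model (i)
var_garch_zero_mean = gaussian_var(p, sigma_garch)          # model (ii)
var_garch_est_mean = gaussian_var(p, sigma_garch, mu_hat)   # model (iii)

print(f"IID: {var_iid:.4f}  GARCH, zero mean: {var_garch_zero_mean:.4f}  "
      f"GARCH, estimated mean: {var_garch_est_mean:.4f}")
```

When the GARCH forecast is below the unconditional volatility, the IID VaR is the larger one; with a positive estimated mean, the zero-mean VaR exceeds the estimated-mean VaR by exactly the size of the mean.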
Question 1.6
Suppose you are given high-frequency returns data on two stocks, IBM Inc. and Notsoliquid Ltd.
(below also shortened as Notso). However, while IBM shares trade very frequently, so that 1-minute
returns data are available and result from actual trades, Notsoliquid shares trade only 3-4 times a day.
Your goal is to compute a realized covariance matrix for IBM and Notsoliquid stocks, using returns
simultaneously sampled at a 1-minute frequency for both stocks, defined as $R_{t+j/m} \equiv s_{t+j/m} - s_{t+(j-1)/m}$ (with $s$ the log price)
for $j = 1, ..., m$.

Please indicate which of the following statements is/are correct:

(A) Assuming the means may be approximated as being nil, one formula of realized covariance that
you may use is $RCov_{IBM,Notso}(t+1) = \sum_{j=1}^{m} \sqrt{R^{IBM}_{t+j/m}}\,\sqrt{R^{Notso}_{t+j/m}}$.

(B) Assuming the means may be approximated as being nil, one formula of realized covariance that
you may use is $RCov_{IBM,Notso}(t+1) = \sum_{j=1}^{m} R^{IBM}_{t+j/m} R^{Notso}_{t+j/m}$.

(C) Because the prices of IBM and Notsoliquid stocks are not observed asynchronously (i.e., every
trade on IBM does not come with a corresponding trade for Notsoliquid), then the resulting realized
covariance matrix obtained using $RCov_{IBM,Notso}(t+1) = \sum_{j=1}^{m} R^{IBM}_{t+j/m} R^{Notso}_{t+j/m}$ may not be
uniformly stationary.

(D) Because the prices of IBM and Notsoliquid stocks are not observed synchronously (i.e., every
trade on IBM does not come with a corresponding trade for Notsoliquid), then the resulting realized
covariance matrix obtained using $RCov_{IBM,Notso}(t+1) = \sum_{j=1}^{m} R^{IBM}_{t+j/m} R^{Notso}_{t+j/m}$ may not be
positive definite.

(E) Assuming the means may be approximated as being nil, the implied realized correlation may be
computed as
$$RCorr_{IBM,Notso}(t+1) = \frac{\sum_{j=1}^{m} R^{IBM}_{t+j/m} R^{Notso}_{t+j/m}}{\sqrt{\sum_{j=1}^{m} (R^{IBM}_{t+j/m})^2}\,\sqrt{\sum_{j=1}^{m} (R^{Notso}_{t+j/m})^2}}$$
but because the prices of IBM and Notsoliquid stocks are not observed synchronously (i.e., every trade
on IBM does not come with a corresponding trade for Notsoliquid), the resulting realized correlation is
not guaranteed to fall in the interval [-1,1].
Answer
B, D, E

Debriefing:
(A) False, see the definition of sample covariance when sample means are zero.
(B) Correct, see lecture 3, slide 12.
(C) False, this is just a twisted, incorrect version of (D) because it uses "asynchronously" instead
of "synchronously" and because it makes no sense to discuss the stationarity of a covariance matrix;
finally, "uniform stationarity" is a made-up concept.
(D) Correct, see lecture 3, slide 12.
(E) Correct, see lecture 3, slide 12; moreover, notice that in a bivariate covariance matrix, for
the matrix not to be positive definite and for the implied correlation not to fall in the interval [-1,1]
are the same; notice that this point is not entirely trivial and applies to this question only because
the two realized variances, i.e., the sums of squared intraday returns of each stock, are strictly positive,
so that the only reason for a realized covariance matrix not to be positive definite is that the implied
realized correlation falls outside [-1,1].
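The sum-of-cross-products formula in (B) can be sketched in a few lines. This is only an illustration under the zero-mean assumption of the question: the function names and the six 1-minute returns are hypothetical, with the illiquid stock showing zero returns in the minutes in which it does not trade.

```python
import math


def realized_cov(r_a, r_b):
    """Sum of cross-products of intraday returns (means taken as zero)."""
    if len(r_a) != len(r_b):
        raise ValueError("both return series must share the same sampling grid")
    return sum(a * b for a, b in zip(r_a, r_b))


def realized_corr(r_a, r_b):
    """Realized covariance scaled by the two realized volatilities."""
    return realized_cov(r_a, r_b) / math.sqrt(
        realized_cov(r_a, r_a) * realized_cov(r_b, r_b)
    )


# Hypothetical 1-minute returns; zeros mark minutes without a Notsoliquid trade.
r_ibm = [0.0010, -0.0005, 0.0007, 0.0002, -0.0012, 0.0004]
r_notso = [0.0, 0.0, 0.0015, 0.0, 0.0, -0.0008]

rcov = realized_cov(r_ibm, r_notso)
rcorr = realized_corr(r_ibm, r_notso)
print(f"realized covariance: {rcov:.3e}, realized correlation: {rcorr:.3f}")
```

Only the minutes in which both series move contribute to the cross-product sum, which is why stale prices on the illiquid stock distort the measured co-movement.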

Question 1.7
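You have estimated a full BEKK GARCH(1,1) with VAR(1) conditional mean function model for
Japanese and US stock returns, $\mathbf{R}_{t+1} \equiv [R^{JPN}_{t+1}\ R^{US}_{t+1}]'$,
$$\mathbf{R}_{t+1} = \boldsymbol{\mu} + \boldsymbol{\Phi}\mathbf{R}_t + \mathbf{z}_{t+1}, \qquad \mathbf{z}_{t+1} \sim \text{IID } N(\mathbf{0}, \boldsymbol{\Sigma}_{t+1})$$
$$\boldsymbol{\Sigma}_{t+1} = \mathbf{C}\mathbf{C}' + \mathbf{A}\mathbf{R}_t\mathbf{R}_t'\mathbf{A}' + \mathbf{B}\boldsymbol{\Sigma}_t\mathbf{B}',$$
with A, B, and C non-negative and symmetric, that is characterized by 9 parameters in the BEKK
GARCH(1,1) part. You know that the model has been estimated by maximum likelihood, and yielded a
maximized log-likelihood function of approximately 453.798, with a resulting Hannan-Quinn information
criterion that has equalled approximately -2.851.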
Indicate which of the following statements is/are correct:
(A) Because the bivariate VAR(1) conditional mean implies a need to estimate 6 parameters in
addition to the parameters in the GARCH BEKK model, the formula for the Hannan-Quinn information
criterion implies that the maximum likelihood estimation must have employed 300 monthly observations.
(B) Because the bivariate VAR(1) conditional mean implies a need to estimate 6 parameters inclusive
of the parameters in the GARCH BEKK model, the formula for the Hannan-Quinn information criterion
implies that the maximized log-likelihood function must have been 453.798, while the saturation ratio
for this model is 16.
(C) Because the bivariate VAR(1) conditional mean implies a need to estimate 6 parameters in
addition to the parameters in the GARCH BEKK model, the formula for the Hannan-Quinn information
criterion implies that the maximum likelihood estimation must have employed 120 monthly observations
in total.
(D) The formula for the Hannan-Quinn information criterion implies that the saturation ratio for
this model is 28.
(E) Because the bivariate VAR(1) conditional mean implies a need to estimate 6 parameters in
addition to the parameters in the GARCH BEKK model, the formula for the Hannan-Quinn information
criterion implies that the saturation ratio for this model is exactly 20.
(F) None of the above.


Answer
A, E
Debriefing:
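(A) Correct: in a VAR(1) model there are 4 parameters in $\boldsymbol{\Phi}$ and 2 in $\boldsymbol{\mu}$, for a total of 6, and
we know that the Hannan-Quinn criterion is computed as
$$H\text{-}Q = -2\frac{\mathcal{L}(\hat{\theta})}{T} + 2\,num(\hat{\theta})\frac{\ln(\ln T)}{T},$$
which implies
$$\frac{\ln(\ln T)}{T} = \frac{1}{2\,num(\hat{\theta})}\left[(H\text{-}Q) + 2\frac{\mathcal{L}(\hat{\theta})}{T}\right],$$
where $num(\hat{\theta})$ is the total number of parameters in the model. This equation has no explicit solution, but if you plug the numbers in, you will see that
$$-2.851 = -2\,\frac{453.798}{300} + 2 \cdot 15 \cdot \frac{\ln(\ln 300)}{300},$$
which establishes that $T = 300$.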
(B) False and dumb because it merely replicates the answer to a similar question in the June exam
without noticing that the H-Q has changed from -3.569 to -2.851 and that what the question aims for
is actually different. Also the portion concerning "inclusive of the parameters in the GARCH BEKK
model" is clearly wrong.

(C) False and dumb because it merely replicates the answer to a similar question in the June exam
without noticing that the H-Q has changed from -3.569 to -2.851 and that what the question aims for
is actually different.

(D) False, see answer to (E) below.


(E) Given the result in (A), the saturation ratio is defined as
$$\frac{\text{Total number of observations in the model}}{num(\hat{\theta})} = \frac{300}{15} = 20$$
(F) False, because we have seen that answers (A) and (E) are correct.
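The "plug the numbers in" step of answer (A) can be reproduced directly. The sketch below assumes the Hannan-Quinn formula stated in the debriefing; the function name is hypothetical, and the two candidate sample sizes are the ones proposed in alternatives (A) and (C).

```python
import math


def hannan_quinn(loglik, n_params, n_obs):
    """H-Q = -2*L/T + 2*num*ln(ln T)/T, as in the debriefing."""
    return -2.0 * loglik / n_obs + 2.0 * n_params * math.log(math.log(n_obs)) / n_obs


n_params = 6 + 9          # VAR(1) mean parameters plus BEKK parameters
loglik = 453.798

hq_300 = hannan_quinn(loglik, n_params, 300)
hq_120 = hannan_quinn(loglik, n_params, 120)
saturation_ratio = 300 / n_params

print(f"H-Q with T=300: {hq_300:.3f}, with T=120: {hq_120:.3f}")
print(f"saturation ratio with T=300: {saturation_ratio:.0f}")
```

Only T = 300 reproduces the reported criterion of about -2.851; T = 120 gives a value far from it.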
Question 1.8
A research analyst in your team has just estimated on monthly data for a sample January 1972 - December 2011 a three-state, unobservable Markov switching model for a trivariate system that collects
US, UK, and German stock returns, $[R^{US}_t\ R^{UK}_t\ R^{Ger}_t]'$. He/she has then used data up to January 2011
to assess the probability that February 2011 was characterized as an occurrence of regime 1 (a bear
state of negative expected stock returns) and has found that it equalled $\xi^{1}_{02/11|01/11} = 0.439$, where 01/11 refers to
January 2011 and 02/11 to February 2011.


Please indicate which of the following statements is/are correct:
(A) $\xi^{1}_{02/11|01/11} = 0.439$ is the filtered probability of February 2011 that has presumably been computed using Hamilton's algorithm, which exploits Bayes' rule.
(B) $\xi^{1}_{02/11|01/11} = 0.439$ is the unconditional probability of February 2011 that has presumably been computed using Hamilton's algorithm, which exploits Bayes' rule.
(C) $\xi^{1}_{02/11|01/11} = 0.439$ is the ergodic probability of February 2011 that has presumably been computed using Hamilton's algorithm, which exploits Bayes' rule.
(D) Your analyst must have made some mistake because it is well-known that when Markov switching
models are estimated assuming that the Markov state variable is unobservable, it will then be impossible
to ever observe them or make any type of probabilistic statements or claims on them.
(E) None of the above is correct.

Answer
A

Debriefing:
(A) Correct, because the filtered probability for February 2011 has used information (data) up to January
2011 and not also data for the subsequent February 2011 - December 2011 period, as a smoothed
probability would require.
(B) False, because the unconditional probabilities pertain to the regimes and not to individual dates;
moreover, the unconditional probabilities are constant and will not be a function of time.
(C) False, because, as we have also seen in the Review Question Set 5, the ergodic probabilities
are the same thing as the unconditional probabilities and therefore will be constant and not a function
of time.
(D) False, as explained in the lectures, because even though the Markov state is latent we can (usually, must) make
inferences on it in the light of the available data.
(E) False because (A) is correct.
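The Bayes-rule updating behind regime probabilities of this kind can be sketched as follows. This is a deliberately simplified, hypothetical illustration (two regimes, one return series, Gaussian regime densities, made-up parameters and data), not the three-state trivariate model of the question.

```python
from statistics import NormalDist


def bayes_update(prior, densities):
    """Filtered regime probabilities: prior times density, normalised."""
    joint = [p * f for p, f in zip(prior, densities)]
    total = sum(joint)
    return [j / total for j in joint]


def predict(filtered, trans):
    """Next-period regime probabilities; trans[i][j] = Pr(next=j | current=i)."""
    k = len(filtered)
    return [sum(filtered[i] * trans[i][j] for i in range(k)) for j in range(k)]


# Hypothetical bear/bull regimes for a single monthly return series.
means, vols = [-0.02, 0.01], [0.08, 0.04]
trans = [[0.90, 0.10], [0.05, 0.95]]
probs = [0.5, 0.5]                      # prior before the first observation
returns = [0.012, -0.09, -0.05, 0.02]   # hypothetical data

for r in returns:
    dens = [NormalDist(m, s).pdf(r) for m, s in zip(means, vols)]
    filtered = bayes_update(probs, dens)   # uses data up to the current month
    probs = predict(filtered, trans)       # probabilities for the next month

print("filtered:", [round(x, 3) for x in filtered])
print("next month:", [round(x, 3) for x in probs])
```

Each pass through the loop uses only data observed so far, which is what distinguishes these probabilities from smoothed ones that also condition on later observations.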
Question 1.9
Consider a simple RiskMetrics model for dynamic variance:
$$\sigma^2_{t+1} = \lambda\sigma^2_t + (1-\lambda)R_t^2, \qquad \lambda \in (0,1)$$
The key advantage(s) of such a model when forecasting the variance is/are:
(A) Because the RiskMetrics model may also be re-written as $\sigma^2_{t+1} = (1-\lambda)\sum_{\tau=1}^{\infty}\lambda^{\tau+1}R^2_{t+1+\tau}$,
recent returns matter less for tomorrow's variance than distant returns do as $\lambda$ is less than 1, and
therefore their weight gets larger when the lag, $\tau$, gets bigger, which is sensible.
(B) It only contains one unknown parameter, $\lambda$.
(C) Because the RiskMetrics model may also be re-written as $\sigma^2_{t+1} = (1-\lambda)\sum_{\tau=1}^{\infty}\lambda^{\tau-1}R^2_{t+1-\tau}$,
recent returns matter more for tomorrow's variance than distant returns do as $\lambda$ is less than 1, and
therefore their weight gets smaller when the lag, $\tau$, gets bigger, which is sensible.
(D) It is nonstationary and it implies that long-run, ergodic variance does not exist, which is beneficial in applications.
(E) Little data need to be stored in order to calculate tomorrow's variance; in fact, after including 100
lags of squared returns, the cumulated weight is usually close to 100%; of course, once $\sigma^2_t$ is calculated,
past returns are not needed.
(F) It can be shown that the $k$-step ahead forecast of future variance for any $k \geq 1$ equals current
variance, $E_t[\sigma^2_{t+k}] = \sigma^2_t$, which is sensible.

Answer
B, C, E

Debriefing:
(A) False, because it is not true that the RiskMetrics model may be re-written as $\sigma^2_{t+1} = (1-\lambda)\sum_{\tau=1}^{\infty}\lambda^{\tau+1}R^2_{t+1+\tau}$, see answer (C) below and the lecture notes. Moreover, under RiskMetrics,
recent returns matter more (not less) for tomorrow's variance than distant returns do, as $\lambda$ is less than
1 and therefore the weight $\lambda^{\tau-1}$ gets smaller when the lag, $\tau$, gets bigger.
(B) Correct, obviously.
(C) Correct, see lecture 1 (second part), slide 15.
(D) It is true that a RiskMetrics model is nonstationary and it implies that long-run, ergodic variance
does not even exist, but this is not an advantage at all.
(E) Correct, see lecture 1 (second part), slide 15.
(F) It is true that it can be shown that the $k$-step ahead forecast of future variance for any $k \geq 1$
equals current variance, $E_t[\sigma^2_{t+k}] = \sigma^2_t$, but this is hardly an advantage because it counterfactually
means that the variance filtered today from any model is predicted to never change again in the future.
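Answers (C) and (E) can be checked numerically: the recursion and the exponentially weighted sum give the same variance, and the weights on the first 100 lags add up to almost one. The sketch below is an illustration only; the value of the smoothing parameter, the function names and the return series are hypothetical.

```python
def riskmetrics_recursive(returns, lam, sigma2_init=0.0):
    """Run the RiskMetrics recursion over returns ordered oldest first."""
    sigma2 = sigma2_init
    for r in returns:
        sigma2 = lam * sigma2 + (1.0 - lam) * r * r
    return sigma2


def riskmetrics_weighted(returns, lam):
    """Weighted sum of squared returns; the latest return has lag tau = 1."""
    return sum(
        (1.0 - lam) * lam ** (tau - 1) * r * r
        for tau, r in enumerate(reversed(returns), start=1)
    )


lam = 0.94                                       # hypothetical smoothing parameter
returns = [0.01, -0.02, 0.005, 0.015, -0.01]     # hypothetical daily returns

recursive = riskmetrics_recursive(returns, lam)
weighted = riskmetrics_weighted(returns, lam)

# Cumulated weight on the 100 most recent squared returns: 1 - lam**100
cum_weight = sum((1.0 - lam) * lam ** (tau - 1) for tau in range(1, 101))
print(f"recursive: {recursive:.3e}, weighted: {weighted:.3e}, weight: {cum_weight:.4f}")
```

Starting the recursion from zero makes the two calculations coincide exactly on a finite sample; with a non-zero starting variance they differ only by a term that decays geometrically in the sample length.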

Section 2
Question 2.1
You are asked to construct a portfolio by using 5 exchange-traded funds (ETFs) that replicate
the indexes of the US, Japan, UK, Euro area, and the Brazilian stock markets. Your benchmark for
evaluating the resulting portfolio is the MSCI world index. You have to choose the optimal weights to
maximize the Sharpe ratio of your portfolio with an investing horizon of three years. Given your choice
of the weights in your portfolio, you should then compute the daily, weekly and monthly 1 percent
Value-at-Risk (VaR) of your portfolio. You have access to 30 years of data on the fundamentals (the
earnings/price and the dividend/price ratios) underlying stock index prices and to returns on the relevant
ETFs, all at daily frequencies (and therefore at any frequency lower than the daily one).
Answer the following questions justifying your answers.
2.1(a) Explain how you would proceed to construct your portfolio. State all the assumptions that
you think are appropriate.
Answer. Construction of a Sharpe-ratio optimizing portfolio, which is the same as a mean-variance
portfolio, requires an estimate of expected returns over a three-year horizon and of their variance-covariance matrix. Econometric modelling using predictors for the first moments and assuming constant second
moments is one easy but not necessarily optimal route. Given these predicted moments, an optimization
procedure can be implemented using the maximization of the Sharpe ratio as the criterion to choose
weights. Restrictions on weights can be imposed at the optimization stage. Weights can also be obtained
by mixing a view based on model predictions with weights based on the capitalization of the different
markets (Black-Litterman).
2.1(b) What are the main limitations of simply using historical, backward-looking moments (i.e.,
sample means, variances, covariances, etc.) to select optimal portfolio weights?
Answer. In general, historical moments do not capture time-variation in the (predicted) joint
density of asset returns. For instance, if there were any evidence of regimes in the process followed by
asset returns, these would have been systematically ignored. In fact, our forecasts would be based on the
assumption that returns would stay in the same regime in the future (i.e., that their means, variances,
and covariances will not change with respect to their long-term, sample levels). If this is not the case, for
instance when we observe a technological shift or a change in the structure of an economy, our forecasts,
and thus our optimal weights, will be incorrect. Moreover, in this case, historical moments would be
estimated on a very short estimation sample (only 10 non-overlapping three-year observations, given 30 years of data), so that
an additional problem may emerge: estimation error. In fact, in this case we should be considerably
worried about the reliability of our estimation results (see footnote 1).
2.1(c) Do you need to apply any multivariate GARCH techniques to perform VaR calculations or
would you just use univariate GARCH modelling tools? Explain the costs and benefits of these choices.
Answer. It all depends on the task. On the one hand, one may argue that once you have found
optimal weights in sub-question (a), you can also model the VaR of the portfolio directly, because the
weights will not then change until the end of your investment horizon. In this case, you could apply
univariate GARCH techniques directly to realized portfolio returns (for given, optimal weights), which
is computationally easy, even though this corresponds to a passive risk management choice. On the
other hand, should you foresee a need to change your portfolio weights in the future, or in case you
wanted to understand precisely the contribution of each ETF to the overall risk management position
of your portfolio, you would then need to apply multivariate (and therefore active) GARCH techniques.
2.1(d) What may be the relevance of Markov Switching modelling techniques when performing the
task described in this question?
Answer. Markov Switching can capture regime switches in the dynamic properties/distribution of
ETF returns. For instance, the dividend yield may be subject to regimes due to a structural break
in its long-term mean around 1990. MS could help to forecast the probability of subsequent regime
switches and to model the past structural break in the series. Or, for example, it could be used to
capture regime switching in the variance (high variance vs. low variance), especially when performing
VaR computations at high frequencies.

1 Notice that computing the 3-year horizon optimal portfolio would require estimates of expected returns and of their
covariance matrix using three-year cumulative returns.
2.1(e) Would you expect the monthly returns of your portfolio to be cointegrated with those of the
benchmark referred to in the question? Make sure to explain your reasoning.
Answer. The monthly returns of the portfolio are stationary because stock or index returns are
stationary, thus they cannot be cointegrated with anything. However, the value (price) of the portfolio
may (one might push it and say that it must) be cointegrated with that of the benchmark, because the
benchmark is approximately a value-weighted version of the portfolio itself.
2.1(f) How would you compute the relevant VaR measures the question asks for? State all the
assumptions that you think are appropriate.
Answer. When VaR is computed at relatively high frequencies (such as weekly or daily), then
time-variation should be allowed in conditional variances and covariances and a (multivariate?) GARCH
approach can be adopted. So specification, estimation, and backtesting of the relevant GARCH specification should be implemented prior to using the model for risk measurement. If VaR were to be
computed at monthly frequencies, it is then more ambiguous what the optimal approach ought to be,
even though in the lab sessions we have also experimented with VaRs applied to monthly frequencies,
with some success. Another issue, similar to what has been discussed above, is whether MS techniques
ought to be applied.
As far as the computation of VaR measures at different horizons is concerned, different strategies
can be adopted:
- Compute returns for the portfolio at daily, weekly and monthly horizons to compute historical
moments, forecast means and variances for the next day/week/month, assume a Normal distribution
for the returns, and find the 1% quantile by computing its inverse.
- Compute returns at daily frequency, then scale the VaR assuming no serial correlation between
returns on different days, constant variance and a normal distribution:
$$VaR_{weekly} = \sqrt{5}\,VaR_{daily}, \qquad VaR_{monthly} = \sqrt{22}\,VaR_{daily}$$
- Compute returns at daily frequency, then simulate multi-period VaR through Monte Carlo simulations. To do this, you must assume a distribution from which the daily shocks are drawn with
replacement. Then, assume a model for the variance over time. The top performer model seen in
class has been an NGARCH or an EGARCH(1,1). At daily frequency, you can set the conditional
and unconditional mean to zero. The weekly and monthly VaR are computed through recursive
simulation. The assumed distribution for the shocks can be found using bootstrapping; that is,
extracting the empirical shocks from the historical data. Although more complex, the use of a
Monte Carlo simulation is probably the best choice.
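The third strategy can be sketched as a small simulation. Everything below is an assumption made for illustration: a plain GARCH(1,1) is swapped in for the NGARCH/EGARCH models mentioned above to keep the code short, the parameter values are made up, and the "historical" standardized shocks are themselves simulated stand-ins for residuals that would come from an estimated model.

```python
import random


def simulate_var(shocks, omega, alpha, beta, sigma2_next, horizon,
                 p=0.01, n_paths=5000, seed=0):
    """Multi-period VaR from GARCH(1,1) paths driven by bootstrapped shocks."""
    rng = random.Random(seed)
    cumulative = []
    for _ in range(n_paths):
        sigma2, total = sigma2_next, 0.0
        for _ in range(horizon):
            z = rng.choice(shocks)                 # draw with replacement
            r = (sigma2 ** 0.5) * z                # zero-mean daily return
            total += r
            sigma2 = omega + alpha * r * r + beta * sigma2
        cumulative.append(total)
    cumulative.sort()
    return -cumulative[int(p * n_paths)]           # loss at the p-quantile


# Stand-in for standardized residuals from an estimated model (hypothetical).
rng0 = random.Random(42)
shocks = [rng0.gauss(0.0, 1.0) for _ in range(1000)]

params = dict(omega=2e-6, alpha=0.08, beta=0.90, sigma2_next=1e-4)
var_1d = simulate_var(shocks, horizon=1, **params)
var_5d = simulate_var(shocks, horizon=5, **params)
var_22d = simulate_var(shocks, horizon=22, **params)
print(f"1-day: {var_1d:.4f}  5-day: {var_5d:.4f}  22-day: {var_22d:.4f}")
```

Because the variance is updated along each simulated path, the multi-period VaR need not equal the daily VaR times the square root of the horizon.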

2.1(g) Suppose that the daily VaR measure computed on your portfolio is 3.8 percent. Does this
value of the 1% VaR rule out the possibility of making a loss of 50 per cent or more in any given month?
Explain your answer.
Answer. A note on language: for whatever 1% VaR measure at a given horizon, clearly any loss at
that horizon remains possible. However, any such loss exceeding the 1% VaR measure can occur with a
probability of 1% or less, i.e., it is extremely unlikely. Yet, such a large loss remains possible, technically
speaking. Many students may have simply replied in this way. If one interprets "possible" as "likely", then
we can move on and make some calculations that, as usual, will depend on the assumptions one makes.
Assuming a parametric Normal VaR, and IID returns with constant variance and zero mean, the answer
to the question above is positive as
$$VaR^{0.01}_{monthly} = \sqrt{22}\,VaR^{0.01}_{daily} = 3.8\% \times \sqrt{22} = 17.82\%$$
which makes a loss of 50 percent or more look unlikely. However, we know that unconditional and
conditional returns distributions have fatter tails than the Normal at daily frequencies, and their variance
is not constant but time-varying. These two circumstances, especially if the VaR was computed in a
low volatility period, can lead to sharp increases in the VaR over a one month horizon, so we would not
be able to rule out the possibility of losses of 50% or more.
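The square-root-of-time calculation, and how unlikely a 50 percent monthly loss looks under the same Gaussian IID assumptions, can be sketched as follows. The 22 trading days per month and the zero mean are the assumptions already made in the answer; the variable names are hypothetical.

```python
from statistics import NormalDist

daily_var = 3.8                                  # 1% daily VaR, in percent
z_99 = -NormalDist().inv_cdf(0.01)               # about 2.326

daily_sigma = daily_var / z_99                   # implied daily volatility
monthly_sigma = daily_sigma * 22 ** 0.5          # IID scaling to 22 days
monthly_var = daily_var * 22 ** 0.5              # square-root-of-time VaR

# Probability of a monthly loss of 50% or more under the Normal IID model
prob_loss_50 = NormalDist(0.0, monthly_sigma).cdf(-50.0)
print(f"monthly VaR: {monthly_var:.2f}%, P(loss >= 50%): {prob_loss_50:.2e}")
```

The probability is negligible only because of the Normal and constant-variance assumptions; fat tails and time-varying variance are exactly what the answer warns would change this conclusion.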
2.1(h) What are the main advantages of MATLAB over EXCEL to perform the tasks described in
the question?
Answer. Thinking about the answers given above, we can say that using MATLAB:
- There is much more flexibility to change the assumptions of the model.
- Time series can be handled more easily for any operation, since you can perform algebraic operations on the entire time series as a whole.
- You can perform operations on a large group of series with a single-line command.
- You do not have to navigate over huge tables and matrices to perform the required tasks.
- Optimization procedures and Maximum Likelihood estimation in MATLAB are usually faster and
more efficient, in the sense that they often reach a solution even when Excel does not.
- Multi-period simulations are faster and much easier to implement with MATLAB than with Excel
(Excel must store the result of each simulation to get to the final result, while MATLAB stores only
the requested results).

Question 2.2
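Consider the NGARCH(1,1) model

$$R_{t+1} = \sigma_{t+1} z_{t+1} \quad \text{with } z_{t+1} \sim N(0,1)$$

$$\sigma^2_{t+1} = \omega + \alpha (R_t - \theta \sigma_t)^2 + \beta \sigma^2_t$$

where $\alpha(1+\theta^2) + \beta < 1$, $\omega > 0$, $\alpha \geq 0$ and $\beta \geq 0$.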
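2.2(a) Derive the unconditional variance for this NGARCH model. Make sure to emphasize what
your assumptions are.
Answer. Assuming the process is stationary, call $\sigma^2 \equiv E[\sigma^2_t] = E[\sigma^2_{t+1}]$. Writing $R_t = \sigma_t z_t$ and using the fact that $z_t$ is independent of $\sigma^2_t$, with $E[z_t] = 0$ and $E[z_t^2] = 1$, we have

$$E[\sigma^2_{t+1}] = \sigma^2 = \omega + \alpha E[\sigma^2_t (z_t - \theta)^2] + \beta E[\sigma^2_t]$$

$$= \omega + \alpha E[\sigma^2_t] E[z_t^2 + \theta^2 - 2\theta z_t] + \beta E[\sigma^2_t]$$

$$= \omega + \alpha \sigma^2 (1 + \theta^2) + \beta \sigma^2$$

Therefore

$$\sigma^2 [1 - \alpha(1+\theta^2) - \beta] = \omega \implies \sigma^2 = \frac{\omega}{1 - \alpha(1+\theta^2) - \beta}$$

which exists and is positive if and only if $\alpha(1+\theta^2) + \beta < 1$, as the question ensures to be true.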
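As a numerical cross-check of this derivation, one can iterate the expectation recursion until it settles and compare the result with the closed form. This is a minimal Python sketch; the parameter values are hypothetical and chosen only so that the stationarity condition holds, and the function name is an assumption.

```python
def ngarch_uncond_var(omega, alpha, theta, beta):
    """Closed form omega / (1 - alpha*(1 + theta^2) - beta), defined only if stationary."""
    persistence = alpha * (1.0 + theta ** 2) + beta
    if persistence >= 1.0:
        raise ValueError("non-stationary: alpha*(1+theta^2)+beta >= 1")
    return omega / (1.0 - persistence)

# Hypothetical parameter values that satisfy the stationarity condition
omega, alpha, theta, beta = 0.02, 0.05, 0.5, 0.90

# Iterate E[sigma^2_{t+1}] = omega + (alpha*(1+theta^2) + beta) * E[sigma^2_t]
expected_var = 1.0
for _ in range(5000):
    expected_var = omega + (alpha * (1.0 + theta ** 2) + beta) * expected_var

closed_form = ngarch_uncond_var(omega, alpha, theta, beta)
print(expected_var, closed_form)
```

The recursion converges to the closed-form value only because the persistence term is below one; with persistence at or above one the iteration would grow without bound.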
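2.2(b) Explain the nature of the Nonlinearity in the NGARCH(1,1) model (the N in NGARCH),
i.e., that the NGARCH(1,1) may be re-written as a GARCH(1,1) model with time-varying coefficients.
Answer. Also in this case, this is most easily seen by re-writing the model as:

$$\sigma^2_{t+1} = \omega + \alpha \sigma^2_t (z_t - \theta)^2 + \beta \sigma^2_t$$

$$= \omega + \alpha \sigma^2_t z_t^2 - 2\alpha\theta z_t \sigma^2_t + (\alpha\theta^2 + \beta) \sigma^2_t$$

$$= \omega + \alpha_t z_t^2 + (\alpha\theta^2 + \beta'_t) \sigma^2_t \quad \text{with } \alpha_t \equiv \alpha\sigma^2_t, \ \beta'_t \equiv \beta - 2\alpha\theta z_t$$

This is indeed a GARCH(1,1) model in which the squared shock term maps into forecasts of volatility
on the basis of a coefficient $\alpha_t \equiv \alpha\sigma^2_t$ that depends on one lag of NGARCH variance, and in which past
variance maps into forecasts of volatility on the basis of a coefficient $(\alpha\theta^2 + \beta'_t) \equiv \beta - 2\alpha\theta z_t + \alpha\theta^2$ that
depends on one lag of NGARCH standardized innovations.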

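2.2(c) Assuming that $\theta > 0$, explain why an NGARCH model is not only nonlinear (see 2.2b) but also
asymmetric, i.e., the fact that it implies that the sign of shocks contributes to determine their impact on
forecasts of subsequent variance. [Hint: One idea is to re-write the model as $\sigma^2_{t+1} = \omega + \alpha\sigma^2_t (z_t - \theta)^2 + \beta\sigma^2_t$
and then find a way to decompose it using indicator variables, i.e. variables $I_{\{A\}} = 1$ if condition $A$
holds and 0 otherwise]

Answer. Also in this case, this is most easily seen by re-writing the model as:

$$\sigma^2_{t+1} = \omega + \alpha \sigma^2_t (z_t - \theta)^2 + \beta \sigma^2_t$$

At this point the model can be re-written as:

$$\sigma^2_{t+1} = \omega + \alpha \sigma^2_t z_t^2 - 2\alpha\theta z_t \sigma^2_t + (\alpha\theta^2 + \beta) \sigma^2_t$$

$$= \omega + \alpha \sigma^2_t z_t^2 + (\alpha\theta^2 + \beta) \sigma^2_t - 2\alpha\theta I_{\{z_t \geq 0\}} |z_t| \sigma^2_t + 2\alpha\theta I_{\{z_t < 0\}} |z_t| \sigma^2_t$$

which shows that while a positive innovation has an effect on the forecast of variance that equals
$\alpha\sigma^2_t z_t^2 - 2\alpha\theta I_{\{z_t \geq 0\}} |z_t| \sigma^2_t$, a negative innovation has a larger effect, $\alpha\sigma^2_t z_t^2 + 2\alpha\theta I_{\{z_t < 0\}} |z_t| \sigma^2_t$.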
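The asymmetry can be made concrete by feeding two shocks of equal size and opposite sign into the variance recursion. The following Python sketch uses hypothetical parameter values with a positive theta; the function names are assumptions, and the second function simply restates the recursion with indicator variables for the sign of the shock.

```python
def ngarch_next_var(omega, alpha, theta, beta, sigma2, z):
    """One-step forecast: omega + alpha*sigma2*(z - theta)^2 + beta*sigma2."""
    return omega + alpha * sigma2 * (z - theta) ** 2 + beta * sigma2

def ngarch_next_var_indicators(omega, alpha, theta, beta, sigma2, z):
    """The same forecast, written with indicators for the sign of the shock z."""
    pos = 1.0 if z >= 0 else 0.0
    neg = 1.0 - pos
    return (omega + alpha * sigma2 * z ** 2 + (alpha * theta ** 2 + beta) * sigma2
            - 2.0 * alpha * theta * pos * abs(z) * sigma2
            + 2.0 * alpha * theta * neg * abs(z) * sigma2)

# Hypothetical parameter values with theta > 0, and current variance equal to 1
omega, alpha, theta, beta, sigma2 = 0.02, 0.05, 0.5, 0.90, 1.0

after_good_news = ngarch_next_var(omega, alpha, theta, beta, sigma2, z=1.5)
after_bad_news = ngarch_next_var(omega, alpha, theta, beta, sigma2, z=-1.5)
print(after_good_news, after_bad_news)
```

With theta positive, the negative shock produces the larger variance forecast, and both ways of writing the recursion agree.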
2.2(d) You have estimated the NGARCH(1,1) model on S&P 500 daily return data, and obtained
QML estimates of $\hat\omega = 0.003$, $\hat\alpha = 0.062$, $\hat\theta = 0.253$, and $\hat\beta = 0.953$. What is the long-run, unconditional
variance of the process?
Answer. Let's try and plug the estimates into the formula found above:

$$\hat\sigma^2 = \frac{\hat\omega}{1 - \hat\alpha(1 + \hat\theta^2) - \hat\beta} = \frac{0.003}{1 - 0.062[1 + (0.253)^2] - 0.953} = -0.158 \ !$$

This cannot be: a variance can only be positive. In fact, in this case the stationarity condition
$\hat\alpha(1 + \hat\theta^2) + \hat\beta < 1$ fails, because $0.062[1 + (0.253)^2] + 0.953 = 1.019 > 1$, and as a result we should
conclude that the NGARCH process is non-stationary and, as a result, unconditional variance does NOT exist or,
equivalently, $\hat\sigma^2 \to \infty$, i.e., it explodes. In this case the answer $\hat\sigma^2 = -0.158 < 0$ (or any other
negative number) should be penalized with a -0.5 score (this is indeed blasphemy).
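The check that should precede any use of the formula takes only a few lines. This Python sketch plugs in the QML estimates from the question; the variable names are illustrative.

```python
# QML estimates reported in the question
omega_hat, alpha_hat, theta_hat, beta_hat = 0.003, 0.062, 0.253, 0.953

# Stationarity measure: must be strictly below 1 for the formula to apply
persistence = alpha_hat * (1.0 + theta_hat ** 2) + beta_hat

# Mechanical application of the formula, shown only to reproduce the trap
naive_formula = omega_hat / (1.0 - persistence)

print(f"persistence   = {persistence:.4f}")
print(f"naive formula = {naive_formula:.3f}")
if persistence >= 1.0:
    print("Stationarity fails: the unconditional variance does not exist.")
```

Because the persistence measure exceeds one, the negative number returned by the formula has no interpretation as a variance.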
Question 2.3

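Consider the following estimated MSIVARH(2,1) model for the bivariate vector of US and Japanese
stock percentage index returns, $\mathbf{R}_{t+1} \equiv [R^{US}_{t+1} \ R^{JP}_{t+1}]'$, where $S_{t+1} = 1, 2$ is a first-order, two-state
irreducible and ergodic Markov chain that characterizes returns in both markets:

$$\mathbf{R}_{t+1} = \boldsymbol{\mu}_{S_{t+1}} + \mathbf{A}_{S_{t+1}} \mathbf{R}_t + \boldsymbol{\Sigma}_{S_{t+1}} \boldsymbol{\epsilon}_{t+1}$$

$$\begin{bmatrix} R^{US}_{t+1} \\ R^{JP}_{t+1} \end{bmatrix} = \begin{bmatrix} \mu^{US}_{S_{t+1}} \\ \mu^{JP}_{S_{t+1}} \end{bmatrix} + \begin{bmatrix} 0.05 & 0 \\ a_{S_{t+1}} & 0.11 \end{bmatrix} \begin{bmatrix} R^{US}_{t} \\ R^{JP}_{t} \end{bmatrix} + \begin{bmatrix} 1.445 & 0 \\ \sigma^{JP}_{S_{t+1}} \rho_{S_{t+1}} & \sigma^{JP}_{S_{t+1}} \sqrt{1 - (\rho_{S_{t+1}})^2} \end{bmatrix} \begin{bmatrix} \epsilon^{US}_{t+1} \\ \epsilon^{JP}_{t+1} \end{bmatrix},$$

where $\boldsymbol{\Sigma}_{S_{t+1}}$ is the Choleski factor of the (estimated) covariance matrix of the vector of shocks $\boldsymbol{\Sigma}_{S_{t+1}} \boldsymbol{\epsilon}_{t+1}$,
$\boldsymbol{\epsilon}_{t+1} \sim N(\mathbf{0}, \mathbf{I}_2)$,

$$\hat{\mathbf{P}} = \begin{bmatrix} 0.97 & 0.03 \\ 0.12 & 0.88 \end{bmatrix}$$

and

$$\mu^{US}_{S_{t+1}} = \begin{cases} 1.33 & \text{if } S_{t+1} = 1 \\ -0.45 & \text{if } S_{t+1} = 2 \end{cases} \qquad \mu^{JP}_{S_{t+1}} = \begin{cases} 0.95 & \text{if } S_{t+1} = 1 \\ -0.64 & \text{if } S_{t+1} = 2 \end{cases}$$

$$a_{S_{t+1}} = \begin{cases} 0.45 & \text{if } S_{t+1} = 1 \\ 0 & \text{if } S_{t+1} = 2 \end{cases} \qquad \rho_{S_{t+1}} = \begin{cases} 0.85 & \text{if } S_{t+1} = 1 \\ 0 & \text{if } S_{t+1} = 2 \end{cases} \qquad \sigma^{JP}_{S_{t+1}} = \begin{cases} 1.205 & \text{if } S_{t+1} = 1 \\ 2.484 & \text{if } S_{t+1} = 2 \end{cases}$$
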

2.3(a) Compute the ergodic probabilities of regimes 1 and 2. Make sure to show the formulas that
you have applied in your calculations. Compute the long-run, unconditional probabilities of regimes 1
and 2. How are they different from the ergodic probabilities and, if so, why?
Answer. Using the same formulas seen in the lectures, we have:

$$\pi_1 = \frac{1 - p_{22}}{2 - p_{11} - p_{22}} = \frac{1 - 0.88}{2 - 0.97 - 0.88} = 0.8$$

so that $\pi_2 = 1 - \pi_1 = 0.2$. As seen in the classes and in Review Question set 5, ergodic and long-run
unconditional probabilities (the average frequency of the two regimes in the long run) are identical by
definition.
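The closed-form ergodic probabilities can be cross-checked by pushing an arbitrary starting distribution through the transition matrix many times. This is a minimal Python sketch; the starting distribution and the number of iterations are arbitrary choices.

```python
# Estimated transition probabilities from the question
p11, p22 = 0.97, 0.88
P = [[p11, 1.0 - p11],
     [1.0 - p22, p22]]

# Closed-form ergodic probabilities for a two-state chain
pi1 = (1.0 - p22) / (2.0 - p11 - p22)
pi2 = 1.0 - pi1

# Cross-check: iterate dist' = dist * P from an arbitrary starting point
dist = [0.5, 0.5]
for _ in range(1000):
    dist = [dist[0] * P[0][0] + dist[1] * P[1][0],
            dist[0] * P[0][1] + dist[1] * P[1][1]]

print(pi1, pi2, dist)
```

The iterated distribution settles on the same values as the formula, which is the sense in which ergodic and long-run unconditional probabilities coincide.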
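2.3(b) Write the model in extensive form, i.e., without using vectors and matrices and simply by
developing one statistical model (equation) for $R^{US}_{t+1}$ and one statistical model (equation) for $R^{JP}_{t+1}$ from
the estimates provided above. Make sure to disentangle what statistical model (equation) applies in
regime 1 and which one applies in regime 2, this for each of the two markets under consideration (i.e.,
the total is therefore 4, 2 markets $\times$ 2 regimes).
Answer.

$$R^{US}_{t+1} = \begin{cases} 1.33 & \text{if } S_{t+1} = 1 \\ -0.45 & \text{if } S_{t+1} = 2 \end{cases} + 0.05 R^{US}_t + 1.445 \epsilon^{US}_{t+1}$$

$$R^{JP}_{t+1} = \begin{cases} 0.95 & \text{if } S_{t+1} = 1 \\ -0.64 & \text{if } S_{t+1} = 2 \end{cases} + \begin{cases} 0.45 & \text{if } S_{t+1} = 1 \\ 0 & \text{if } S_{t+1} = 2 \end{cases} R^{US}_t + 0.11 R^{JP}_t$$

$$+ \begin{cases} 1.205 \times 0.85 & \text{if } S_{t+1} = 1 \\ 2.484 \times 0 & \text{if } S_{t+1} = 2 \end{cases} \epsilon^{US}_{t+1} + \begin{cases} 1.205 \sqrt{1 - 0.85^2} & \text{if } S_{t+1} = 1 \\ 2.484 \sqrt{1} & \text{if } S_{t+1} = 2 \end{cases} \epsilon^{JP}_{t+1}$$

or

$$R^{US}_{t+1} = \begin{cases} 1.33 + 0.05 R^{US}_t + 1.445 \epsilon^{US}_{t+1} & \text{if } S_{t+1} = 1 \\ -0.45 + 0.05 R^{US}_t + 1.445 \epsilon^{US}_{t+1} & \text{if } S_{t+1} = 2 \end{cases}$$

$$R^{JP}_{t+1} = \begin{cases} 0.95 + 0.45 R^{US}_t + 0.11 R^{JP}_t + 1.024 \epsilon^{US}_{t+1} + 0.635 \epsilon^{JP}_{t+1} & \text{if } S_{t+1} = 1 \\ -0.64 + 0.11 R^{JP}_t + 2.484 \epsilon^{JP}_{t+1} & \text{if } S_{t+1} = 2 \end{cases}$$
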
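The loadings of Japanese returns on the two shocks in each regime (1.024 and 0.635 in regime 1, and 0 and 2.484 in regime 2) come from the second row of the Choleski factor. A short Python sketch of that arithmetic, with dictionary keys that are naming assumptions:

```python
import math

# Regime-specific estimates for the Japanese equation, from the question
regimes = {
    1: {"sigma_jp": 1.205, "rho": 0.85},
    2: {"sigma_jp": 2.484, "rho": 0.0},
}

# Second row of the Choleski factor: [sigma_jp * rho, sigma_jp * sqrt(1 - rho^2)]
loadings = {}
for state, est in regimes.items():
    on_us_shock = est["sigma_jp"] * est["rho"]
    on_jp_shock = est["sigma_jp"] * math.sqrt(1.0 - est["rho"] ** 2)
    loadings[state] = (on_us_shock, on_jp_shock)
    print(f"regime {state}: {on_us_shock:.3f} on the US shock, "
          f"{on_jp_shock:.3f} on the Japanese shock")
```

In regime 2 the correlation is zero, so the Japanese return loads only on its own shock.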

2.3(c) Compute the unconditional means of US and Japanese stock returns, respectively. Perform
the calculation by using the unconditional probabilities computed in question 2.3(a), i.e., exploit the
fact that $E[R^{(i)}_{t+1}] = \pi_1 E[R^{(i)}_{t+1} | S_{t+1} = 1] + \pi_2 E[R^{(i)}_{t+1} | S_{t+1} = 2]$. [Hint: Start from the standard way in
which you compute unconditional means for autoregressive processes, conditioning in this case on the
state of the Markov chain]

Answer. In the case of the US, we know that $R^{US}_{t+1} | S_{t+1} = 1$ follows $1.33 + 0.05 R^{US}_t + 1.445 \epsilon^{US}_{t+1}$ and
$R^{US}_{t+1} | S_{t+1} = 2$ follows $-0.45 + 0.05 R^{US}_t + 1.445 \epsilon^{US}_{t+1}$. Therefore:

$$E[R^{US}_{t+1} | S_{t+1} = 1] = \frac{1.33}{1 - 0.05} = 1.400$$

$$E[R^{US}_{t+1} | S_{t+1} = 2] = \frac{-0.45}{1 - 0.05} = -0.474$$

$$E[R^{US}_{t+1}] = 0.8 \times 1.400 + 0.2 \times (-0.474) = 1.025$$

In the case of Japan, you need to exploit knowledge of $E[R^{US}_{t+1} | S_{t+1} = 1]$ and $E[R^{US}_{t+1} | S_{t+1} = 2]$ to find,
from the formula of an AR(1) process:

$$E[R^{JP}_{t+1} | S_{t+1} = 1] = \frac{0.95 + 0.45 E[R^{US}_{t+1} | S_{t+1} = 1]}{1 - 0.11} = \frac{0.95 + 0.45 \times 1.400}{0.89} = 1.775$$

$$E[R^{JP}_{t+1} | S_{t+1} = 2] = \frac{-0.64}{1 - 0.11} = -0.719$$

It follows that

$$E[R^{JP}_{t+1}] = 0.8 \times 1.775 + 0.2 \times (-0.719) = 1.276$$

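The same steps can be laid out in code: compute the regime-conditional AR(1) means, then weight them by the ergodic probabilities. The Python sketch below follows the simplification suggested by the hint (each regime is treated as if it had its own stationary autoregressive mean); the variable names are illustrative assumptions.

```python
# Ergodic probabilities and estimates taken from the question
pi = {1: 0.8, 2: 0.2}
mu_us = {1: 1.33, 2: -0.45}    # US intercepts by regime
mu_jp = {1: 0.95, 2: -0.64}    # Japanese intercepts by regime
spill = {1: 0.45, 2: 0.0}      # coefficient on lagged US returns in the Japanese equation
ar_us, ar_jp = 0.05, 0.11      # own-lag autoregressive coefficients

# Regime-conditional means from the AR(1) formula intercept / (1 - AR coefficient)
mean_us_given = {s: mu_us[s] / (1.0 - ar_us) for s in pi}
mean_jp_given = {s: (mu_jp[s] + spill[s] * mean_us_given[s]) / (1.0 - ar_jp) for s in pi}

# Unconditional means: probability-weighted averages of the conditional means
mean_us = sum(pi[s] * mean_us_given[s] for s in pi)
mean_jp = sum(pi[s] * mean_jp_given[s] for s in pi)
print(mean_us_given, mean_jp_given)
print(mean_us, mean_jp)
```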

Section 3
Question 3.1
Consider the following MATLAB function:

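function VaR=VaR_compute(confidence_level, mu, sigma)
% Normal (parametric) VaR, one value per row of mu and sigma.

VaR=NaN(size(mu));      % preallocate the output
for i=1:rows(VaR)       % rows(VaR) is the number of rows, i.e. size(VaR,1)
    VaR(i)=norminv(1-confidence_level,mu(i),sigma(i));   % Normal quantile at 1-confidence_level
end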

State which of the following statements is/are correct:


(A) The function is designed to compute the VaR of a number of different portfolios.
(B) The function computes the VaR of a portfolio with zero expected returns.
(C) The function cannot be used because there is a GARCH structure in second moments of the
relevant returns.
(D) The vector mu is always a scalar.
(E) Because sigma is a vector, if the function is used to compute VaR measures of several portfolios,
then the returns on these assets must have been assumed to be uncorrelated.
(F) The function assumes normality and homoskedasticity of returns.


Answer(s)
A
Debriefing:
(A) Correct, that is the meaning of the loop on rows(VaR); the number of portfolios is defined by
the size of the vector mu.
(B) No, because you do not know what the contents of the vector mu are.
(C) No, simply made up: there is no GARCH structure in sight.
(D) No, unless rows(VaR)=1, which does not have to be the case.
(E) No, because such correlations will be already reflected in the time series of portfolio returns;
under passive risk management it is not that correlations are ignored, but their specific role is not
transparent. Even if you ignore the composition of the portfolios, at no point does the code use whether
or not these are correlated.
(F) No, in the sense that normality is obviously assumed, but homoskedasticity is not: even if the
vector sigma were to contain a vector of predicted 1-step ahead GARCH volatilities, the formula would
be the same (notice this needs to concern 1-step ahead predictions only).
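For readers without MATLAB at hand, the logic of the function can be sketched in Python with the standard library only. This is an illustrative translation, not the course code: the name var_compute and the example inputs are assumptions, and `statistics.NormalDist.inv_cdf` plays the role of norminv.

```python
from statistics import NormalDist

def var_compute(confidence_level, mu, sigma):
    """Return the (1 - confidence_level) Normal quantile for each (mu, sigma) pair."""
    if len(mu) != len(sigma):
        raise ValueError("mu and sigma must have the same length")
    return [NormalDist(m, s).inv_cdf(1.0 - confidence_level)
            for m, s in zip(mu, sigma)]

# Two hypothetical portfolios: zero mean with unit volatility, and mean 0.05 with volatility 2
var_99 = var_compute(0.99, mu=[0.0, 0.05], sigma=[1.0, 2.0])
print(var_99)
```

Each element is computed from its own mean and standard deviation only, so nothing in the calculation uses correlations across portfolios or assumes that sigma is constant over time.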
Question 3.2
Consider the following lines of Matlab code:
- - - - - - - - - - - - - - - - - - - - - - - - -
par_initial(1:4,1)=[0.05;0.1;0.05;0.85];   % starting values: omega, alpha, theta, beta
[param_ng,l_ng]=fminsearch('ngarch',par_initial,[],port_ret(first:last,:));
[mle,z_ng,cond_var_ng]=ngarch(param_ng,port_ret(first:last,:));
fprintf('\n');
disp(['NGARCH PARAMETERS']);
fprintf('omega '); fprintf('%5.4f ',param_ng(1,1));fprintf('\n');
fprintf('alpha '); fprintf('%5.4f ',param_ng(2,1));fprintf('\n');
fprintf('theta '); fprintf('%5.4f ',param_ng(3,1));fprintf('\n');
fprintf('beta '); fprintf('%5.4f ',param_ng(4,1));fprintf('\n');
fprintf('MaxLik '); fprintf('%5.4f ',l_ng(1,1));fprintf('\n');
fprintf('Stationarity measure');
fprintf('%5.4f',param_ng(2,1)*(1+(param_ng(3,1)^2))+param_ng(4,1));
fprintf('\n'); fprintf('\n');
df_init=10;   % starting value for the degrees of freedom
[df,q]=fminsearch('logL1',df_init,[],port_ret(first:last,:),sqrt(cond_var_ng));
- - - - - - - - - - - - - - - - - - - - - - - - -
Please indicate which of the following statements is correct:
(A) The stationarity measure is equal to $0.1 \times (1 + 0.05^2) + 0.85 = 0.9503$.
(B) cond_var_ng is a vector of conditional variances estimated using QMLE.
(C) There is an error in the code, since in the last line you should not include the vector
sqrt(cond_var_ng) among the inputs of fminsearch.
(D) df=df_init by definition.
(E) l_ng is the negative of the optimized value of the log-likelihood function for the NGARCH(1,1)
model, obtained when the unknown parameters are set to their (Q)MLE estimates.
Answer
E

Debriefing:
(A) False, because the stationarity measure is not calculated using the initialization coefficients
contained in par_initial, but rather using the parameters estimated by (Q)MLE and collected in the
vector param_ng.
(B) False, because unless additional information is added, variances at this stage can be thought of
as still being estimated by MLE.
(C) False, because to estimate the degrees of freedom of the t-Student by QMLE you will use
NGARCH conditional volatilities as an input, as if they were the true volatilities.
(D) False, because df_init is just an initialization value, while df is the output of the QMLE
estimation. It is very unlikely that the two figures will be exactly equal.
(E) True, rather trivially if you have ever ventured to inspect contents and meaning of the logL1
procedure that is used to perform QML estimation of the degrees of freedom of a t-Student. Notice
that we are facing the negative of the optimized value of the log-likelihood function because we are using
fminsearch, which minimizes its objective, in this Matlab code.
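To see why a minimizer reports the negative of the log-likelihood, it helps to write out an objective of the kind such a routine expects. The Python sketch below is hypothetical: it is not the course's ngarch procedure, it assumes Gaussian shocks and a start at the unconditional variance, and the toy return series is made up purely for illustration.

```python
import math

def ngarch_neg_loglik(params, returns):
    """Negative Gaussian log-likelihood of an NGARCH(1,1), suitable for a minimizer."""
    omega, alpha, theta, beta = params
    persistence = alpha * (1.0 + theta ** 2) + beta
    if omega <= 0 or alpha < 0 or beta < 0 or persistence >= 1.0:
        return float("inf")                      # reject inadmissible parameter values
    sigma2 = omega / (1.0 - persistence)         # assumption: start at unconditional variance
    loglik = 0.0
    for r in returns:
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(sigma2) + r * r / sigma2)
        sigma2 = omega + alpha * (r - theta * math.sqrt(sigma2)) ** 2 + beta * sigma2
    return -loglik                               # the minimizer sees minus the log-likelihood

par_initial = [0.05, 0.1, 0.05, 0.85]            # same starting values as in the question
toy_returns = [0.3, -0.8, 0.1, 1.2, -0.4, 0.0, -1.5, 0.6]   # made-up data

value_at_start = ngarch_neg_loglik(par_initial, toy_returns)
stationarity_at_start = par_initial[1] * (1.0 + par_initial[2] ** 2) + par_initial[3]
print(value_at_start, stationarity_at_start)
```

The stationarity measure evaluated at the starting values reproduces the number in alternative (A), which is exactly why that alternative is a trap: the code prints the measure at the estimated parameters, not at the initial ones.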
Question 3.3
In one of your lab sessions, you have estimated 5% VaR measures from two competing multivariate
GARCH models, a Constant Conditional Correlation model (CCC) and a RiskMetrics Dynamic Conditional Correlation model (DCC). Recall that under a RiskMetrics-type DCC model, we have that the
auxiliary variable that is crucial in the behavior of the DCC is

$$q_{ij,t+1} = (1 - \lambda) z_{i,t} z_{j,t} + \lambda q_{ij,t},$$

where the $z_{i,t}$ and the $z_{j,t}$ are first-step GARCH standardized residuals and $q_{ij,0}$ is initialized to equal the
sample correlation between $z_{i,t}$ and $z_{j,t}$. Such VaR estimates are reported in the plot below for a 2008-2011
backtesting sample.

Indicate which of the following statements is/are correct:


(A) CCC and DCC VaR estimates are essentially the same (CCC is hardly visible as very close to the
green DCC time series) because in the lab we have estimated a RiskMetrics DCC parameter $\hat\lambda = 0.998$;
when this happens, the DCC auxiliary variable follows a process $q_{ij,t+1} = 0.002 z_{i,t} z_{j,t} + 0.998 q_{ij,t} \simeq q_{ij,t}$;
however, this means that $q_{ij,t}$ will be approximately constant over time and this implies that the
DCC conditional correlation will be also approximately constant and hence close to the CCC constant
correlation.
(B) CCC and DCC VaR estimates are essentially the same (CCC is hardly visible as very close
to the green DCC time series) because all active risk management methods are always bound to yield
similar VaR estimates, by construction.
(C) CCC and DCC VaR estimates are essentially the same (CCC is hardly visible as very close to the
green DCC time series) because in the lab we have estimated a RiskMetrics DCC parameter $\hat\lambda = 0.998$;
when this happens, the DCC auxiliary variable follows a process $q_{ij,t+1} = 0.998 z_{i,t} z_{j,t} + 0.002 q_{ij,t} \simeq z_{i,t} z_{j,t}$;
however, this means that the $q_{ij,t}$ will follow an approximate unit root process and this implies
that the DCC conditional correlation will be also approximately constant and hence close to the CCC
constant correlation.
(D) All VaR measures turned out to be very accurate, especially during the 2008-2009 financial
crisis.
(E) None of the above.
Answer
A


Debriefing:
(A) Correct, not much to add: your notes and the debriefing should show that you had found
an estimate $\hat\lambda = 0.998$ so that if you plug into $q_{ij,t+1} = (1 - \lambda) z_{i,t} z_{j,t} + \lambda q_{ij,t}$ you will find that in
approximation $q_{ij,t+1} \simeq q_{ij,t} \simeq q_{ij,0}$ so that (especially when $q_{ij,0}$ is initialized to equal the sample
correlation between $z_{i,t}$ and $z_{j,t}$) DCC and CCC cannot differ by much.

(B) False, this is in general wrong because VaR depends on the specifics of the model and nothing
similar was ever stated, not even implied.
(C) False, as this is just a version of A in which things have been turned around in a way that makes
the answer plainly incorrect: $\hat\lambda = 0.998$ does not turn the process for $q_{ij,t}$ into $q_{ij,t+1} = 0.998 z_{i,t} z_{j,t} +
0.002 q_{ij,t} \simeq z_{i,t} z_{j,t}$; in any event, even if $q_{ij,t}$ were to follow an approximate unit root process, this
does not imply that the DCC conditional correlation will be also approximately constant.

(D) False, obviously (as commented during the debriefing), VaR measures performed rather poorly
especially during the 2008-2009 financial crisis.
(E) False, because we have seen that answer A is correct.
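The effect of a smoothing parameter this close to one can be illustrated by running the RiskMetrics-type recursion on simulated standardized residuals. The Python sketch below is an illustration under stated assumptions: the residuals are simulated with a true correlation of 0.5, the seed, the sample length and the comparison value 0.94 are arbitrary, and the diagonal elements are initialized at one.

```python
import math
import random

def sample_corr(x, y):
    """Sample correlation, used to initialize the off-diagonal element."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    return cov / math.sqrt(vx * vy)

def riskmetrics_dcc_corr(z1, z2, lam):
    """Path of q12 / sqrt(q11*q22) under q_{t+1} = (1-lam)*z*z' + lam*q_t."""
    q11, q22, q12 = 1.0, 1.0, sample_corr(z1, z2)
    path = []
    for a, b in zip(z1, z2):
        path.append(q12 / math.sqrt(q11 * q22))
        q11 = (1.0 - lam) * a * a + lam * q11
        q22 = (1.0 - lam) * b * b + lam * q22
        q12 = (1.0 - lam) * a * b + lam * q12
    return path

rng = random.Random(0)                      # fixed seed, arbitrary choice
z1 = [rng.gauss(0.0, 1.0) for _ in range(1000)]
z2 = [0.5 * a + math.sqrt(0.75) * rng.gauss(0.0, 1.0) for a in z1]

corr_slow = riskmetrics_dcc_corr(z1, z2, lam=0.998)
corr_fast = riskmetrics_dcc_corr(z1, z2, lam=0.94)
range_slow = max(corr_slow) - min(corr_slow)
range_fast = max(corr_fast) - min(corr_fast)
print(range_slow, range_fast)
```

With the smoothing parameter at 0.998 the implied correlation barely moves away from its initial value, whereas a lower value lets it fluctuate visibly.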
Question 3.4
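Estimation of a simple three-state, first-order MSIH(3,0) model for the bivariate vector that collects
US and UK equity index returns, $\mathbf{R}_{t+1} \equiv [R^{US}_{t+1} \ R^{UK}_{t+1}]'$,

$$\mathbf{R}_{t+1} = \boldsymbol{\mu}_{S_{t+1}} + \boldsymbol{\epsilon}_{t+1} \qquad \boldsymbol{\epsilon}_{t+1} \sim N(\mathbf{0}, \boldsymbol{\Omega}_{S_{t+1}}),$$

where $S_{t+1} = 1, 2, 3$, has returned the smoothed probability plots in the row of graphs below: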
[Figure: MSIH(3,0) Model for US and UK Stock Returns. Three panels: Regime 1 Smoothed Probabilities, Regime 2 Smoothed Probabilities, Regime 3 Smoothed Probabilities; vertical axis ticks from 0.1 to 0.9, horizontal axis ticks from 10 to 100.]


An analogous exercise of estimation of a three-state, first-order MSIH(3,0) model for the bivariate
vector that collects US and Greek equity index returns, $\mathbf{R}_{t+1} \equiv [R^{US}_{t+1} \ R^{GR}_{t+1}]'$, has instead returned
the smoothed probability plots in the row of graphs below:


[Figure: MSIH(3,0) Model for US and Greek Stock Returns. Three panels: Regime 1 Smoothed Probabilities, Regime 2 Smoothed Probabilities, Regime 3 Smoothed Probabilities; vertical axis ticks from 0.1 to 1, horizontal axis ticks from 10 to 100.]


Please indicate which of the following statements is/are correct:


(A) Both models have to be rejected and something has gone clearly wrong with the estimation
because in a MSIH(3,0) model for a bivariate vector of data, such as $[R^{US}_{t+1} \ R^{UK}_{t+1}]'$ and $[R^{US}_{t+1} \ R^{GR}_{t+1}]'$,
you would expect to be shown two plots of smoothed probabilities in each of the rows, one for each
equity market, and not three.
(B) In the case of the Markov switching model for $[R^{US}_{t+1} \ R^{GR}_{t+1}]'$, the model can hardly classify
most of the periods in the estimation sample between regimes 1, 2, and 3, which implies the existence
of considerable uncertainty as to the nature of the underlying regime.
(C) The smoothed probabilities for the two vectors of return data are likely to come from two
MSIH(3,0) models in which the estimated transition matrices have a structure like

$$\hat{\mathbf{P}}^{US,UK} = \begin{bmatrix} 0.82 & 0.11 & 0.07 \\ 0 & 1 & 0 \\ 0.04 & 0.06 & 0.90 \end{bmatrix} \qquad \hat{\mathbf{P}}^{US,GR} = \begin{bmatrix} 0.65 & 0.18 & 0.17 \\ 0.22 & 0.49 & 0.29 \\ 0.25 & 0.22 & 0.53 \end{bmatrix}$$

(D) Both models have to be rejected and something has gone clearly wrong with the estimation
because the smoothed state probabilities for each regime across the two rows of plots (i.e., the probabilities
in the first plot in the first row plus the probabilities in the first plot in the second row; the probabilities
in the second plot in the first row plus the probabilities in the second plot in the second row, etc.) fail
to sum to one, and they should.
(E) In both applications, the model MSIH(3,0) seems inappropriate to fit the two vectors of stock
return data, $[R^{US}_{t+1} \ R^{UK}_{t+1}]'$ and $[R^{US}_{t+1} \ R^{GR}_{t+1}]'$; in the former case, because the underlying Markov chain
seems to be reducible to a single state model at approximately half of the sample; in the latter case,
because there seems to be no clear regime structure.
Answer
B, C, E
Debriefing:
(A) False, as discussed in the many examples in lecture 5 and seen in the labs, for a MSIH(3,0) model
for a bivariate vector of data, one expects to find a number of smoothed probability plots equal to the
number of regimes, here three, not equalling the number of stock return series appearing in $\mathbf{R}_{t+1}$, here
two.
(B) Correct, this is exactly the information that is conveyed by the second row of plots concerning
the smoothed probabilities of the MSIH(3,0) concerning $[R^{US}_{t+1} \ R^{GR}_{t+1}]'$.
(C) Correct, because $\hat{\mathbf{P}}^{US,UK}$ implies considerable persistence of all regimes; however, it is also
clear that after the first hit at regime 2, the system is trapped in regime 2, which is a plausible
indication of failure of ergodicity of the underlying Markov chain; on the contrary, $\hat{\mathbf{P}}^{US,GR}$ implies that
no regimes are really persistent and there will be continuous switches among them.
(D) False, it is incorrect (nonsensical) to try and sum probabilities for a given regime and across
different MSIH(3,0) models, that will actually concern different sets of data; the fact that $[R^{US}_{t+1} \ R^{UK}_{t+1}]'$
and $[R^{US}_{t+1} \ R^{GR}_{t+1}]'$ have $R^{US}_{t+1}$ in common makes no difference from this point of view.
(E) Correct, when applied to US and Greek data, the MSIH(3,0) itself is trying to warn you that the
regime definition provided by the data is extremely weak so that no particular story about the nature
of the three states will be really possible; for instance, the smoothing algorithm seems to keep warning
you that some probability of having entered regime 2 exists, but such probabilities often fail to exceed
0.5; this is probably a case in which either the data contain no regimes or at least the structure of the
MSIH(3,0) ought to be re-specified. As for the US and UK data vector, see (C) and the fact that, as
commented in the lectures, Markov switching models are usually estimated under the assumption of
ergodicity.
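The contrast between the two transition matrices in alternative (C) can be checked numerically by looking for absorbing regimes and by iterating the regime distribution forward. A minimal Python sketch follows; the function names, the starting distribution and the number of steps are assumptions made for illustration.

```python
# Transition matrices from alternative (C): rows are current regimes, columns next regimes
P_us_uk = [[0.82, 0.11, 0.07],
           [0.00, 1.00, 0.00],
           [0.04, 0.06, 0.90]]
P_us_gr = [[0.65, 0.18, 0.17],
           [0.22, 0.49, 0.29],
           [0.25, 0.22, 0.53]]

def absorbing_regimes(P):
    """Regimes (numbered from 1) that are never left once entered."""
    return [i + 1 for i, row in enumerate(P) if row[i] == 1.0]

def long_run_distribution(P, start, steps=5000):
    """Iterate dist' = dist * P starting from the distribution `start`."""
    dist = list(start)
    k = len(P)
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(k)) for j in range(k)]
    return dist

start_in_regime_1 = [1.0, 0.0, 0.0]
dist_uk = long_run_distribution(P_us_uk, start_in_regime_1)
dist_gr = long_run_distribution(P_us_gr, start_in_regime_1)
print(absorbing_regimes(P_us_uk), dist_uk)
print(absorbing_regimes(P_us_gr), dist_gr)
```

For the US and UK matrix all probability mass ends up in regime 2, the absorbing state, while for the US and Greek matrix every regime keeps a sizeable long-run probability and none is persistent.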


Financial Econometrics
Rules of conduct during exams or other tests.
During exams, students must remain quiet and may not use any external support aids, whether
paper or digital (e.g. manuals, lecture notes, personal papers, books, publications, cell phones, handheld
computers or other electronic devices), if not expressly authorized by the teacher in class. In addition,
students may not copy or look at other students' exam papers or contact or attempt to contact other
people in any way. Students must remain in the classroom for the whole of the time and only for the
time needed to finish their exam, unless teachers in class give other orders. Students who have
questions for the teacher must raise their hand and wait for the examiner to come to them.
At the end of the exam, students must return the exam script and the exam paper to the examining
faculty member and leave the room.
Any breach of these regulations or any other orders given by the faculty member present at the time
of the exam will result in the test being cancelled and an official report sent to the Disciplinary Board
in all cases.
All disciplinary sanctions will be recorded in the student's academic career. Sanctions greater than
a warning will result in forfeiture of benefits for the right to study (scholarships, housing etc.).
The Honor Code and detailed regulations for taking exams and other tests are published on the
University website http://www.unibocconi.eu/honorcode.

Name and Surname (CAPITAL LETTERS)

Personal ID

Signature: I hereby undertake to respect the regulations described above and undersign my presence at the
exam.
