Merton’s Structural Model and Extension

Pioneered by Merton (1974) and Black and Scholes (1973), the structural (or asset value) model is one of the two primary classes of credit risk models; the other is the reduced-form model. It assumes that at time t a firm with risky assets A_{t} is financed by equity E_{t} and zero-coupon debt D_{t} of face value K maturing at time T>t, so that A_{t}=E_{t}+D_{t}.

When the firm’s assets are worth at least as much as its debt at time T, A_{T}\geqslant K, the debt holders are paid the full amount K and the shareholders’ equity is \left(A_{T}-K\right). On the other hand, when the firm fails to repay (and therefore defaults on) the debt at T, the debt holders recover only A_{T}<K and the shareholders get nothing. The equity value at time T can therefore be represented as a European call option on the assets A_{t} with strike price K maturing at T, E_{T}=\max\left(A_{T}-K,0\right). The asset value is assumed to follow a geometric Brownian motion, with risk-neutral dynamics given by

(1)   \begin{equation*} dA_{t}=rA_{t}dt+\sigma_{A}A_{t}dW_{t} \end{equation*}

where r denotes the risk-free interest rate, \sigma_{A} is the volatility of the asset returns, and W_{t} is a Brownian motion under the risk-neutral measure. Applying the Black-Scholes formula gives

    \[ E_{t}=A_{t}\Phi\left(d_{1}\right)-Ke^{-r\left(T-t\right)}\Phi\left(d_{2}\right) \]

where d_{1}=\frac{1}{\sigma_{A}\sqrt{T-t}}\left[\ln\left(\frac{A_{t}}{K}\right)+\left(r+\frac{\sigma_{A}^{2}}{2}\right)\left(T-t\right)\right], d_{2} = d_{1}-\sigma_{A}\sqrt{T-t}, and \Phi\left(\cdot\right) denotes the standard normal \textit{cdf}. The risk-neutral probability of default at time T is given by \textrm{P}\left(A_{T}<K\right)=\Phi\left(-d_{2}\right).
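The equity value and default probability can be sketched in a few lines of Python. This is a minimal illustration, not a calibrated implementation: the function names and the input values are made up for the example, and d_{1} uses the standard Black-Scholes term \sigma_{A}^{2}/2.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal distribution


def merton_d1_d2(A, K, r, sigma_A, tau):
    """Return (d1, d2) for asset value A, debt face value K, time to maturity tau."""
    d1 = (log(A / K) + (r + 0.5 * sigma_A ** 2) * tau) / (sigma_A * sqrt(tau))
    d2 = d1 - sigma_A * sqrt(tau)
    return d1, d2


def merton_equity(A, K, r, sigma_A, tau):
    """Equity value as a European call on the firm's assets."""
    d1, d2 = merton_d1_d2(A, K, r, sigma_A, tau)
    return A * N.cdf(d1) - K * exp(-r * tau) * N.cdf(d2)


def merton_default_prob(A, K, r, sigma_A, tau):
    """Risk-neutral probability that A_T < K."""
    _, d2 = merton_d1_d2(A, K, r, sigma_A, tau)
    return N.cdf(-d2)


# Hypothetical inputs: assets 120, debt 100, r = 3%, sigma_A = 25%, one year to maturity
E = merton_equity(120.0, 100.0, 0.03, 0.25, 1.0)
pd = merton_default_prob(120.0, 100.0, 0.03, 0.25, 1.0)
print(f"equity = {E:.2f}, risk-neutral PD = {pd:.4f}")
```

In practice A_{t} and \sigma_{A} are not observed directly and have to be inferred, for example from equity prices; that step is outside this sketch.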

A typical strategy for debt holders to protect themselves from credit risk is to buy a put option P_{t} on A_{t} with strike K maturing at T. The put option is worth \left(K-A_{T}\right) if A_{T}<K, and nothing if A_{T}\geqslant K. Purchasing the put option hedges the credit risk of the loan completely, as the debt holder’s payoff equals K at maturity whether or not the obligor defaults. The combined position is therefore risk-free:

(2)   \begin{equation*} D_{t}+P_{t}=Ke^{-r\left(T-t\right)}. \end{equation*}

The price of the put option P_{t} follows from the Black-Scholes formula:

(3)   \begin{equation*} P_{t}=Ke^{-r\left(T-t\right)}\Phi\left(-d_{2}\right)-A_{t}\Phi\left(-d_{1}\right). \end{equation*}

Taking into account the credit spread (risk premium) s, the value of the risky bond is

(4)   \begin{equation*} D_{t}=Ke^{-\left(r+s\right)\left(T-t\right)}. \end{equation*}

Combining Eqs. (2)-(4) gives a closed-form formula for the credit spread

    \[ s=-\frac{1}{T-t}\ln\left[\Phi\left(d_{2}\right)+\frac{A_{t}}{K}e^{r\left(T-t\right)}\Phi\left(-d_{1}\right)\right] \]

where \frac{A_{t}}{K} measures the firm’s leverage (a lower ratio means higher leverage). Note that, for a given risk-free rate r, face value K and time to maturity T-t, s depends only on A_{t} and \sigma_{A}, which is in line with economic intuition. Their nonlinear relationship can be observed in the figures below.
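As a rough numerical check, the spread can be computed directly from Eqs. (2)-(4) rather than from the closed form: price the put, subtract it from the risk-free bond, and solve Eq. (4) for s. The function name and inputs below are illustrative assumptions.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist()


def merton_credit_spread(A, K, r, sigma_A, tau):
    """Spread s implied by D = K e^{-r tau} - P and D = K e^{-(r+s) tau}."""
    d1 = (log(A / K) + (r + 0.5 * sigma_A ** 2) * tau) / (sigma_A * sqrt(tau))
    d2 = d1 - sigma_A * sqrt(tau)
    put = K * exp(-r * tau) * N.cdf(-d2) - A * N.cdf(-d1)  # Eq. (3)
    debt = K * exp(-r * tau) - put                         # Eq. (2)
    return -log(debt / K) / tau - r                        # Eq. (4) solved for s


# Hypothetical firm: assets 120, debt 100, r = 3%, one year to maturity
for sigma_A in (0.10, 0.25, 0.40):
    s = merton_credit_spread(120.0, 100.0, 0.03, sigma_A, 1.0)
    print(f"sigma_A = {sigma_A:.2f} -> spread = {1e4 * s:.1f} bp")
```

The loop shows the spread widening as asset volatility rises, with the other inputs held fixed.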

Many approaches have been proposed to improve the classical Merton model. The first-passage model introduced by Black and Cox (1976) allows the firm to default at any time before the debt matures. Jones et al. (1984) suggest introducing stochastic interest rates to improve the model’s performance. Longstaff and Schwartz (1995) employ a Vasicek process for the interest rate, dr_{t}=\left(a-br_{t}\right)dt+\sigma_{r}dW_{t}^{\left(r\right)}, while Kim et al. (1993) consider a CIR process, dr_{t}=\left(a-br_{t}\right)dt+\sigma_{r}\sqrt{r_{t}}dW_{t}^{\left(r\right)}, and Briys and De Varenne (1997) let the interest rate follow a generalized Vasicek process, dr_{t}=\left(a\left(t\right)-b\left(t\right)r_{t}\right)dt +\sigma_{r}\left(t\right)dW_{t}^{\left(r\right)}. Comparing the Merton model with four other structural models, Eom et al. (2004) find substantial spread prediction errors: the Merton model underestimates the spreads observed in the market, while most of the other models overestimate them on average.


  • Black, F., Cox, J. C., 1976. Valuing Corporate Securities: Some Effects of Bond Indenture Provisions. Journal of Finance 31, 351-367.
  • Black, F., Scholes, M., 1973. The Pricing of Options and Corporate Liabilities. Journal of Political Economy 81, 637-654.
  • Briys, E., De Varenne, F., 1997. Valuing Risky Fixed Rate Debt: An Extension. Journal of Financial and Quantitative Analysis 32 (2).
  • Eom, Y., Helwege, J., Huang, J., 2004. Structural Models of Corporate Bond Pricing: An Empirical Analysis. Review of Financial Studies 17 (2), 499-544.
  • Jones, E., Mason, S., Rosenfeld, E., 1984. Contingent Claims Analysis of Corporate Capital Structures. Journal of Finance 39 (3), 611-625.
  • Kim, I. J., Ramaswamy, K., Sundaresan, S., 1993. Does Default Risk in Coupons Affect the Valuation of Corporate Bonds?: A Contingent Claims Model. Financial Management, 117-131.
  • Longstaff, F. A., Schwartz, E. S., 1995. A Simple Approach to Valuing Risky Fixed and Floating Rate Debt. Journal of Finance 50, 789-819.
  • Merton, R. C., 1974. On the Pricing of Corporate Debt: The Risk Structure of Interest Rates. Journal of Finance 29 (2), 449-470.

Risk Measures

Suppose there are different types (i.e. distributions) of assets, all with the same volatility and mean. Standard mean-variance analysis indicates that all these assets are equally risky. In reality, however, market participants view the risk in them differently.

In practice, comparing risks is difficult because the underlying distribution of market prices and returns of various assets is unknown. One can try

  • to identify the distribution by maximum likelihood methods
  • to test the distribution against other distributions by using methods such as the Kolmogorov-Smirnov test

Practically, it is impossible to accurately identify the distribution of financial returns.

The most common approach to the problem of comparing the risk of assets having different distributions is to employ a risk measure that represents the risk of an asset as a single number that is comparable across assets.

Three risk measures: Volatility, Value-at-Risk, Expected Shortfall


Volatility is sufficient as a risk measure only when financial returns are normally distributed.


VaR is a single summary statistical measure of risk. It is distribution independent.

The three steps in VaR calculations:

  1. to specify the probability, p, of losses exceeding VaR: 1% (the most common); 0.1% for applications like economic capital or long-run risk analysis for pension funds
  2. to specify the holding period: usually one day
  3. to specify the probability distribution of the P/L of the portfolio: by using past observations and a statistical model.

There are three main issues that arise in the implementation of VaR:

  • VaR is only a quantile on the P/L distribution.
  • VaR is not a coherent risk measure because it is not subadditive in general. It is subadditive in the special case of normally distributed returns.
  • VaR is easy to manipulate
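The failure of subadditivity can be illustrated with a small discrete example. The bond below is hypothetical (a 4% chance of losing 100, otherwise no loss), and VaR is taken as the smallest loss level that is exceeded with probability at most p.

```python
from itertools import product


def var_discrete(outcomes, p):
    """VaR at tail probability p for a discrete loss distribution.

    outcomes: list of (loss, probability) pairs. Returns the smallest loss
    level l such that P(loss > l) <= p.
    """
    levels = sorted({loss for loss, _ in outcomes})
    for level in levels:
        tail = sum(prob for loss, prob in outcomes if loss > level)
        if tail <= p + 1e-12:  # small tolerance for floating-point sums
            return level
    return levels[-1]


# Hypothetical bond: loses 100 with probability 4%, otherwise loses nothing
bond = [(0.0, 0.96), (100.0, 0.04)]

# Portfolio of two independent bonds of this kind
portfolio = [(l1 + l2, p1 * p2) for (l1, p1), (l2, p2) in product(bond, bond)]

var_single = var_discrete(bond, 0.05)
var_port = var_discrete(portfolio, 0.05)
print(var_single, var_port)
```

Each bond on its own has a 5% VaR of zero, because default is less likely than 5%. The two-bond portfolio suffers at least one default with probability 1 - 0.96^2 = 7.84%, so its 5% VaR is 100, which exceeds the sum of the individual VaRs.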

Expected Shortfall

Also known as tail VaR or conditional VaR (CVaR), ES measures the expected loss when losses exceed VaR.

The ES is the negative expected value of P/L, denoted Q, over the tail density f_{\textrm{VaR}}\left(\cdot\right)

    \begin{eqnarray*} \textrm{ES} & = & -E\left[Q\mid Q\leqslant-\textrm{VaR}\left(p\right)\right]\\ & = & -\int_{-\infty}^{-\textrm{VaR}\left(p\right)}xf_{\textrm{VaR}}\left(x\right)dx \end{eqnarray*}

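If the P/L distribution is standard normal, then f_{\textrm{VaR}}\left(x\right)=\phi\left(x\right)/p and -\textrm{VaR}\left(p\right)=\Phi^{-1}\left(p\right), so the integral evaluates to

    \[ \textrm{ES}=\frac{\phi\left(\Phi^{-1}\left(p\right)\right)}{p} \]

where \phi and \Phi are the standard normal density and distribution function respectively. ES is thus reported as a positive number, like VaR.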
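A short Python sketch of VaR and ES under a zero-mean normal P/L, with both measures expressed as positive losses. The function names and the optional `sigma` argument are illustrative assumptions; only the standard library is used.

```python
from statistics import NormalDist

N = NormalDist()  # standard normal distribution


def var_normal(p, sigma=1.0):
    """VaR at tail probability p for zero-mean normal P/L with volatility sigma."""
    return -sigma * N.inv_cdf(p)


def es_normal(p, sigma=1.0):
    """Expected shortfall (as a positive loss) for zero-mean normal P/L."""
    return sigma * N.pdf(N.inv_cdf(p)) / p


for p in (0.05, 0.01):
    print(f"p = {p:.2f}: VaR = {var_normal(p):.3f}, ES = {es_normal(p):.3f}")
```

For p = 1% this gives a VaR of about 2.326 and an ES of about 2.665 standard deviations; ES always exceeds VaR at the same probability because it averages over the losses beyond it.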


Advantages of using ES:

  1. Any bank that has a VaR-based risk management system could implement ES without much additional effort.
  2. ES is subadditive while VaR is not.

However, in practice the vast majority of financial institutions employ VaR and not ES. The reasons may be:

  1. ES is measured with more uncertainty than VaR. The first step in ES estimation is ascertaining the VaR and the second step is obtaining the expectation of tail observations. This means that there are at least two sources of error in ES.
  2. More importantly, ES is much harder to backtest than VaR because the ES procedure requires estimates of the tail expectation to compare with the ES forecast. Therefore, in backtesting, ES can only be compared with the output from a model while VaR can be compared with actual observations.

Holding Periods

In practice, the most common holding period is daily, but many other holding periods are also employed: e.g. hourly (or every 20/10-min) 90% VaR is used on the trading floor.

The Basel Accords require financial institutions to model risk using 10-day holding periods. The majority of risk managers employ scaling laws to obtain such risk levels.

Square-root-of-time scaling \sqrt{T}

The rule supposes that the observed random variables \left\{X_{t}\right\} are IID with variance \sigma^2. The variance of the sum of T consecutive Xs is then

    \[ \textrm{Var}\left(X_{t}+X_{t+1}+\ldots+X_{t+T-1}\right)=\textrm{Var}\left(X_{t}\right)+\textrm{Var}\left(X_{t+1}\right)+\ldots+\textrm{Var}\left(X_{t+T-1}\right)=T\sigma^{2} \]

This implies that volatility scales up by \sqrt{T}.

The square-root-of-time scaling rule does not apply to VaR unless we assume the returns are IID normal. Multi-day VaR forecasts should therefore not be obtained by scaling up daily VaR by \sqrt{T}, although the 1996 amendment to the Basel Accords explicitly recommends doing so.
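The special case in which the rule is exact can be checked numerically. Assuming IID zero-mean normal returns and a made-up daily volatility of 1.2%, scaling the daily VaR by \sqrt{10} gives the same number as computing VaR from the 10-day volatility.

```python
from math import sqrt
from statistics import NormalDist


def scale_volatility(daily_vol, horizon_days):
    """Volatility over horizon_days under the IID assumption."""
    return daily_vol * sqrt(horizon_days)


daily_vol = 0.012                  # hypothetical daily volatility
p = 0.01                           # VaR tail probability
z = NormalDist().inv_cdf(p)        # standard normal quantile (negative)

daily_var = -daily_vol * z
ten_day_var_scaled = daily_var * sqrt(10)
ten_day_var_direct = -scale_volatility(daily_vol, 10) * z
print(ten_day_var_scaled, ten_day_var_direct)
```

The equality holds only because a sum of IID normals is again normal; for fat-tailed or dependent returns the two numbers generally differ.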

Multivariate Volatility Models

Most applications deal with portfolios where it is necessary to forecast the entire covariance matrix of asset returns.

Consider the univariate volatility model:

    \[ Y_{t} = \sigma_{t} Z_{t} \]

where Y_{t} are returns; \sigma_{t} is conditional volatility, and Z_{t} are random shocks.


The multivariate form of EWMA is

    \[ \hat{\Sigma}_{t}=\lambda\hat{\Sigma}_{t-1}+\left(1-\lambda\right)y_{t-1}^{\prime}y_{t-1} \]

with an individual element given by

    \[ \hat{\sigma}_{t,ij}=\lambda\hat{\sigma}_{t-1,ij}+\left(1-\lambda\right)y_{t-1,i}y_{t-1,j}\quad i,j=1,\ldots,K \]

where \lambda = 0.94 as per RiskMetrics.
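The recursion can be sketched in plain Python. The function name, the tiny sample and the choice to initialise with the sample covariance are assumptions made for illustration; returns are assumed to be de-meaned.

```python
def ewma_covariance(returns, lam=0.94):
    """Run the multivariate EWMA recursion over a list of return vectors.

    returns: list of T lists, each holding K (de-meaned) returns.
    The recursion starts from the sample covariance, which is one possible
    choice of initial value. Returns the forecast for the period after the
    last observation.
    """
    T, K = len(returns), len(returns[0])
    sigma = [[sum(y[i] * y[j] for y in returns) / T for j in range(K)]
             for i in range(K)]
    for y in returns:
        sigma = [[lam * sigma[i][j] + (1 - lam) * y[i] * y[j]
                  for j in range(K)] for i in range(K)]
    return sigma


# Made-up sample of two-asset daily returns
sample = [[0.010, 0.008], [-0.020, -0.015], [0.005, 0.007], [0.012, -0.003]]
cov = ewma_covariance(sample)
corr = cov[0][1] / (cov[0][0] * cov[1][1]) ** 0.5
print(cov, corr)
```

Because every element is updated with the same \lambda and the same outer product, the resulting matrix stays symmetric and positive semi-definite.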


Orthogonal GARCH (OGARCH)

It is usually very hard to estimate multivariate GARCH models. In practice, alternative methodologies for obtaining the covariance matrix are needed.

The orthogonal approach transforms linearly the observed returns matrix into a set of portfolios with the key property that they are uncorrelated, implying we can forecast their volatilities separately. This makes use of principal components analysis (PCA).

Orthogonalising covariance

The first step is to transform the return matrix y^{\left\{T\times K\right\}} into uncorrelated portfolios u^{\left\{T\times K\right\}}. Denote by \hat{R}^{\left\{K\times K\right\}} the sample correlation matrix of y^{\left\{T\times K\right\}}. We then calculate the orthogonal matrix of eigenvectors of \hat{R}^{\left\{K\times K\right\}}, denoted by \Lambda^{\left\{K\times K\right\}}. Then u^{\left\{T\times K\right\}} is defined by:

    \[ u^{\left\{T\times K\right\}}=y^{\left\{T\times K\right\}} \times \Lambda^{\left\{K\times K\right\}}. \]

The columns of u^{\left\{T\times K\right\}} are uncorrelated with each other, so we can run a univariate GARCH or a similar model on each column of u^{\left\{T\times K\right\}} separately to obtain its conditional variance forecast; these forecasts form the diagonal matrix \hat{D}_{t}. We then obtain the forecast of the conditional covariance matrix of the returns by:

    \[ \hat{\Sigma}_{t}=\Lambda \hat{D}_{t} \Lambda^{\prime}. \]

This implies that the covariance terms can be ignored when modeling the covariance matrix of u, and the problem has been reduced to a series of univariate estimations.

Large-scale implementations

In the above example, all the principal components (PCs) were used to construct the conditional covariance matrix. However, it is possible to use just a few of the columns. The highest eigenvalue corresponds to the most important principal component, the one that explains most of the variation in the data.

Such approaches are in widespread use because it is possible to construct the conditional covariance matrix for a very large number of assets. In a highly correlated environment, just a few principal components are required to represent system variation to a very high degree of accuracy. This is much easier than forecasting all volatilities directly in one go.

PCA also facilitates building a covariance matrix for an entire financial institution by iteratively combining the covariance matrices of the various trading desks, simply by using one or perhaps two principal components. For example, one can create the covariance matrices of small caps and large caps separately and use the first principal component to combine them into the covariance matrix of all equities. This can then be combined with the covariance matrix for fixed income assets, etc.

Correlation Models

Constant conditional correlations (CCC)

Bollerslev (1990) proposes the constant conditional correlations (CCC) model, where time-varying covariances are proportional to the product of the corresponding conditional standard deviations. The conditional covariance matrix \hat{\Sigma}_{t} consists of two components that are estimated separately: sample correlations \hat{R} and the diagonal matrix of time-varying volatilities \hat{D}_{t}.

    \[ \hat{\Sigma}_{t} = \hat{D}_{t} \hat{R} \hat{D}_{t} \]

where

    \[ \hat{D}_{t}=\left(\begin{array}{ccc} \hat{\sigma}_{t,1} & 0 & 0\\ 0 & \ddots & 0\\ 0 & 0 & \hat{\sigma}_{t,K} \end{array}\right). \]

The volatility of each asset \hat{\sigma}_{t,k} follows a GARCH process or any of the univariate models discussed here.

This model guarantees the positive definiteness of \hat{\Sigma}_{t} if \hat{R} is positive definite.
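Since \hat{D}_{t} is diagonal, the product \hat{D}_{t}\hat{R}\hat{D}_{t} reduces to scaling each correlation by the two relevant volatilities. A minimal sketch with hypothetical volatility forecasts and a hypothetical correlation matrix:

```python
def ccc_covariance(vols, corr):
    """Sigma_t = D_t R D_t for volatilities `vols` and correlation matrix `corr`."""
    K = len(vols)
    return [[vols[i] * corr[i][j] * vols[j] for j in range(K)] for i in range(K)]


vols_t = [0.012, 0.020]            # hypothetical one-day volatility forecasts
R = [[1.0, 0.3], [0.3, 1.0]]       # hypothetical constant correlation matrix
sigma_t = ccc_covariance(vols_t, R)
print(sigma_t)
```

Only `vols_t` changes from day to day; `R` is estimated once and reused.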

Dynamic conditional correlations (DCC)

The assumption that correlations are constant over time is at odds with the vast amount of empirical evidence supporting nonlinear dependence. To correct this defect, Engle (2002) and Tse and Tsui (2002) propose the dynamic conditional correlations (DCC) model as an extension of the CCC model.

Unlike in the CCC model, the correlation matrix is time dependent within the DCC framework:

    \[ \hat{R}_{t} = \hat{Q}_{t}^{\ast} \hat{Q}_{t} \hat{Q}_{t}^{\ast}, \qquad \hat{Q}_{t}^{\ast}=\textrm{diag}\left(\hat{Q}_{t}\right)^{-1/2} \]

where \hat{Q}_{t}^{\ast} is a diagonal rescaling matrix that ensures \hat{R}_{t} has ones on its diagonal, and \hat{Q}_{t} is a symmetric positive definite autoregressive matrix given by

    \[ \hat{Q}_{t}=\left(1-\zeta-\xi\right)\bar{Q}+\zeta Y_{t-1}^{\prime}Y_{t-1}+\xi\hat{Q}_{t-1} \]

where \bar{Q} is the K\times K unconditional covariance matrix of Y; \zeta,\xi > 0 and \zeta + \xi <1 to ensure positive definiteness and stationarity, respectively.

  • Pros: it can be estimated in two steps: one for parameters determining univariate volatilities and another for parameters determining the correlations.
  • Cons: parameters \zeta and \xi are constants implying that the conditional correlations of all assets are driven by the same underlying dynamics — often an unrealistic assumption.

When we compare the correlations estimated by the above three models: EWMA, OGARCH and DCC, we will find the correlation forecasts for EWMA seem to be most volatile. Both DCC and OGARCH models have more stable correlations with the OGARCH having the lowest fluctuations but the highest average correlations. The large swings in EWMA correlations might be an overreaction.

Multivariate Extensions of GARCH

It is conceptually straightforward to develop multivariate extensions of the univariate GARCH-type models — such as multivariate GARCH (MVGARCH). Unfortunately, it is more difficult in practice because the most obvious model extensions result in the number of parameters exploding as the number of assets increases.

The BEKK model

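There are a number of alternative MVGARCH models available, but the BEKK model, proposed by Engle and Kroner (1995), is probably the most widely used. The matrix of conditional covariances, \Sigma_{t}, is a function of the outer product of lagged returns and of lagged conditional covariance matrices.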

The general BEKK \left(L_{1} ,L_{2} ,K \right) model is given by

    \[ \Sigma_{t}=\Omega\Omega^{\prime}+\sum_{k=1}^{K}\sum_{i=1}^{L_{1}}A_{i,k}^{\prime}Y_{t-i}^{\prime}Y_{t-i}A_{i,k}+\sum_{k=1}^{K}\sum_{j=1}^{L_{2}}B_{j,k}^{\prime}\Sigma_{t-j}B_{j,k} \]

The number of parameters in the BEKK(1,1,2) model is K(5K+1)/2, i.e. 11 in the two-asset case:

    \begin{eqnarray*} \Sigma_{t} & = & \left(\begin{array}{cc} \sigma_{t,11} & \sigma_{t,12}\\ \sigma_{t,12} & \sigma_{t,22} \end{array}\right)\\  & = & \underbrace{\left(\begin{array}{cc} \omega_{11} & 0\\ \omega_{21} & \omega_{22} \end{array}\right)}_{\Omega}\underbrace{\left(\begin{array}{cc} \omega_{11} & 0\\ \omega_{21} & \omega_{22} \end{array}\right)^{\prime}}_{\Omega^{\prime}}+\underbrace{\left(\begin{array}{cc} \alpha_{11} & \alpha_{12}\\ \alpha_{21} & \alpha_{22} \end{array}\right)^{\prime}}_{A^{\prime}}\underbrace{\left(\begin{array}{cc} Y_{t-1,1}^{2} & Y_{t-1,1}Y_{t-1,2}\\ Y_{t-1,2}Y_{t-1,1} & Y_{t-1,2}^{2} \end{array}\right)}_{Y_{t-1}^{\prime}Y_{t-1}}\underbrace{\left(\begin{array}{cc} \alpha_{11} & \alpha_{12}\\ \alpha_{21} & \alpha_{22} \end{array}\right)}_{A}\\  &  & +\underbrace{\left(\begin{array}{cc} \beta_{11} & \beta_{12}\\ \beta_{21} & \beta_{22} \end{array}\right)^{\prime}}_{B^{\prime}}\underbrace{\left(\begin{array}{cc} \sigma_{t-1,11} & \sigma_{t-1,12}\\ \sigma_{t-1,21} & \sigma_{t-1,22} \end{array}\right)}_{\Sigma_{t-1}}\underbrace{\left(\begin{array}{cc} \beta_{11} & \beta_{12}\\ \beta_{21} & \beta_{22} \end{array}\right)}_{B} \end{eqnarray*}

where \omega, \alpha and \beta are coefficients. The idea behind the BEKK and DCC models is similar: volatilities and correlations depend on their own past realisations and on shocks from squared asset returns.

  • Cons: too many parameters. This implies those parameters may be hard to interpret. Furthermore, many parameters are often found to be statistically insignificant, which suggests the model may be overparameterized.

Extreme Value Theory

Analogous to the central limit theorem, where the normal distribution acts as the limit for the distribution of the mean of a large number of i.i.d. random variables, extreme value theory (EVT) investigates the limit distribution of the sample maximum.

Empirical models of financial returns based on distributional assumptions such as Gaussian, Student’s t and GED are often chosen for their ability to fit data near the mode, given that only a few observations fall in the distribution tails by definition. But effective risk management requires accurate estimation of the likelihood of rare events that could trigger catastrophic losses. Extreme value theory can be useful for this purpose because it is specifically aimed at modelling tail behaviour without requiring assumptions on the entire distribution, i.e. it provides a semi-parametric model for the tails of distribution functions.

Pros: much more accurate for applications focusing on the extremes
Cons: by definition, few extreme observations are available for estimation

EVT can be useful to explicitly identify the type of asymmetry in the extreme tails.

Regardless of the overall shape of the distribution, the tails of all distributions fall into one of three categories as long as the distribution of an asset return series does not change over time:

  • Weibull: Thin tails where the distribution has a finite endpoint
  • Gumbel: Tails decline exponentially
  • Frechet: Tails decline by a power law

Block maxima and peaks-over-threshold are the two main EVT modeling methodologies.

Generalized extreme value distribution

Let \{x_{t}\},\: t=1,..,T, denote an iid process with distribution F\left(x\right). The maximum of a block of n<T observations, called the block maximum and denoted M_{n}=\max\left(x_{1},\ldots,x_{n}\right), follows asymptotically the probability distribution

    \begin{equation*} \textrm{P}\left[\frac{M_{n}-b_{n}}{a_{n}}\leqslant y\right]=F^{n}\left(a_{n}y+b_{n}\right)\rightarrow G\left(y\right),\qquad n\rightarrow+\infty \end{equation*}

for all y\in\mathbb{R}, where a_{n}>0 and b_{n} are appropriate constants, F^{n}\left(\cdot\right) is F\left(\cdot\right) raised to the power of n, and G\left(\cdot\right) is a non-degenerate distribution function. According to the Extremal Types Theorem, the block maxima distribution G\left(\cdot\right) must be either Frechet, negative Weibull or Gumbel; these three distributions can be cast as members of the Generalized Extreme Value distribution (GEV) with cdf given by

    \begin{equation*} G\left(y\right)=\begin{cases} \exp\left\{ -\left(1+\xi\frac{y-\mu}{\beta}\right)^{-1/\xi}\right\} & \quad\xi\neq0\\ \exp\left\{ -e^{-\frac{y-\mu}{\beta}}\right\} & \quad\xi=0 \end{cases}, \end{equation*}

where \mu,\:\beta>0 and \xi are location, scale and shape parameters, respectively.

The GEV becomes the Frechet distribution for \xi>0, the negative Weibull distribution for \xi<0, and the Gumbel distribution for \xi=0.

Generalized Pareto distribution

Let \{x_{t}-u\}\: t=1,..,T, denote the exceedances or peaks-over-threshold process where x_{t}>u and u denotes a threshold loss. The exceedances distribution can be formalized as

    \[ \Pr\left[x_{t}-u\leqslant y\mid x_{t}>u\right]=\frac{F\left(y+u\right)-F\left(u\right)}{1-F\left(u\right)}\rightarrow H\left(y\right),\quad t=1,\ldots,T. \]

According to the Pickands-Balkema-de-Haan Theorem, for a sufficiently large threshold loss u, the exceedances distribution can be approximated by the Generalized Pareto Distribution (GPD) as

    \begin{equation*} H\left(y\right)=\begin{cases} 1-\left(1+\xi\frac{y}{\beta}\right)^{-1/\xi} & \quad\xi\neq0\\ 1-\exp\left\{ -\frac{y}{\beta}\right\} & \quad\xi=0 \end{cases}, \end{equation*}

where \beta>0 and \xi are scale and shape parameters, respectively. GPD nests the exponential distribution (\xi=0), the heavy-tailed Pareto Type I distribution (\xi>0) and the short-tailed Pareto Type II distribution (\xi<0).

The parameters of GPD are estimated by maximizing the corresponding log-likelihood function

    \begin{eqnarray*} \ln\mathfrak{L}(y_{1},\ldots,y_{N_{u}};\beta,\xi) & = & \sum_{j=1}^{N_{u}}\ln h\left(y_{j};\beta,\xi\right)\\  & = & -N_{u}\ln\beta-\left(1+\frac{1}{\xi}\right)\sum_{j=1}^{N_{u}}\ln\left(1+\xi\frac{y_{j}}{\beta}\right) \end{eqnarray*}

where N_{u} is the total number of observed exceedances y_{j}\equiv x_{j}-u for given threshold u.
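The log-likelihood for \xi\neq0 translates directly into code. The sketch below uses made-up exceedances and a crude grid search in place of a proper numerical optimiser; the function name and the grid bounds are assumptions for illustration only.

```python
from math import log


def gpd_loglik(exceedances, beta, xi):
    """GPD log-likelihood for exceedances y_j = x_j - u (case xi != 0)."""
    if beta <= 0 or xi == 0:
        raise ValueError("requires beta > 0 and xi != 0")
    total = 0.0
    for y in exceedances:
        z = 1.0 + xi * y / beta
        if z <= 0:                 # observation outside the GPD support
            return float("-inf")
        total += log(z)
    return -len(exceedances) * log(beta) - (1.0 + 1.0 / xi) * total


# Made-up exceedances over some threshold u
ys = [0.3, 1.1, 0.2, 2.5, 0.7, 0.1, 4.0, 0.9]

# Crude grid search over (beta, xi); a real application would use an optimiser
grid = [(b / 10, x / 10) for b in range(2, 31) for x in range(-5, 11) if x != 0]
beta_hat, xi_hat = max(grid, key=lambda bx: gpd_loglik(ys, *bx))
print(beta_hat, xi_hat)
```

With so few observations the estimates are very imprecise, which is the practical difficulty noted earlier for EVT.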

Hill Method

Alternatively, one can use Hill method to estimate the tail distribution.
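The text does not spell the estimator out, so the sketch below uses the standard Hill estimator: the average log-ratio of the k largest losses to the (k+1)-th largest. The function name and the synthetic sample (exact quantiles of a Pareto tail with \xi=0.5) are assumptions for illustration.

```python
from math import exp, log


def hill_estimator(losses, k):
    """Hill estimate of the shape parameter xi from the k largest losses.

    losses: positive loss observations; k: number of upper order statistics.
    The implied tail index is 1 / xi_hat.
    """
    ordered = sorted(losses, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < number of observations")
    threshold = ordered[k]         # the (k+1)-th largest observation
    return sum(log(x / threshold) for x in ordered[:k]) / k


# Synthetic losses: evenly spaced quantiles of a Pareto tail with xi = 0.5
n = 1000
sample = [(1 - (i + 0.5) / n) ** -0.5 for i in range(n)]
xi_hat = hill_estimator(sample, 100)
print(xi_hat)
```

The estimate depends on k, which is why the choice of threshold matters.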

Finding the threshold

Several methods have been proposed to determine the optimal threshold.

  1. The most common approach is the eyeball method where we look for a region where the tail index seems to be stable.
  2. More formal methods are based on minimizing the mean squared error (MSE) of the Hill estimator

Univariate Volatility Modelling

A key modeling difficulty is that market volatility is not directly observable — unlike market prices it is a latent variable. Volatility must therefore be inferred by looking at how much market prices move.

We usually assume that mean return is zero. While this is obviously not correct, the daily mean is orders of magnitude smaller than volatility and therefore can usually be safely ignored for the purpose of volatility forecasting.

Moving average (MA) model

The most obvious and easy way to forecast volatility is simply to calculate the sample standard deviation from a sample of returns. Over time, we would keep the sample size W_{E} constant, and every day add the newest return to the sample and drop the oldest. This method is called the moving average (MA) model.

    \[\hat { \sigma } ^{ 2 }_{ t }=\frac { 1 }{ W_{ E } } \sum _{ i=1 }^{ W _E}{y^2_{t-i} }\]

One key shortcoming of MA models is that observations are equally weighted. In practice, this method should not be used. It is very sensitive to the choice of estimation window length.

Exponentially weighted moving average (EWMA) model a.k.a. RiskMetrics

The moving average model can be improved by assigning greater weights to more recent observations.

    \begin{eqnarray*} \sigma_{t}^{2} & = & \left(1-\lambda\right)y_{t-1}^{2}+\left(1-\lambda\right)\lambda y_{t-2}^{2}+\left(1-\lambda\right)\lambda^{2}y_{t-3}^{2}+\cdots\\  & = & \lambda\sigma_{t-1}^{2}+\left(1-\lambda\right)y_{t-1}^{2} \end{eqnarray*}

RiskMetrics is a branded version of EWMA that sets \lambda=0.94:

    \[ \sigma_{t}^{2}=0.94\sigma_{t-1}^{2}+0.06y_{t-1}^{2} \]

EWMA can be thought of as a special case of GARCH(1,1):

    \[ \begin{array}{rl} \begin{array}{r} GARCH(1,1):\\ \\ \end{array} & \begin{array}{ll} \sigma_{t}^{2} & =\omega+\alpha y_{t-1}^{2}+\beta\sigma_{t-1}^{2}\\  & =\alpha y_{t-1}^{2}+\left(1-\alpha\right)\sigma_{t-1}^{2}\leftarrow\textrm{set }\omega=0,\alpha+\beta=1 \end{array}\end{array} \]

Cons: \lambda is constant and identical for all assets.

Pros: 1) it can be implemented much more easily than most alternatives; 2) multivariate forms can be applied in a straightforward fashion. Coupled with the fact that it often gives reasonable forecasts, EWMA is often the method of choice.

GARCH model and its extension models

Most volatility models are based on using returns that have been de-meaned (i.e., the unconditional mean has been subtracted from the returns). For random variables (RVs) Y_t, de-meaned means E(Y_{t})=0.

The innovation in returns is driven by random shocks Z_t where Z_{t} \sim D(0,1).

The return Y_t can then be expressed as:

    \[Y_{t} = \sigma_{t} Z_{t}\]



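The ARCH(p) model specifies the conditional variance as a function of past squared returns:

    \[\sigma_{t}^{2}=\omega+\sum_{i=1}^{p}\alpha_{i}Y_{t-i}^{2}\]

where p is the number of lags.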

One of the biggest problems with the ARCH model concerns the long lag lengths required to capture the impact of historical returns on current volatility.



The GARCH(p,q) model addresses this problem by adding lagged conditional variances:

    \[\sigma_{t}^{2}=\omega+\sum_{i=1}^{p}\alpha_{i}Y_{t-i}^{2}+\sum_{j=1}^{q}\beta_{j}\sigma_{t-j}^{2}\]

where p and q are the orders of the ARCH and GARCH terms respectively. For GARCH(1,1), \omega , \alpha, \beta > 0 and \alpha + \beta <1. The unconditional variance of GARCH(1,1) is given by

    \begin{eqnarray*} \sigma^{2} & = & E\left(\omega+\alpha Y_{t-1}^{2}+\beta\sigma_{t-1}^{2}\right)\\ & = & \omega+\alpha\sigma^{2}+\beta\sigma^{2}\\ \Rightarrow\quad\sigma^{2} & = & \frac{\omega}{1-\alpha-\beta} \end{eqnarray*}

The unconditional variance is infinite when \alpha + \beta = 1 and undefined when \alpha + \beta >1.

Multiperiod volatility

To obtain the volatility n-days-ahead:

    \[\sigma_{t+n|t}^{2}=\sigma^{2}+\left(\alpha+\beta\right)^{n-1}\left(\sigma_{t+1}^{2}-\sigma^{2}\right),\; n\geq1\]

For one step ahead (n=1), the conditional variance can be written in terms of deviations from the unconditional variance:

    \begin{eqnarray*} \sigma_{t+1 \mid t}^{2}=E_{t}\left(Y_{t+1}^{2}\right) & = & \omega+\alpha Y_{t}^{2}+\beta\sigma_{t}^{2}\\ & = & \underbrace{\omega+\left(\alpha+\beta\right)\sigma^{2}}+\alpha\left(Y_{t}^{2}-\sigma^{2}\right)+\beta\left(\sigma_{t}^{2}-\sigma^{2}\right)\\ & = & \hspace{3em}\sigma^{2}\hspace{3em}+\alpha\left(Y_{t}^{2}-\sigma^{2}\right)+\beta\left(\sigma_{t}^{2}-\sigma^{2}\right) \end{eqnarray*}

We can now derive two-step-ahead volatility:

    \begin{eqnarray*} \sigma_{t+2\mid t}^{2}=E_{t}\left(Y_{t+2}^{2}\right) & = & E_{t}\left(E_{t+1}\left(Y_{t+2}^{2}\right)\right)\\ & = & E_{t}\left(\sigma^{2}+\alpha\left(Y_{t+1}^{2}-\sigma^{2}\right)+\beta\left(\sigma_{t+1}^{2}-\sigma^{2}\right)\right)\\ & = & \sigma^{2}+\alpha\left(E_{t}\left(Y_{t+1}^{2}\right)-\sigma^{2}\right)+\beta\left(\sigma_{t+1}^{2}-\sigma^{2}\right)\\ & = & \sigma^{2}+\left(\alpha+\beta\right)\left(\sigma_{t+1}^{2}-\sigma^{2}\right) \end{eqnarray*}

If  \alpha + \beta < 1, the second term above goes to zero as n \rightarrow \infty, which implies that the longer the forecast horizon, the closer the forecast will get to unconditional variance. The smaller (\alpha + \beta) the quicker the predictability of the process subsides.
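The mean reversion of the forecast is easy to see numerically. The sketch below evaluates the n-step-ahead formula for hypothetical GARCH(1,1) parameters; the function name and all input values are illustrative.

```python
def garch_forecast(omega, alpha, beta, y_t, sigma2_t, horizon):
    """n-step-ahead GARCH(1,1) variance forecasts for n = 1..horizon."""
    if alpha + beta >= 1:
        raise ValueError("needs alpha + beta < 1 for a finite unconditional variance")
    uncond = omega / (1 - alpha - beta)
    one_step = omega + alpha * y_t ** 2 + beta * sigma2_t
    return [uncond + (alpha + beta) ** (n - 1) * (one_step - uncond)
            for n in range(1, horizon + 1)]


# Hypothetical parameters, after a large return of 3% on day t
forecasts = garch_forecast(omega=2e-6, alpha=0.08, beta=0.90,
                           y_t=0.03, sigma2_t=1.5e-4, horizon=250)
print(forecasts[0], forecasts[9], forecasts[-1])
```

With \alpha+\beta=0.98 the forecast decays slowly from the elevated one-step value towards the unconditional variance \omega/\left(1-\alpha-\beta\right); a smaller sum would make the decay faster.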

(G)ARCH in mean

The return on a risky security should be positively related to its risk. The conditional mean of a return, \mu_{t}, is dependent on some function of its conditional variance or standard deviation:

    \[Y_{ t }=\mu _{ t }+\sigma _{ t }Z_{ t }=\delta \sigma ^{ 2 }_{ t }+\sigma _{ t }Z_{ t }\]

where \delta is the parameter describing the impact volatility has on the mean.

Maximum Likelihood Estimation

The nonlinear nature of the volatility models rules out estimation by standard linear regression methods such as OLS. Estimation therefore relies on the quasi-maximum likelihood (QML) approach.

Assuming the normal distribution, the density of the returns with GARCH(1,1) at t=2 is given by:

    \[f\left(y_{2}\right)=\frac{1}{\sqrt{2\pi\left(\omega+\alpha y_{1}^{2}+\beta\hat{\sigma}_{1}^{2}\right)}}\exp\left(-\frac{y_{2}^{2}}{2\left(\omega+\alpha y_{1}^{2}+\beta\hat{\sigma}_{1}^{2}\right)}\right)\]

The joint density of y is:

    \begin{eqnarray*} \prod_{t=2}^{T}f\left(y_{t}\right)=\prod_{t=2}^{T}\frac{1}{\sqrt{2\pi\left(\omega+\alpha y_{t-1}^{2}+\beta\hat{\sigma}_{t-1}^{2}\right)}}\exp\left(-\frac{y_{t}^{2}}{2\left(\omega+\alpha y_{t-1}^{2}+\beta\hat{\sigma}_{t-1}^{2}\right)}\right) \end{eqnarray*}

The log-likelihood function is then:

    \begin{eqnarray*} \log\mathcal{L}=\underbrace{-\frac{T-1}{2}\log\left(2\pi\right)}_{\textrm{constant}}-\frac{1}{2}\sum_{t=2}^{T}\left(\log\left(\omega+\alpha y_{t-1}^{2}+\beta\hat{\sigma}_{t-1}^{2}\right)+\frac{y_{t}^{2}}{\omega+\alpha y_{t-1}^{2}+\beta\hat{\sigma}_{t-1}^{2}}\right) \end{eqnarray*}

The recursion requires a starting value \hat{\sigma}_{1}^{2}. One way is to set it to an arbitrary value, usually the sample variance of \{y_{t}\}, which is adequate for large sample sizes.

When the data contain a structural break, e.g. the 2007-08 credit crunch, the estimates can differ considerably depending on whether \hat{\sigma}_{1}^{2} is set to the unconditional variance or initialised with EWMA.
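A minimal sketch of the Gaussian GARCH(1,1) log-likelihood, written so that the starting value can be varied. Each term is the log of the normal density with variance \hat{\sigma}_{t}^{2}, summed over t=2,\ldots,T. The function name and the default initialisation are assumptions; a real estimation would pass this function to a numerical optimiser.

```python
from math import log, pi


def garch11_loglik(y, omega, alpha, beta, sigma2_1=None):
    """Gaussian log-likelihood of GARCH(1,1) for returns y_1..y_T (sum over t = 2..T)."""
    if sigma2_1 is None:
        # Default starting value: sample variance, assuming zero mean
        sigma2_1 = sum(v * v for v in y) / len(y)
    sigma2 = sigma2_1
    ll = 0.0
    for t in range(1, len(y)):
        sigma2 = omega + alpha * y[t - 1] ** 2 + beta * sigma2
        ll += -0.5 * (log(2 * pi) + log(sigma2) + y[t] ** 2 / sigma2)
    return ll


# Made-up returns; compare two starting values for the variance recursion
y = [0.01, -0.02, 0.015, -0.03, 0.005, 0.02]
print(garch11_loglik(y, 2e-6, 0.08, 0.90))
print(garch11_loglik(y, 2e-6, 0.08, 0.90, sigma2_1=1e-3))
```

In a short sample like this one the two starting values give visibly different likelihoods; the effect fades as T grows because the weight on \hat{\sigma}_{1}^{2} decays at rate \beta.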

Goodness of fit tests
  • Likelihood ratio tests and parameter significance
  • If models are nested, for example ARCH(1) against ARCH(4), one can form the LR test

        \[ \textrm{LR} = 2\left(\mathcal{L}_{U}-\mathcal{L}_{R}\right)\sim \chi^{2}_{\textrm{\# restrictions}} \]

    where the number of restrictions is 3 in our case.

    In out-of-sample forecast comparisons, it is often the case that the more parsimonious models perform better, even if a more flexible model is significantly better in sample. If the more flexible model is not significantly better in sample, it is very unlikely to do better out of sample.

  • Analysis of model residuals
  • Consider the normal ARCH(1) model. If the model is correct, the residuals are iid normal. So we can test the normality for the fitted or estimated residuals:

        \[ \hat{z}_{t}=\frac{y_{t}}{\hat{\sigma}_{t}\left(\hat{\alpha}, \hat{\beta} \right)} \sim N\left(0,1 \right) \]

    One can use Jarque–Bera test for normality and Ljung–Box test for autocorrelations.

  • Statistical goodness-of-fit measures
  • Competing models can be ranked by goodness-of-fit measures, such as mean squared error (MSE). But the conditional variance is not observable even ex post, and hence volatility proxies, s_{t}, are required. The simplest volatility proxy is the squared return.

        \[ \begin{array}{rl} \textrm{Squared error}: & \sum_{t=1}^{T}\left(\hat{s}_{t}^{2}-\hat{\sigma}_{t}^{2}\right)^{2}\\ \textrm{QLIKE}: & \sum_{t=1}^{T}\left(\log\hat{\sigma}_{t}^{2}+\frac{\hat{s}_{t}^{2}}{\hat{\sigma}_{t}^{2}}\right) \end{array} \]
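    Both loss functions are one-liners in code. In this sketch `proxy` holds the variance proxies \hat{s}_{t}^{2} (for example squared returns) and `forecast` the variance forecasts \hat{\sigma}_{t}^{2}; the names and numbers are illustrative.

```python
from math import log


def squared_error(proxy, forecast):
    """Sum of squared differences between variance proxy and forecast."""
    return sum((s - f) ** 2 for s, f in zip(proxy, forecast))


def qlike(proxy, forecast):
    """QLIKE loss: sum of log(forecast) + proxy / forecast."""
    return sum(log(f) + s / f for s, f in zip(proxy, forecast))


# Made-up variance proxies and two competing sets of forecasts
proxy = [1.2e-4, 0.4e-4, 2.5e-4]
model_a = [1.0e-4, 0.8e-4, 1.8e-4]
model_b = [0.5e-4, 0.5e-4, 0.5e-4]
for name, fc in (("A", model_a), ("B", model_b)):
    print(name, squared_error(proxy, fc), qlike(proxy, fc))
```

    For a single observation, QLIKE is minimised when the forecast equals the proxy, and it penalises under-prediction of variance more heavily than over-prediction.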

    Other GARCH-type Models

    Two types of extensions: asymmetry in the impact of positive/negative lagged returns (leverage effects); allowing power in the volatility calculation.

  • Leverage effects and asymmetry
  • Leverage effect: volatility tends to rise following bad news and fall following good news. The leverage effect is not easily detectable in stock indices and is not expected to be significant in foreign exchange. Two common specifications are EGARCH,

        \[ \log\sigma_{t}^{2}=\omega+\sum_{i=1}^{p}\alpha_{i}Z_{t-i}+\sum_{i=1}^{p}\lambda_{i}\left(\mid Z_{t-i}\mid-E\left(\mid Z_{t-i}\mid\right)\right)+\sum_{j=1}^{q}\beta_{j}\log\sigma_{t-j}^{2} \]

    and GJR-GARCH, where \varepsilon_{t} denotes the return innovation and I is an indicator for negative innovations,

        \[ \sigma_{t}^{2}=\omega+\sum_{i=1}^{p}\alpha_{i}\varepsilon_{t-i}^{2}+\sum_{i=1}^{p}\lambda_{i}I_{\{\varepsilon_{t-i}<0\}}\varepsilon_{t-i}^{2}+\sum_{j=1}^{q}\beta_{j}\sigma_{t-j}^{2} \]

  • Power GARCH models
  • APARCH: combines these two effects in the same model
  • APARCH was introduced by Ding, Granger and Engle (1993) and embeds (G)ARCH, GJR-GARCH, TS-GARCH, T-ARCH, N-ARCH and log-ARCH as special cases. It allows for leverage effects when \zeta\neq0 and power effects when \delta\neq 2.

        \[ \sigma_{t}^{\delta}=\omega+\sum_{i=1}^{p}\alpha_{i}\left(\mid Y_{t-i}\mid-\zeta_{i}Y_{t-i}\right)^{\delta}+\sum_{j=1}^{q}\beta_{j}\sigma_{t-j}^{\delta} \]

  • ARCH Model of Engle when \delta=2, \zeta_i=0, \beta_j=0
  • GARCH Model of Bollerslev when \delta=2, \zeta_i=0
  • TS-GARCH Model of Taylor and Schwert when \delta=1, \zeta_i=0
  • GJR-GARCH Model of Glosten, Jagannathan, and Runkle when \delta=2
  • T-ARCH Model of Zakoian when \delta=1
  • N-ARCH Model of Higgins and Bera when \zeta_i=0, \beta_j=0
  • Log-ARCH Model of Geweke and Pantula when \delta \rightarrow 0
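
    A short sketch of the APARCH recursion for p = q = 1 may help show how the special cases arise from the choice of \delta and \zeta. The function and argument names are illustrative assumptions; |\zeta| < 1 is assumed so that the term raised to the power \delta is non-negative.

```python
def aparch_volatility(returns, omega, alpha, zeta, beta, delta, sigma_0):
    """Volatility path from sigma_t^delta = omega
    + alpha * (|y_{t-1}| - zeta * y_{t-1})^delta + beta * sigma_{t-1}^delta.

    Returns sigma_t (not sigma_t^delta); the list has len(returns) + 1 entries.
    """
    sigma_pow = sigma_0 ** delta
    sigmas = [sigma_0]
    for y in returns:
        shock = (abs(y) - zeta * y) ** delta
        sigma_pow = omega + alpha * shock + beta * sigma_pow
        sigmas.append(sigma_pow ** (1.0 / delta))
    return sigmas


# delta = 2 and zeta = 0 gives the usual GARCH(1,1) variance recursion
garch_like = aparch_volatility([0.02, -0.01], 1e-6, 0.05, 0.0, 0.90, 2.0, 0.01)
# delta = 1 with zeta != 0 gives a T-ARCH style recursion on sigma itself
tarch_like = aparch_volatility([0.02, -0.01], 1e-4, 0.05, 0.3, 0.90, 1.0, 0.01)
```

    The parameter values are arbitrary and only demonstrate the calling pattern.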

    Stochastic volatility

    The volatility process is a function of an exogenous shock as well as past volatilities, so the process \sigma_t is itself random, with an innovation term that is not known at time t:

        \[ \begin{array}{rcl} Y_{t} & = & Z_{t}\sigma_{t}\\ Z_{t} & \sim & N\left(0,1\right)\\ \sigma_{t}^{2} & = & \exp\left(\delta_{0}+\delta_{1}\log\sigma_{t-1}^{2}+\delta_{2}\eta_{t}\right) \end{array} \]

    where the distribution of shocks is

        \[ \left(\begin{array}{c} Z_{t}\\ \eta_{t} \end{array}\right)\sim N\left(0,\left(\begin{array}{cc} 1 & \zeta\\ \zeta & 1 \end{array}\right)\right) \]

    The SV model has two innovation terms: Z_t for the return itself and  \eta_t for the conditional variance of the return.
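
    The two-shock structure is easiest to see in a simulation. The sketch below uses only the Python standard library; the function name, the argument names and the choice to start the log-variance at its unconditional mean (which requires |\delta_1| < 1) are assumptions for illustration. The argument `vol_of_vol` is the coefficient multiplying \eta_t in the variance equation.

```python
import math
import random


def simulate_sv(n, delta0, delta1, vol_of_vol, zeta, seed=0):
    """Simulate returns and variances from the log-variance SV model.

    zeta is the correlation between the return shock Z_t and the
    volatility shock eta_t.
    """
    rng = random.Random(seed)
    log_var = delta0 / (1.0 - delta1)  # unconditional mean of log-variance
    returns, variances = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        # Build eta with unit variance and correlation zeta with z
        eta = zeta * z + math.sqrt(1.0 - zeta ** 2) * e
        log_var = delta0 + delta1 * log_var + vol_of_vol * eta
        sigma = math.exp(0.5 * log_var)
        returns.append(z * sigma)
        variances.append(sigma ** 2)
    return returns, variances


sim_returns, sim_variances = simulate_sv(250, -0.2, 0.9, 0.3, -0.5, seed=42)
```

    Because the variance depends on \eta_t, it cannot be computed from past returns alone, which is what makes estimating SV models harder than estimating GARCH models.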

    Implied volatility

    Implied volatility is obtained by taking the actual transaction prices of options traded on the market and inverting the Black–Scholes equation to back out the volatility implied by the option price.

    Pros: based on current market prices rather than historical data, so it is a “forward-looking” estimator of volatility.

    Cons: relies on the accuracy of the BS model, which assumes constant conditional volatility and normal innovations; the volatility smile/smirk observed in the option market contradicts these assumptions.
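
    Backing out the volatility has no closed form, so it is done numerically. The following is a minimal sketch for a European call using bisection, which works because the Black–Scholes call price is increasing in volatility. The function names, the search bracket and the tolerance are assumptions chosen for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, maturity, vol):
    """Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def implied_vol(price, spot, strike, rate, maturity, lo=1e-6, hi=5.0, tol=1e-8):
    """Bisection search for the volatility that reproduces the observed price."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, rate, maturity, mid) > price:
            hi = mid   # model price too high: volatility guess too large
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: price an option at 20% volatility, then recover the volatility
observed = bs_call(100.0, 100.0, 0.05, 1.0, 0.20)
recovered = implied_vol(observed, 100.0, 100.0, 0.05, 1.0)
```

    Repeating this across strikes for the same maturity traces out the smile/smirk mentioned above.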

    Realized volatility

    Realized volatility measures what actually happened in the past. It is based on intraday data, sampled at regular intervals (e.g., every 10 minutes), which are used to obtain the covariance matrix.

    Pros: is purely data driven and does not rely on parametric models.

    Cons: intraday data need to be available.
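
    A minimal sketch of the calculation, assuming one day of intraday prices sampled on a regular grid: realized variance is the sum of squared intraday log returns, and the realized covariance between two assets is the sum of the products of their log returns. The function names are illustrative.

```python
import math


def log_returns(prices):
    """Log returns between consecutive price observations."""
    return [math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]


def realized_variance(prices):
    """Sum of squared intraday log returns."""
    return sum(r ** 2 for r in log_returns(prices))


def realized_covariance(prices_x, prices_y):
    """Sum of products of intraday log returns of two assets on the same grid."""
    rx, ry = log_returns(prices_x), log_returns(prices_y)
    return sum(a * b for a, b in zip(rx, ry))


def realized_volatility(prices):
    """Square root of realized variance."""
    return math.sqrt(realized_variance(prices))
```

    The diagonal of the realized covariance matrix is filled by `realized_variance` and the off-diagonal entries by `realized_covariance`.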






    You know you have been studying abroad in the UK when….

    1. One out of 4 words you hear in the streets is "fuck" or "fucking"
    2. You have tried the symbol of British food, a breaded piece of fish with fries and they call it "fish & chips".
    3. You see semi-naked girls in the streets and boys wearing t-shirts with temperatures below zero.
    4. You are shocked to see that the Uni is closed, the city has collapsed and people are stranded whenever the streets are covered with more than 5 cm of snow.
    5. You have travelled to London for just 1 pound with a fun fare, and you love it.
    6. You wake up every morning knowing that it’s quite unlikely that you’re going to see the sun.
    7. You drink pints every day and you love them
    8. You see people having a pee while they get money from a cash machine. (On Saturday nights you see people peeing in the street everywhere… =.=)
    9. You realize that dinner time is 6pm
    10. You see people drunk in the streets at 8pm.
    11. You see old people getting pissed in Potters Wheel (Wetherspoon)
    12. You are kicked out of a pub at 11.30 pm (That really is about the time I usually leave~)
    13. You have learned the difference between pasty and pastry and you’ve tried a Cornish Pasty.
    14. You see people wearing flipflops and shorts even though it’s raining.
    15. You’ve said "cheers mate" more than twice
    16. You’ve tried to buy a traditional coffee maker and you’ve failed.
    17. You realize the most important religion is not Christianity but Rugby. (Rugby is huge~!!!)
    18. You wonder how people wash their intimate parts without a "bidé"
    19. You wonder why the concept of "proper curtains" hasn’t arrived to this country yet.
    20. You hear and say "sorry" at least 10 times a day. (Very polite~ I like it~!)
    21. You’ve seen naked women on the second (and first, and third…) page of the daily newspapers. (The British tabloid The Sun!!!! Famous for the naked women on page three)
    22. After a failed conversation with someone in the street you wonder whether he/she was speaking in Scottish, Gaelic, Welsh, Cornish, Irish or English. (Haha…. such a small country, and yet so many accents)
    23. You see Tesco as an important social meeting point.
    24. You have struggled trying to convert from Fahrenheit to Celsius, from Miles to Kilometers and from Pounds to Euros, but you know a pint is 0.56 litres.
    25. You have been driving on the wrong side of the road
    26. You have seen old people smiling at you in the street (It often feels heart-warming~)
    27. You have been asked for "some spare change" by an unknown person. (People here are very kind to beggars; I often see passers-by sit down and chat with them)
    28. You see 3 kebab shops and 2 Indian restaurants in every street. (I hate kebabs…)
    29. You’ve had a Full English Breakfast with bacon, eggs, sausages, beans, etc and you think it’s amazing (it’s awesome!!!)
    30. You’ve had a burger, chips and beans on the same plate.
    31. You’ve thought more than ten times that the car you have just seen was driven by nobody
    32. You have tried to destroy the fire alarm at least a couple of times. (In my old flat the fire alarm went off every time I stir-fried something; it drove me mad!)
    33. You have wondered about the wildlife present in your carpet.
    34. You see a group of people wearing fancy dress every time you go out at night. (I love fancy dress~ they are just so creative)
    35. You have been in a pub next to a really drunk lady, who you think could even be your grandma. (I once saw a young man dancing with and kissing a drunk woman of 60+… it grossed me out)
    36. You think you’re going to visit a palace, a castle or a chapel and you only see a few old stones. (So typical of the Highlands!!!!!!!! Nothing but old stones… but still beautiful)
    37. You realize that taking a cab is almost free (according to a certain person from Norway).
    38. You’re outside and don’t even notice it’s raining anymore, because it is just simply normal to you by now.
    39. You realise that any kind of food can be eaten with anything else, no matter how weird the combination is.
    40. You have six months of holidays in a year.
    41. In case you need to get your hands clean, you realise that you only have two options: boil your hands in water at nearly 90º or watch them turn into two beautiful ice cubes. (At home it's fine~ but in restaurants the hot water scalds you and the cold water freezes you)
    42. You have a sink in your bedroom. (Very convenient~~)
    43. You can’t buy shoes in any shop because they all smell like feet!!
    44. You find machines in pubs in which you can buy condoms, vibrators, lubricant and even a Hair Straightener.
    45. Your house and surroundings are full of rubbish bags because rubbish is collected just once per week.
    46. You ask for a double whisky in a pub and the quantity you’re given is just ridiculous!!
    47. You see potatoes everywhere, in all different forms and shapes, i.e. boiled potatoes, jacket potatoes, mashed potatoes, chips, crisps, etc.
    48. You realize that burping in the library is something normal.
    49. You realize that no matter how weird the clothes you’re wearing are, people just won’t care.
    50. You have hoovered your room at least once. (I vacuum every two days~)
    51. You shake the hand of someone of the opposite sex you’ve just met.
    52. You drink as much tea with milk as you drink beer (at least 5 times a day).
    53. You realize that being served alcohol in an academic seminar is completely normal.
    54. You learn that 4 cups of tea per day is good for you.
    55. You have stopped questioning why there are carpets even in the bathrooms
    56. You know there is a fair chance your house is filled with mould.
    57. Your floors and roofs are in serious decay after years of leakages and no maintenance.
    58. You have a fire exit in your house.
    59. You find yourself breaking into an English accent when trying to order a cuppa tea.
    60. You have mushrooms in your toilets.
    61. You see daffodils growing EVERYwhere, all year round. (It really is amazing: they are there in every season, so pretty they look like fake flowers~)
    62. You find yourself discussing what make of baked beans is the best…and it doesn’t scare you
    63. You see all four seasons in one day. first sun (oh blessed sun!), then rain, then snow, then hail. and sun, and rain, then…aaaah!
    64. "hello/hey, how are you?" is replaced by "you alright?" (Typical British accent…)
    65. You find yourself going out partying wearing only a little top… and it’s raining! And above all it’s normal because everybody is dressed like that!!
    66. You realize that burping in the middle of a lecture is something normal.
    67. It’s only five and every single shop is closed! (This annoys me the most! Why on earth close so early!)
    68. You’ve bought something at Argos!! (I used Argos a lot when I first arrived too~ very convenient)
    69. You think it’s normal to sleep on a mattress which was considered old-fashioned crap in Europe 30 years ago.
    70. You don’t go out to go out but to get drunk. (Get drunk~!!!!!)
    71. You don’t mind the food anymore… (It seems the UK is already famous for having no good food~)
    72. Subway is the healthiest meal you can think of
    73. You think that having a dildo is mandatory for every woman, and that Ann Summers rocks your sexual life! (Ann Summers Rocks~!!!)
    74. You find it normal that in clubs the ladies' is full of screaming semi-naked drunk (British) girls trying to do their make up and hair again and again.
    75. You feel like a nun when you wear trousers or a skirt longer than your knees and tops to go out (Sadly, many people in Scotland still wear jeans to the pub.. nights out in England are much more exciting..)
    76. You go to the lectures just for sleeping.. lying on the table, doesn't matter!!!
    77. You discover that a simple train ticket can vary in price from £8 to £30.. for the same train, time and journey
    78. You realize that you have never seen an English Restaurant (Hahaha.. so true~!)
    79. You move into a house and realise that you can’t open the windows!!
    80. You’re in the top back part of the bus, and a 9-year-old chav asks you for a lighter (A 9-year-old chav once even hit on me… I can't stand British chavs!)
    81. You realize that British people are queuing politely everywhere except at the bar counter
    82. You discover there is a "potato" function on the microwave!!!
    83. You phone a Hospital emergency service at night and you are speaking to a non-medical person on duty who will ask you a lot of questions and then decide if it's an emergency. This person will even ask to speak to the almost unconscious patient and ask you to describe whether the person looks pale, the eyes are yellow, blue, red, any bleeding… blah blah and then tell you that a doctor will only be available at 9.00 in the morning… (after an hour of questioning) and you are worried that the patient might die in the meantime but you have no other options (Yes, the British healthcare system really is a hassle!!!)
    84. Your umbrellas have been broken at least twice and you are still hoping not to break the new one even if it’s May! (The wind is too strong~ if you really don't want to get rained on, just buy a raincoat~ umbrellas do get blown apart very quickly~ =.=)
    85. You see your housemate ordering Chinese food or pizzas three times a week (Haha.. totally)
    86. You realize that you can get decent (dark, rye, healthy) bread in every European country except for the UK…and no, Toast is not considered a proper kind of bread…..
    87. You are no longer surprised to see fans and radiators on at the same time (either in February or June!)
    88. You are certainly annoyed by their stupid sockets
    89. You realize that every product you buy "may contain trace of nuts"
    90. Your sentences begin with.."to be honest".. (to be honest~ LOOOOOL)
    91. You are addressed as "treacle, sugarplum, darling, sweetheart, love, …." (and all other versions of nicknames in that genre you normally only call your wife/lover) by the staff in supermarkets, pubs and restaurants. (Older people in England love saying 'love'~~~ you alright, love? =.=)
    92. You are affected by CCTV paranoia. (CCTV is every fxxking where!)
    93. You can see, on a Saturday night, Dancing on Ice, Strictly Come Dancing, Pop Idol, X Factor, Big Brother, Celebrity Big Brother, I’m a Celebrity Get Me Out of Here (and so on) simultaneously!
    94. You are not surprised to see an old lady, her daughter and her granddaughter dancing together in a club.
    95. You talk about the weather all the time. (I have come to love talking about the weather too~ it’s been sunny three days in a row~!!!!!!! So unusual~ haha)
    96. You hear "WHA" instead of W-H-A-T ! and "THA" instead of T-H-A-T!!! (it annoys me so much!!!! )
    97. You have asked to borrow ten "quid" instead of ten pounds from someone (quid=pound)
    98. It is 23.45 and the bell rings in the pub. Last orders mate, let's have 2 pints each…
    99. You have to pull a string to switch on the light or get the water from the shower!!
    100. You realize "taking the piss out" of someone is not a medical procedure
    101. You realize everybody just goes crazy in a club when the DJ plays Mr. Brightside (The Killers), Place Your Hands (Reef), Don't Stop Me Now (Queen)!! LOL or the Baywatch theme…
    102. You have to mind the gap between the train and the platform.
    103. Every door is a "fire door" that you have to "keep shut".
    104. You start celebrating Christmas Time right after Halloween. (I can relate~ from October on there are more and more parties~ and the revelry carries on right up to New Year~ lol)









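    It was a sad dream. In the dream, Mum helped me move to London. After the move I went out to run some errands, and Mum stayed at home to tidy things up. In the evening, once I had finished, I headed back in high spirits, thinking that I had finally moved into a new home with Mum by my side, and that we would have a good dinner that night to celebrate. I opened the door: "Mum~ Mum~", but nobody answered. There was no one in the living room and no one in the kitchen. I looked everywhere, and then saw a letter on the desk, left by Mum: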



    How to add a Flash mp3 player on your blog?

    First of all, you need to upload your mp3 to your server or SkyDrive. Once that is done, get the URL, e.g.!688?filename=kite.mp3

    Put the code below where you want your mp3 player to appear and REPLACE the mp3 URL with the correct one. It works fine wherever you host your blog: Blogger, WordPress, TypePad or your own server. (P.S. replace ( ) with < >.)

    (object id="audioplayer1" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="290" height="24"; codebase=",0,40,0")(param name="data" value="" /)(param name="src" value="" /)(param name="flashvars" value="playerID=1&soundFile=!688?filename=kite.mp3" /)(embed id="audioplayer1" type="application/x-shockwave-flash" width="290" height="24"; src="" flashvars="playerID=1&soundFile=!688?filename=kite.mp3" data="")(/embed)(/object)

    You can also change the player’s size by modifying width="290" height="24".

    It is more convenient if you host your blog on WordPress. Add the code below to the functions.php of your WordPress theme.

    Upload mp3player.swf to your theme folder. After this, every time you want to add an mp3, you just need to insert (mp3)MP3 URL(/mp3). Again, replace ( ) with [ ].

    For example, (mp3)!688?filename=kite.mp3(/mp3) will be converted to: [mp3]!688?filename=kite.mp3[/mp3]

    For auto play, use (mp3 auto="1")MP3 URL(/mp3)