A key modeling difficulty is that market volatility is not directly observable: unlike market prices, it is a *latent variable*. Volatility must therefore be *inferred* from how much market prices move.

We usually assume that the mean return is zero. While this is not strictly correct, the daily mean is orders of magnitude smaller than the daily volatility and can therefore **usually be safely ignored** for the purpose of volatility forecasting.

### Moving average (MA) model

The most obvious and easiest way to forecast volatility is simply to calculate the sample standard deviation from a sample of returns. Over time, we keep **the sample size constant**, and every day **add the newest return to the sample and drop the oldest**. This method is called the moving average (MA) model.

One key shortcoming of MA models is that observations are **equally weighted**. In practice, this method should not be used. It is very sensitive to the choice of estimation window length.
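The rolling-window idea can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `ma_volatility`, the `window` argument and the toy return series are all assumptions made for the example, and the mean return is taken to be zero as discussed above.

```python
import math

def ma_volatility(returns, window):
    """Equally weighted rolling volatility, assuming a zero mean return."""
    if window < 1 or window > len(returns):
        raise ValueError("window must be between 1 and len(returns)")
    forecasts = []
    for t in range(window, len(returns) + 1):
        sample = returns[t - window:t]  # the `window` most recent returns
        variance = sum(r * r for r in sample) / window
        forecasts.append(math.sqrt(variance))
    return forecasts

# Hypothetical daily returns, for illustration only.
returns = [0.01, -0.02, 0.015, -0.005, 0.03, -0.01]
short_window = ma_volatility(returns, 2)
long_window = ma_volatility(returns, 5)
print(short_window[-1], long_window[-1])
```

The two windows give noticeably different latest forecasts from the same data, which is the window-length sensitivity mentioned above.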

### Exponentially weighted moving average (EWMA) model a.k.a. RiskMetrics

The moving average model can be improved by assigning greater weights to more recent observations.

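The EWMA variance forecast follows the recursion

$$\hat{\sigma}_t^2 = (1-\lambda)\, y_{t-1}^2 + \lambda\, \hat{\sigma}_{t-1}^2,$$

where $y_{t-1}$ is the previous return and $0 < \lambda < 1$ is the decay factor. RiskMetrics is a branded EWMA that sets $\lambda = 0.94$ for daily returns.

EWMA can be thought of as a special case of GARCH(1,1), $\sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2$, with $\omega = 0$, $\alpha = 1-\lambda$ and $\beta = \lambda$.

**Cons**: $\lambda$ is **constant** and **identical** for all assets.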

**Pros**: 1) it can be implemented much more easily than most alternatives; 2) multivariate forms can be applied in a straightforward fashion. Coupled with the fact that it often gives reasonable forecasts, EWMA is often the method of choice.
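To show how little code EWMA needs, here is a minimal sketch of the usual recursion, variance today = (1 - lambda) * squared previous return + lambda * previous variance. The function name, the default decay factor of 0.94 (the value commonly quoted for RiskMetrics daily data) and the choice of the sample variance as the starting value are assumptions of this example.

```python
def ewma_variance(returns, lam=0.94, initial_variance=None):
    """EWMA variance path.

    Element 0 is the starting value; element t+1 is the forecast made
    after observing returns[t].
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    if initial_variance is None:
        # Assumption: start from the zero-mean sample variance.
        initial_variance = sum(r * r for r in returns) / len(returns)
    variances = [initial_variance]
    for r in returns:
        variances.append((1.0 - lam) * r * r + lam * variances[-1])
    return variances

# With no new shocks the variance simply decays by a factor lam each day.
path = ewma_variance([0.0, 0.0], lam=0.94, initial_variance=1.0)
print(path)
```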

### GARCH model and its extension models

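Most volatility models are based on returns that have been *de-meaned* (i.e., the unconditional mean has been subtracted from the returns). For random variables (RVs) $y_t$, de-meaned means $\mathrm{E}(y_t) = 0$.

The innovation in returns is driven by random shocks $z_t$, where $z_t \sim \mathcal{D}(0, 1)$, i.e., the shocks have mean zero and variance one.

The return can then be written as:

$$y_t = \sigma_t z_t$$

where $\sigma_t$ is the conditional volatility.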

##### ARCH

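$$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i y_{t-i}^2$$

where $y_t$ is the de-meaned return, $\sigma_t^2$ is its conditional variance, and $p$ is the number of lags.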

One of the biggest problems with the ARCH model concerns the long lag lengths required to capture the impact of historical returns on current volatility.

##### GARCH

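$$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i y_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2$$

where $p$ and $q$ are the orders of the ARCH and GARCH terms respectively; $\omega > 0$ and $\alpha_i, \beta_j \geq 0$ keep the variance positive. The unconditional variance of GARCH(1,1) is given by

$$\sigma^2 = \frac{\omega}{1 - \alpha - \beta}$$

The unconditional variance is infinite when $\alpha + \beta = 1$ and undefined when $\alpha + \beta > 1$.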

##### Multiperiod volatility

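To obtain the volatility $n$ days ahead, start from the one-step-ahead GARCH(1,1) variance, which is known at time $t$:

$$\sigma_{t+1}^2 = \omega + \alpha y_t^2 + \beta \sigma_t^2$$

We can now derive the two-step-ahead variance, using $\mathrm{E}_t(y_{t+1}^2) = \sigma_{t+1}^2$ and $\omega = \sigma^2 (1 - \alpha - \beta)$, where $\sigma^2$ is the unconditional variance:

$$\sigma_{t+2|t}^2 = \omega + (\alpha + \beta)\, \sigma_{t+1}^2 = \sigma^2 + (\alpha + \beta)\left(\sigma_{t+1}^2 - \sigma^2\right)$$

Repeating the argument gives, for $n \geq 1$,

$$\sigma_{t+n|t}^2 = \sigma^2 + (\alpha + \beta)^{n-1}\left(\sigma_{t+1}^2 - \sigma^2\right)$$

If $\alpha + \beta < 1$, the second term above goes to zero as $n \to \infty$, which implies that the longer the forecast horizon, the closer the forecast gets to the unconditional variance. The smaller $\alpha + \beta$, the quicker the predictability of the process subsides.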
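The mean reversion of multi-period forecasts is easy to see numerically. The sketch below assumes the standard GARCH(1,1) result that the n-step-ahead variance equals the unconditional variance plus (alpha + beta)^(n-1) times the gap between the one-step-ahead variance and the unconditional variance. The function name and the parameter values are hypothetical, chosen only for illustration.

```python
def garch11_forecast(omega, alpha, beta, next_variance, horizon):
    """n-step-ahead GARCH(1,1) variance forecast (horizon >= 1)."""
    persistence = alpha + beta
    if persistence >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    unconditional = omega / (1.0 - persistence)
    gap = next_variance - unconditional
    return unconditional + persistence ** (horizon - 1) * gap

# Hypothetical parameters: the unconditional variance is 2e-6 / 0.05 = 4e-5.
omega, alpha, beta = 2e-6, 0.10, 0.85
for n in (1, 2, 10, 250):
    print(n, garch11_forecast(omega, alpha, beta, 1e-4, n))
```

Starting from a one-step-ahead variance above the long-run level, the forecasts fall toward the unconditional variance as the horizon grows.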

##### (G)ARCH in mean

The return on a risky security should be positively related to its risk. The conditional mean of a return, $\mu_t$, depends on some function of its conditional variance or standard deviation, for example:

$$y_t = \mu_t + \sigma_t z_t, \qquad \mu_t = \delta \sigma_t^2$$

where $\delta$ is the parameter describing the impact volatility has on the mean.

##### Maximum Likelihood Estimation

The nonlinear nature of the volatility models rules out estimation by standard linear regression methods such as OLS. Estimation is therefore carried out with the quasi-maximum likelihood (QML) approach.

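Assuming the normal distribution, the density of the returns with GARCH(1,1) at $t = 2$ is given by:

$$f(y_2 \mid y_1) = \frac{1}{\sqrt{2\pi \sigma_2^2}} \exp\left(-\frac{y_2^2}{2\sigma_2^2}\right), \qquad \sigma_2^2 = \omega + \alpha y_1^2 + \beta \sigma_1^2$$

The joint density of $y_2, \dots, y_T$ is:

$$f(y_2, \dots, y_T \mid y_1) = \prod_{t=2}^{T} \frac{1}{\sqrt{2\pi \sigma_t^2}} \exp\left(-\frac{y_t^2}{2\sigma_t^2}\right), \qquad \sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2$$

The log-likelihood function is then:

$$\log \mathcal{L} = -\frac{T-1}{2} \log(2\pi) - \frac{1}{2} \sum_{t=2}^{T} \left(\log \sigma_t^2 + \frac{y_t^2}{\sigma_t^2}\right)$$

The recursion needs a starting value $\sigma_1^2$. One way is to set $\sigma_1^2$ to an arbitrary value, usually the sample variance of $\{y_t\}$, which is reasonable for large sample sizes.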

When the data contain a structural break, e.g. the 2007-08 credit crunch, the parameter estimates can differ substantially depending on whether the initial variance $\sigma_1^2$ is set to the unconditional variance or initialized with EWMA.

##### Goodness of fit tests

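If models are nested, for example ARCH(1) against ARCH(4), one can form the likelihood ratio (LR) test

$$LR = 2\left(\mathcal{L}_U - \mathcal{L}_R\right) \sim \chi^2_{(k)}$$

where $\mathcal{L}_U$ and $\mathcal{L}_R$ are the maximized log-likelihoods of the unrestricted and restricted models, and $k$ is the number of restrictions, which is 3 in our case.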
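As a small worked example, the LR statistic is twice the gap between the unrestricted and restricted maximized log-likelihoods and is compared with a chi-squared critical value. The sketch below hard-codes the standard 5% critical values for 1 to 4 degrees of freedom; the function name and the two log-likelihood values are hypothetical.

```python
# Standard 5% critical values of the chi-squared distribution.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def lr_test(loglik_unrestricted, loglik_restricted, restrictions):
    """Return the LR statistic and whether the restricted model is rejected at 5%."""
    statistic = 2.0 * (loglik_unrestricted - loglik_restricted)
    return statistic, statistic > CHI2_CRITICAL_5PCT[restrictions]

# Hypothetical log-likelihoods for ARCH(4) (unrestricted) and ARCH(1) (restricted).
statistic, reject = lr_test(-1402.3, -1410.1, restrictions=3)
print(statistic, reject)
```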

In out-of-sample forecast comparisons, it is often the case that the more parsimonious models perform better, even if a more flexible model is significantly better in sample. If the more flexible model is not significantly better in sample, it is very unlikely to do better out of sample.

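Consider the normal ARCH(1) model. If the model is correct, the residuals are *iid* normal. So we can test the fitted (estimated) residuals for normality:

$$\hat{z}_t = \frac{y_t}{\hat{\sigma}_t} \sim \mathcal{N}(0, 1), \qquad \hat{\sigma}_t^2 = \hat{\omega} + \hat{\alpha} y_{t-1}^2$$

One can use the Jarque–Bera test for normality and the Ljung–Box test for autocorrelation.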
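For illustration, the Jarque–Bera statistic can be computed directly from the sample skewness S and kurtosis K of the standardized residuals as n/6 * (S^2 + (K - 3)^2 / 4), which is asymptotically chi-squared with 2 degrees of freedom under normality (5% critical value 5.991). The function name and the toy residuals below are assumptions of this sketch.

```python
def jarque_bera(residuals):
    """Jarque-Bera statistic based on sample skewness and kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    m2 = sum(c ** 2 for c in centered) / n
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

# Toy residuals: symmetric (skewness 0) but with kurtosis 1 instead of 3.
toy_residuals = [-1.0, 1.0] * 3
jb = jarque_bera(toy_residuals)
print(jb)
```

Real samples need far more than six observations for the asymptotic distribution to be a useful guide.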

Competing models can be ranked by goodness-of-fit measures, such as mean squared error (MSE). But the conditional variance is not observable even ex post, and hence *volatility proxies* are required. The simplest volatility proxy is the squared return.

##### Other GARCH-type Models

Two types of extensions: asymmetry in the impact of positive/negative lagged returns (leverage effects); allowing power in the volatility calculation.

Leverage effect: volatility tends to rise following bad news and fall following good news. The leverage effect is not easily detectable in stock indices and is not expected to be significant in foreign exchange.

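**EGARCH**: models the log of the variance, so positivity holds without parameter constraints. A common EGARCH(1,1) form is

$$\log \sigma_t^2 = \omega + \alpha\left(|z_{t-1}| - \mathrm{E}|z_{t-1}|\right) + \gamma z_{t-1} + \beta \log \sigma_{t-1}^2$$

where $z_{t-1} = y_{t-1}/\sigma_{t-1}$ and $\gamma$ captures the asymmetry.

**GJR-GARCH**: adds an extra term that is active only after negative returns. The (1,1) form is

$$\sigma_t^2 = \omega + \alpha y_{t-1}^2 + \gamma I_{\{y_{t-1} < 0\}}\, y_{t-1}^2 + \beta \sigma_{t-1}^2$$

where $I_{\{\cdot\}}$ is the indicator function.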

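**APARCH**: combines these two effects in the same model

$$\sigma_t^{\delta} = \omega + \sum_{i=1}^{p} \alpha_i \left(|y_{t-i}| - \zeta_i y_{t-i}\right)^{\delta} + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^{\delta}$$

APARCH was introduced by Ding, Granger and Engle (1993) and nests (G)ARCH, GJR-GARCH, TS-GARCH, T-ARCH, N-ARCH and log-ARCH as special cases. It allows for leverage effects when $\zeta_i \neq 0$ and power effects when $\delta \neq 2$.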
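The asymmetry is easiest to see in a single variance update. The sketch below assumes the common GJR-GARCH(1,1) form, in which the coefficient on the squared lagged return rises from alpha to alpha + gamma when that return is negative. The function name and all parameter values are hypothetical.

```python
def gjr_garch_update(omega, alpha, gamma, beta, last_return, last_variance):
    """One-step GJR-GARCH(1,1) variance update."""
    # The extra gamma term switches on only after a negative return.
    leverage = gamma if last_return < 0 else 0.0
    return omega + (alpha + leverage) * last_return ** 2 + beta * last_variance

# Same size of shock, opposite signs (illustrative parameters).
params = dict(omega=1e-6, alpha=0.05, gamma=0.10, beta=0.85, last_variance=1e-4)
after_good_news = gjr_garch_update(last_return=0.02, **params)
after_bad_news = gjr_garch_update(last_return=-0.02, **params)
print(after_good_news, after_bad_news)
```

With gamma > 0, a negative return raises next-period variance by more than a positive return of the same size.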

### Stochastic volatility

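The volatility process is a function of an exogenous shock as well as past volatilities, so the process $\sigma_t$ is itself random, with an innovation term that is not known at time $t$:

$$y_t = \sigma_t z_t, \qquad \sigma_t^2 = \exp\left(\delta_0 + \delta_1 \log \sigma_{t-1}^2 + \delta_2 \eta_t\right)$$

where the distribution of shocks is

$$\begin{pmatrix} z_t \\ \eta_t \end{pmatrix} \sim \mathcal{N}\left(0, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right)$$

The SV model has two innovation terms: $z_t$ for the return itself and $\eta_t$ for the conditional variance of the return.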
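A short simulation illustrates the two independent shocks. The sketch assumes the log-variance follows a first-order autoregression, log(variance) = delta0 + delta1 * log(previous variance) + delta2 * eta, with independent standard normal shocks for the variance and the return. The function name, parameter values and seed are assumptions of the example.

```python
import math
import random

def simulate_sv(n, delta0, delta1, delta2, initial_variance, seed=0):
    """Simulate returns and variances from a simple SV model."""
    rng = random.Random(seed)
    variance = initial_variance
    returns, variances = [], []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)  # shock to the variance process
        z = rng.gauss(0.0, 1.0)    # shock to the return
        variance = math.exp(delta0 + delta1 * math.log(variance) + delta2 * eta)
        variances.append(variance)
        returns.append(math.sqrt(variance) * z)
    return returns, variances

# Illustrative parameters: the long-run mean of log-variance is -0.5 / 0.05 = -10.
sim_returns, sim_variances = simulate_sv(
    1000, delta0=-0.5, delta1=0.95, delta2=0.2, initial_variance=math.exp(-10.0))
print(min(sim_variances), max(sim_variances))
```

Unlike in GARCH, the variance here cannot be computed from past returns alone, because the variance shock is not observed.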

### Implied volatility

Implied volatility is obtained by taking the actual transaction prices of options traded on the market and using the Black–Scholes equation to back out the volatility implied by the option price.

**Pros**: based on current market prices rather than historical data: “forward-looking” estimators of volatility.

**Cons**: relies on the accuracy of the Black–Scholes (BS) model, which assumes constant conditional volatility and normal innovations; the volatility smile/smirk observed in option markets contradicts these assumptions.
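Backing out the volatility amounts to a one-dimensional root search. The sketch below assumes a European call on a non-dividend-paying asset and uses bisection, which works because the Black–Scholes call price increases with volatility. The function names, search bounds, tolerance and the example contract are all hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_price(spot, strike, rate, maturity, vol):
    """Black-Scholes price of a European call without dividends."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * math.sqrt(maturity))
    d2 = d1 - vol * math.sqrt(maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def implied_vol(price, spot, strike, rate, maturity, low=1e-6, high=5.0, tol=1e-8):
    """Bisection search for the volatility that reproduces `price`."""
    if not (bs_call_price(spot, strike, rate, maturity, low) <= price
            <= bs_call_price(spot, strike, rate, maturity, high)):
        raise ValueError("price is outside the range attainable with the given bounds")
    while high - low > tol:
        mid = 0.5 * (low + high)
        if bs_call_price(spot, strike, rate, maturity, mid) < price:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

# Round trip: price an option at 25% volatility, then recover that volatility.
market_price = bs_call_price(100.0, 100.0, 0.01, 0.5, 0.25)
recovered = implied_vol(market_price, 100.0, 100.0, 0.01, 0.5)
print(recovered)
```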

### Realized volatility

Realized volatility measures what actually happened in the past. It is based on intraday data, sampled at regular intervals (e.g., every 10 minutes), which are used to obtain the covariance matrix.

**Pros**: is purely data driven and does not rely on parametric models.

**Cons**: intraday data need to be available.
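For a single asset, one common estimator takes the realized variance of a day to be the sum of squared intraday log returns. The sketch below covers only this univariate case, not the full covariance matrix; the function name and the price series are made up for illustration.

```python
import math

def realized_volatility(intraday_prices):
    """Square root of the sum of squared intraday log returns."""
    log_returns = [math.log(later / earlier)
                   for earlier, later in zip(intraday_prices, intraday_prices[1:])]
    return math.sqrt(sum(r * r for r in log_returns))

# Hypothetical prices sampled at regular intervals within one day.
prices = [100.0, 100.4, 99.8, 100.1, 100.9, 100.6]
print(realized_volatility(prices))
```

No model parameters are estimated here, which is the sense in which the approach is purely data driven.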