## Constant Volatility over Small Intervals

Stock prices are often modeled by Geometric Brownian Motion. Each stock is assumed to have a volatility parameter that is roughly stable over time frames on the order of, say, a year. In practice, stock prices tend to change much more rapidly at the beginning and end of each trading day than they do in the middle. To analyze intra-day volatilities, we need to use a more general diffusion model that allows the volatility to depend on $t$. I will refer to a stock’s daily volatility pattern as its “volatility profile.”

In theory, one could compute volatility estimates over arbitrarily small time intervals. However, the more you zoom in, the less “GBM-like” stock prices are. For our analysis, we broke each trading day ($T=6.5$ hours) into seventy-eight five-minute ($l=5$ minutes) intervals. On day $i$, we represent the stock price in terms of a standard Brownian Motion $W_i$ by

\begin{align*} U_i (t) = \exp \left(b_{i,0} + t \mu + \sigma(t) W_i(t) \right) \end{align*}

We define $n=T/l=78$ random variables, one for each interval:

\begin{align*} X_{i,j} &:= \log \left( \frac{U_i(j l)}{U_i((j-1)l)} \right)\\ &= \log U_i(j l) - \log U_i((j-1)l)\\ &= (b_{i,0} + j l \mu + \sigma(j l) W_i(j l)) - (b_{i,0} + (j-1) l \mu + \sigma((j-1)l) W_i((j-1)l))\\ &\approx l \mu + \sigma(j l - l/2) \left(W_i(j l) - W_i((j-1)l)\right) \qquad \text{by assumption explained below} \end{align*}

As a first attempt, we are assuming $\sigma(j l) \approx \sigma((j-1)l) \approx \sigma(j l - l/2)$, that is, volatility is nearly constant over small intervals. For a given stock, each $X_{i,j}$ is normal with mean $l \mu$ and variance approximately $l$ times $\sigma^2(j l - l/2)$ (henceforth abbreviated to $\sigma^2$), and they are independent of each other.
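To make the definition of $X_{i,j}$ concrete, here is a minimal Python sketch of how the interval log returns might be computed from sampled prices. The function name `interval_log_returns` and the array layout (one row per day, one column per five-minute boundary) are assumptions made for illustration, not something taken from the original analysis.

```python
import numpy as np

def interval_log_returns(prices):
    """Return X[i, j] = log(U_i(j*l) / U_i((j-1)*l)).

    prices: array of shape (m, n + 1); row i holds day i's prices
    sampled at times 0, l, 2l, ..., n*l (hypothetical layout).
    """
    log_prices = np.log(np.asarray(prices, dtype=float))
    # Differencing the log prices along the time axis gives the log returns.
    return np.diff(log_prices, axis=1)

# Tiny made-up example: two "days" with three price samples each.
example_prices = np.array([[100.0, 110.0, 121.0],
                           [50.0, 50.0, 25.0]])
x = interval_log_returns(example_prices)
print(x.shape)  # one column per interval
```

With real data, each row would hold 79 prices, giving the 78 interval returns per day described above.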

Assume we have data for $m$ trading days. The random variable

\begin{align*} Y_j := \sum_{i=1}^m(X_{i,j} - \bar{X}_j)^2/(l \sigma^2) \end{align*}

has an approximately $\chi^2_{m-1}$ distribution, so the standard unbiased estimator of $\sigma^2$ is $S^2/l$, where $S^2 := \sum_{i=1}^m (X_{i,j} - \bar{X}_j)^2/(m-1)$ is the sample variance for interval $j$. Because the estimator is unbiased, its mean-squared error equals its variance:

$\newcommand{\V}{\text{Var}}$
\begin{align*} \V \frac{\sum_{i=1}^m (X_{i,j} - \bar{X}_j)^2}{l(m-1)} &\approx \frac{\sigma^4}{(m-1)^2} \V \frac{\sum_{i=1}^m (X_{i,j} - \bar{X}_j)^2}{l \sigma^2}\\ &= \frac{\sigma^4}{(m-1)^2} \V Y_j\\ &= \frac{\sigma^4}{(m-1)^2} 2(m-1) \qquad \text{because $\chi^2_r$ has variance $2r$}\\ &= \frac{2 \sigma^4}{m-1} \end{align*}
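The variance formula can be sanity-checked by simulation. The sketch below is only an illustration under assumed parameter values (the numbers for `sigma`, `m`, and `reps` are made up, and time is measured in trading days so that $l = 5/390$). It draws many samples of $m$ normal log returns for a single interval and compares the empirical mean and variance of $S^2/l$ with $\sigma^2$ and $2\sigma^4/(m-1)$.

```python
import numpy as np

def estimator_variance(sigma2, m):
    """Theoretical variance 2*sigma^4/(m-1) of S^2/l under normality."""
    return 2.0 * sigma2**2 / (m - 1)

rng = np.random.default_rng(0)

# Hypothetical values chosen only for illustration.
l = 5.0 / 390.0      # five minutes, in units of one 6.5-hour trading day
mu = 0.0
sigma = 0.02
m = 250              # number of trading days in each sample
reps = 20000         # number of independent samples

# Each row is one sample of m log returns for a fixed interval j.
x = rng.normal(mu * l, sigma * np.sqrt(l), size=(reps, m))

# S^2 / l for every sample (ddof=1 gives the unbiased sample variance).
estimates = x.var(axis=1, ddof=1) / l

print("mean of estimates:", estimates.mean(), "target:", sigma**2)
print("variance of estimates:", estimates.var(),
      "theory:", estimator_variance(sigma**2, m))
```

The two printed pairs should agree closely, which is what the $\chi^2$ argument predicts when the returns really are normal.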

Refer to earlier posts for an analysis of this estimator, along with a comparison to other estimators. Yet another post describes how to estimate volatility rather than squared volatility.

Unfortunately, after glancing at some plots, the assumption of nearly constant volatility over five-minute intervals seems entirely untenable. Often, volatility appears to change by a large proportion over the course of five minutes. Typical plots showing this can be found in another article.

## Linearly Changing Volatility over Small Intervals

After realizing this problem, I devised a more plausible assumption: volatility changes approximately linearly over any five-minute interval. Let $V(t)$ be a process whose natural logarithm is a diffusion with linearly changing $\sigma(t)$. In this section, $m$ denotes the slope of $\sigma(t)$, not the number of trading days. That is,

\begin{align*} \log V(t) &= v_0 + \mu t + \int_0^t \sigma(\tau) d W(\tau)\\ &= v_0 + \mu t + \int_0^t (\sigma_0 + m\tau) d W(\tau) \end{align*}

Let us zero in on the complicated part of this expression by defining $R(t) := \int_0^t (\sigma_0 + m\tau) d W(\tau)$. At any given time $t$, the diffusion $R(t)$ is locally approximated by a Brownian Motion with volatility $\sigma_0 + mt$. In other words, for small $\delta$, the increment $R(t+\delta) - R(t)$ is approximately $N(0, (\sigma_0 + mt)^2 \delta)$, with the approximation becoming exact as $\delta \rightarrow 0$.

By transforming the time axis of a standard Brownian Motion in just the right way, we can produce a much more familiar-looking process with this same behavior. We need to find a transformation $D$ such that $W(D(t))$ has a “volatility” of $\sigma_0 + mt$ at any time $t$. A change in time of $\delta$ from time $t$ must produce a change in $D$ of $(\sigma_0 + mt)^2 \delta$. That is, $D$ satisfies $D(t+\delta) = D(t) + (\sigma_0 + mt)^2 \delta$ in the limit. Rearranging and letting $\delta \rightarrow 0$ gives $D'(t) = (\sigma_0 + mt)^2$; integrating with $D(0) = 0$ yields $D(t) = \sigma_0^2 t + \sigma_0 m t^2 + m^2 t^3/3$. So our final result is

\begin{align*} W(\sigma_0^2 t + \sigma_0 m t^2 + m^2 t^3/3) \end{align*}

This diffusion has the same infinitesimal behavior as $R(t)$ at all times, so the two processes have the same distribution. The following simulations support this result. We will plot a diffusion on $(0,1)$ with $\sigma(t) = 1-t$ (i.e. linear with $\sigma_0=1$ and $m=-1$). First, we generate the desired diffusion sequentially.
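As a rough illustration of the sequential approach, here is a minimal Python sketch. It is not the code originally used for the plots; the function name `simulate_sequential`, the Euler step with volatility evaluated at the left endpoint, and the step count are all assumptions. The slope is called `slope` in the code to avoid clashing with the $m$ used for the number of trading days.

```python
import numpy as np

def simulate_sequential(sigma0, slope, t_end, n_steps, rng):
    """Euler scheme for R(t) = int_0^t (sigma0 + slope*tau) dW(tau).

    Each step adds a normal increment whose standard deviation is the
    volatility at the start of the step times sqrt(dt).
    """
    dt = t_end / n_steps
    t = np.linspace(0.0, t_end, n_steps + 1)
    sigma_left = sigma0 + slope * t[:-1]          # volatility at step start
    increments = sigma_left * np.sqrt(dt) * rng.standard_normal(n_steps)
    r = np.concatenate(([0.0], np.cumsum(increments)))
    return t, r

rng = np.random.default_rng(1)
# sigma(t) = 1 - t on (0, 1), i.e. sigma0 = 1 and slope = -1.
t, r = simulate_sequential(sigma0=1.0, slope=-1.0, t_end=1.0,
                           n_steps=1000, rng=rng)
```

Plotting `r` against `t` with any plotting library should show a path that wiggles strongly near $t=0$ and flattens out as $t$ approaches 1.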

Next, we generate the same diffusion using standard Brownian Motion and the $D$ transformation discovered earlier. Note that the GenerateBM function invoked below can be found in an earlier post.
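The `GenerateBM` function is not reproduced here, so the following Python sketch uses its own hypothetical helpers (`time_change` and `simulate_time_changed`) instead. It samples a standard Brownian Motion at the transformed times $D(t)$ by drawing independent normal increments with variance $D(t_{k+1}) - D(t_k)$.

```python
import numpy as np

def time_change(t, sigma0, slope):
    """D(t) = sigma0^2 t + sigma0*slope t^2 + slope^2 t^3 / 3."""
    return sigma0**2 * t + sigma0 * slope * t**2 + slope**2 * t**3 / 3.0

def simulate_time_changed(sigma0, slope, t_end, n_steps, rng):
    """Sample W(D(t)) on an even grid of t values."""
    t = np.linspace(0.0, t_end, n_steps + 1)
    d = time_change(t, sigma0, slope)
    # D is non-decreasing; the clip only guards against rounding error.
    var_steps = np.clip(np.diff(d), 0.0, None)
    increments = np.sqrt(var_steps) * rng.standard_normal(n_steps)
    w = np.concatenate(([0.0], np.cumsum(increments)))
    return t, w

rng = np.random.default_rng(1)
# Same setting as before: sigma(t) = 1 - t on (0, 1).
t, w = simulate_time_changed(sigma0=1.0, slope=-1.0, t_end=1.0,
                             n_steps=1000, rng=rng)
```

Individual paths from the two methods will not coincide, since they use different random draws. What should match is their statistical behavior, for example the variance of the endpoint, which here is $D(1) = 1/3$.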

The resemblance between the two plots gives us some reassurance that our results are correct.

Now consider the distribution of $\log \left( V(l)/V(0) \right)$.

\begin{align*} \log \left( \frac{V(l)}{V(0)} \right) &= \log V(l) - \log V(0)\\ &= [v_0 + \mu l + W(\sigma_0^2 l + \sigma_0 m l^2 + m^2 l^3/3)] - [v_0]\\ &= \mu l + W(l [(\sigma_0 + ml/2)^2 + (m l)^2/12])\\ &\approx \mu l + W(l (\sigma_0 + ml/2)^2) \qquad \qquad \text{if $ml$ is small} \end{align*}

It is normally distributed with variance approximately $l (\sigma_0 + ml/2)^2$, provided that $ml$, the change in $\sigma$ over the interval, is small relative to the midpoint volatility $\sigma_0 + ml/2$, so that the $(ml)^2/12$ term is negligible. Plots indicate that this assumption holds up quite well overall, though not perfectly; we could always use a smaller interval if necessary to make the assumption more plausible. Estimating the variance from a sample of random variables with this distribution therefore allows us to estimate $(\sigma_0 + ml/2)^2$, the squared volatility at the midpoint of the interval $(0,l)$. Likewise, it can be shown that the same process provides an estimate for the squared volatilities at the midpoints of the $n$ consecutive intervals of length $l$ from $(0,T)$.
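To get a feel for how small the neglected term is, here is a short Python sketch comparing the exact variance $D(l)$ with the midpoint approximation. The parameter values are hypothetical: time is rescaled so that $l=1$, and volatility falls by 20% of its starting value over the interval.

```python
def exact_variance(sigma0, slope, l):
    """D(l) = sigma0^2 l + sigma0*slope l^2 + slope^2 l^3 / 3."""
    return sigma0**2 * l + sigma0 * slope * l**2 + slope**2 * l**3 / 3.0

def midpoint_variance(sigma0, slope, l):
    """l times the squared volatility at the interval midpoint."""
    return l * (sigma0 + slope * l / 2.0) ** 2

# Hypothetical values for illustration only.
sigma0, slope, l = 1.0, -0.2, 1.0

exact = exact_variance(sigma0, slope, l)
approx = midpoint_variance(sigma0, slope, l)
rel_err = (exact - approx) / exact

print("exact:", exact)
print("midpoint approximation:", approx)
print("relative error:", rel_err)
```

With these numbers the relative error comes out to roughly 0.4%, and the gap between the two variances is exactly the dropped term $l (ml)^2/12$.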

The upshot is, we can still use the same estimator derived earlier to estimate the same quantities (midpoint $\sigma^2$ values), but now we have justified ourselves with a more believable model.