NOTE: The analysis described in this article is out of date. Since writing it, I have changed my mind about a few decisions and realized a few oversights. A much better analysis is available for download as a pdf. Let me warn you that it's about forty pages long.
On May 6, 2010, the U.S. stock market experienced what has come to be called the “flash crash,” in which the prices of various stocks fluctuated violently. Overall, the market rapidly plunged about nine percent and recovered minutes later. In response, the SEC and stock exchanges devised the Single-Stock Circuit Breaker (SSCB) rules, which were implemented gradually. According to Traders Magazine,
The single-stock circuit breakers will pause trading in any component stock of the Russell 1000 or S&P 500 Index in the event that the price of that stock has moved 10 percent or more in the preceding five minutes. The pause generally will last five minutes, and is intended to give the markets a hiatus to attract trading interest at the last price, as well as to give traders time to think rationally.
It is important to note that the rules do not apply to the first fifteen minutes or the last fifteen minutes of each trading day.
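To make the rule concrete, here is a rough Python sketch of the trigger condition as the quote describes it. The function name `breaker_triggered`, the 9:30 to 16:00 session, the sample prices, and the choice to compare the latest price against every price in the window are all assumptions for illustration; the exchanges' actual reference-price logic may differ.

```python
from datetime import datetime, time, timedelta

# Assumed regular session of 9:30 to 16:00, so the rule is active 9:45 to 15:45.
ACTIVE_START = time(9, 45)
ACTIVE_END = time(15, 45)

def breaker_triggered(ticks, now, threshold=0.10, window=timedelta(minutes=5)):
    """Return True if the latest price differs by `threshold` or more from
    any earlier price seen in the preceding `window`.

    `ticks` is a list of (datetime, price) pairs sorted by time.
    """
    if not (ACTIVE_START <= now.time() <= ACTIVE_END):
        return False  # the rule does not apply near the open or the close
    recent = [price for stamp, price in ticks if now - window <= stamp <= now]
    if len(recent) < 2:
        return False
    latest = recent[-1]
    return any(abs(latest - p) / p >= threshold for p in recent[:-1])

# Made-up prices on an arbitrary date.
day = datetime(2011, 6, 1)
calm = [(day.replace(hour=10, minute=0), 50.00), (day.replace(hour=10, minute=3), 50.40)]
drop = [(day.replace(hour=10, minute=0), 50.00), (day.replace(hour=10, minute=3), 44.00)]
early = [(day.replace(hour=9, minute=31), 50.00), (day.replace(hour=9, minute=34), 44.00)]
```

In this sketch, only the `drop` series would trip the breaker: `calm` moves less than one percent, and `early` falls inside the excluded first fifteen minutes.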
Under the direction of Dr. Michael Kane, I am analyzing recent stock data. The aim of our research is to understand the effects of the SSCB rules. In particular, do the rules have any effect on the daily volatility profiles of stocks?
In my previous post, I described how I processed the TAQ data and acquired market cap data. Now, for each stock, I have a series of volatility estimates throughout the trading day over two distinct time periods: one period from 2010, before the SSCB rules were enacted, and another period from 2011, after they were enacted.
The SSCB rules only apply to a subset of stocks: any stock in the Russell 1000 Index or the S&P 500. Unfortunately for our analysis, these stocks are systematically different from the average. They tend to be stocks of very large companies. This is obviously not ideal, but I will do my best to control for the systematic differences.
At this point, we have squared volatility profile estimates for 1690 stocks. Of these, 694 are in the SSCB group, while the remaining 996 make up the non-SSCB group.
I started by working with the data from 2010. During this period, the SSCB rules were not in place yet. This step establishes a baseline. Did the two groups of stocks have similar volatility profiles before the rules were enacted?
Here is a typical example of a stock’s estimated squared volatility profile.
Spread Versus Level
It is clear from browsing more of these plots that the first and last points of the day tend to be the highest. They also seem to have the most variability. In fact, the following plot confirms that a strong relationship exists between spread and level. Times of the day with higher volatilities also have higher variation in their volatilities.
Generally, analyses go more smoothly if this relationship can be transformed away. A log transform does the trick.
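The effect of the log transform can be illustrated on simulated data. The sketch below is not the original analysis: the matrix `profiles`, its U-shaped level, and the multiplicative noise are made up, and the median and interquartile range are just one reasonable choice of level and spread.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stocks, n_bins = 500, 78

# Hypothetical squared-volatility profiles: a U-shaped level across the day
# with multiplicative noise, so spread grows with level.
x = np.linspace(-1.0, 1.0, n_bins)
level_curve = 1e-6 * (1.0 + 4.0 * x**2)
profiles = level_curve * rng.lognormal(0.0, 0.5, size=(n_stocks, n_bins))

def level_and_spread(mat):
    """Median (level) and interquartile range (spread) of each time-of-day column."""
    q25, q50, q75 = np.percentile(mat, [25, 50, 75], axis=0)
    return q50, q75 - q25

raw_level, raw_spread = level_and_spread(profiles)
log_level, log_spread = level_and_spread(np.log(profiles))

# Correlation between level and spread across the 78 times of day.
raw_corr = np.corrcoef(raw_level, raw_spread)[0, 1]

# How much the spread varies across times of day, before and after the log.
spread_ratio_raw = raw_spread.max() / raw_spread.min()
spread_ratio_log = log_spread.max() / log_spread.min()
```

On data like this, the spread tracks the level closely on the raw scale, while on the log scale the spread is roughly constant across the day.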
Smoothing the Profiles
Clearly, we expect neighboring data points of a volatility profile to be very close to each other. In cases like this, one can often improve the quality of the data by letting neighboring points “inform” each other. With a little thought and exploration, I found that a parametric fit worked nicely to smooth the volatility profiles. The 78 data points are well summarized by a fifth-degree polynomial.
The residuals show no remaining structure in the data.
I apply this smoothing procedure to each stock.
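As a sketch of what this smoothing step might look like, the following fits a fifth-degree polynomial to every row of a matrix of log profiles in a single least-squares solve. The function `smooth_profiles`, the rescaling of time of day to [-1, 1], and the simulated data are assumptions, not the original code.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def smooth_profiles(log_profiles, degree=5):
    """Fit one polynomial per row (stock) by ordinary least squares.

    Returns the fitted curves, the residuals, and the coefficients
    (degree + 1 per stock, lowest power first).
    """
    n_bins = log_profiles.shape[1]
    t = np.linspace(-1.0, 1.0, n_bins)       # rescaled time of day, for numerical stability
    design = P.polyvander(t, degree)         # shape (n_bins, degree + 1)
    coefs, *_ = np.linalg.lstsq(design, log_profiles.T, rcond=None)
    fitted = (design @ coefs).T
    return fitted, log_profiles - fitted, coefs.T

# Simulated log profiles: a hypothetical smooth curve plus independent noise.
rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 78)
true_curve = -13.0 + 1.5 * t**2 + 0.3 * t**5
log_profiles = true_curve + rng.normal(0.0, 0.2, size=(200, 78))

fitted, residuals, coefs = smooth_profiles(log_profiles)
```

Each stock's 78 values are replaced by a curve determined by six coefficients. Averaging `residuals` over stocks at each time of day gives the kind of check described next.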
Notice that the averages of the residuals do not show any discernible pattern either.
Controlling for Other Factors
Ultimately, we want to know whether differences in volatility profiles are related to a stock’s SSCB group status. But the SSCB group is systematically different from the other in a few ways. Two differences that I think are important are market cap (MC) and trading frequency (TF). Before comparing the groups, I will try to control for these factors.
First, let us see if market cap has any effect on volatility profiles by making some plots. There is a clear relationship between market cap and any given column of the volatility profile matrix.
In fact, the plots look basically the same for each column: there is always a curved negative relationship. A quadratic fit works well in every case. Therefore, I will use a quadratic fit to control each column for market cap.
Plotting the residuals from these fits, we see that no clear pattern remains. Therefore, I am satisfied that we have controlled for market cap.
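For a single column, controlling for market cap with a quadratic fit amounts to keeping the residuals of that fit. The sketch below uses simulated data; the variable `log_mc` (and the idea that market cap enters on a standardized log scale) and the coefficients of the simulated relationship are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_stocks = 1690

# Hypothetical standardized log market cap, one value per stock.
log_mc = rng.normal(0.0, 1.0, n_stocks)

# Hypothetical column of the profile matrix: curved and decreasing in market cap, plus noise.
column = -13.0 - 0.6 * log_mc + 0.15 * log_mc**2 + rng.normal(0.0, 0.3, n_stocks)

coefs = np.polyfit(log_mc, column, deg=2)         # coefficients, highest power first
controlled = column - np.polyval(coefs, log_mc)   # residuals: the quadratic trend is removed
```

By construction, `controlled` has mean zero and no remaining linear or quadratic association with `log_mc`, which is what a patternless residual plot reflects.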
Next, we will consider the effect of trading frequency on the columns. We must first remove the relationship between trading frequency and market cap.
The quadratic fit seems good enough, so I will use it to control trading frequency for market cap.
Now, I can look for a relationship between the columns and the trading frequency. Let’s look at a random column.
A quadratic fit works well.
Again, all of the columns give very similar plots.
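The two-stage control can be sketched as follows: remove market cap from trading frequency, remove market cap from every column, then remove what is left of trading frequency from the columns. The helper `residualize`, the variables `log_mc` and `log_tf`, and the simulated relationships are hypothetical stand-ins.

```python
import numpy as np

def residualize(target, covariate, degree=2):
    """Subtract a least-squares polynomial in `covariate` from `target`.

    `target` may be a vector or a matrix with one row per stock.
    """
    design = np.vander(covariate, degree + 1)     # columns: x^2, x, 1
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coefs

# Simulated data in which trading frequency depends on market cap
# and the profile columns depend on both.
rng = np.random.default_rng(3)
n_stocks, n_bins = 1690, 78
log_mc = rng.normal(0.0, 1.0, n_stocks)
log_tf = 0.8 * log_mc - 0.1 * log_mc**2 + rng.normal(0.0, 0.5, n_stocks)
profiles = (-13.0 - 0.6 * log_mc[:, None] + 0.3 * log_tf[:, None]
            + rng.normal(0.0, 0.3, size=(n_stocks, n_bins)))

tf_resid = residualize(log_tf, log_mc)    # trading frequency, net of market cap
step1 = residualize(profiles, log_mc)     # every column, net of market cap
step2 = residualize(step1, tf_resid)      # every column, net of the leftover trading frequency
```

Fitting the factors one after another is simpler than a joint regression on both, though the two approaches need not give identical residuals.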
Now, we need to repeat the smoothing and controlling process for the 2011 volatility data. The same steps work well in the 2011 case, too. I wrote a function to go through these steps for each year without making any plots.
Principal Component Analysis
It is still not obvious how we should compare the volatility profiles to each other. Each profile consists of 78 highly correlated values, and they all have a very similar shape to their curves. In other words, 78 values seems like overkill. Is there some simpler representation of each profile that still captures the variation among them? To me, this seems like a perfect chance to make use of principal component analysis (PCA).
You may note that the data smoothing actually reduced the volatility profiles to six dimensions: the six coefficients of each fifth-degree polynomial. Perhaps I should perform the PCA on those parameter values. My rationale for using the 78 estimated volatility values is that they are all on a comparable scale with each other.
First, I will run PCA on all of the data from both years together.
Indeed, over ninety-three percent of the variability is concentrated in the first principal component. I will retain the second principal component as well, largely because it allows for richer plots. These two principal components are a proxy for the volatility profiles.
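A PCA of this kind can be sketched with a singular value decomposition of the column-centered matrix. The function `pca`, the helper `fake_year`, and the simulated profiles are made up for illustration. Centering without rescaling the columns is an assumption, chosen because the 78 values are said to be on a comparable scale.

```python
import numpy as np

def pca(mat, n_components=2):
    """PCA via SVD of the column-centered matrix.

    Returns the scores, the loadings (one row per component), and the
    share of total variance explained by each retained component.
    """
    centered = mat - mat.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = u[:, :n_components] * s[:n_components]
    return scores, vt[:n_components], explained[:n_components]

rng = np.random.default_rng(4)
n_bins = 78
t = np.linspace(-1.0, 1.0, n_bins)

def fake_year(n_stocks, shift):
    """Hypothetical profiles: mostly an overall level, plus a smaller early-versus-late tilt."""
    level = rng.normal(shift, 1.0, size=(n_stocks, 1))
    tilt = rng.normal(0.0, 0.3, size=(n_stocks, 1))
    noise = rng.normal(0.0, 0.05, size=(n_stocks, n_bins))
    return level + tilt * t + noise

# Stack both years so that they share one set of components.
stacked = np.vstack([fake_year(1690, 0.0), fake_year(1690, 0.2)])
scores, loadings, explained = pca(stacked)
```

Because both years are projected onto the same loadings, the rows of `scores` can be compared across groups and years.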
Let us try to interpret these principal components by visualizing the first and second loadings. It seems that the first principal component is basically measuring the overall level of the volatility curves, emphasizing the middle of the day.
The second principal component acts as a measure of the difference between late and early volatility.
Below is a plot of the “landscape” of these two components, colored by group and year. Blue corresponds to non-SSCB, and red corresponds to SSCB. The lighter colored points are from 2010, while the darker ones are from 2011. The four means are also plotted in the appropriate colors.
Let us zoom in and just look at concentration ellipses to get a general idea of the shapes and locations of the four groups.
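One common way to draw a concentration ellipse is from the sample mean and covariance of the points, under a bivariate normal approximation. The sketch below follows that approach; the function `concentration_ellipse`, the 50 percent coverage, and the simulated scores are assumptions, and the ellipses in the plots may have been computed differently.

```python
import numpy as np

def concentration_ellipse(points, coverage=0.5, n=100):
    """Outline of the ellipse expected to contain `coverage` of a bivariate
    normal distribution with the sample mean and covariance of `points`."""
    center = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    # Chi-square quantile with 2 degrees of freedom has a closed form.
    radius2 = -2.0 * np.log(1.0 - coverage)
    theta = np.linspace(0.0, 2.0 * np.pi, n)
    circle = np.column_stack([np.cos(theta), np.sin(theta)])
    # Stretch the unit circle along the principal axes, then rotate and shift.
    return center + (circle * np.sqrt(radius2 * evals)) @ evecs.T

# Hypothetical scores for one group on the first two principal components.
rng = np.random.default_rng(5)
points = rng.multivariate_normal([1.0, -0.5], [[4.0, 1.2], [1.2, 1.0]], size=2000)
outline = concentration_ellipse(points, coverage=0.5)
```

Calling this once per group and year gives four outlines whose centers, sizes, and orientations can be compared directly.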
The SSCB and non-SSCB stocks were different from each other in both periods. Furthermore, the difference looks quite similar. Nothing in this picture suggests an effect from the SSCB rules. But we still cannot rule it out.
One other notable observation is that volatility profiles in general are shifting over time. That might be an interesting topic to explore further at some point.
One more look at the data is worth pursuing. Let’s repeat the PCA on the differences (2011 values minus 2010 values).
This time, the first two principal components together account for nearly ninety percent of the variability.
Again, the picture shows no evidence that the SSCB rules have changed the affected stocks’ volatility profiles in any important way. The two distributions look basically identical.
For each component, a t-test fails to detect a significant difference between groups at the 0.05 level.
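The difference analysis can be sketched end to end: take per-stock differences, run PCA on them, and compare the groups on each component. Everything here is simulated, and several choices are assumptions: the name `welch_t`, the unequal-variance (Welch) form of the test, and the normal approximation to the p-value, which is reasonable only because both groups contain hundreds of stocks.

```python
import math
import numpy as np

def welch_t(a, b):
    """Two-sample t statistic with unequal variances, and a two-sided
    p-value from the large-sample normal approximation."""
    se2 = a.var(ddof=1) / a.size + b.var(ddof=1) / b.size
    t_stat = (a.mean() - b.mean()) / math.sqrt(se2)
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return t_stat, p_value

rng = np.random.default_rng(6)
n_stocks, n_bins = 1690, 78
is_sscb = np.arange(n_stocks) < 694      # 694 SSCB stocks, 996 others

# Simulated controlled profiles with no group effect built in.
p2010 = rng.normal(0.0, 1.0, size=(n_stocks, n_bins))
p2011 = p2010 + rng.normal(0.1, 0.5, size=(n_stocks, n_bins))

# Rows must refer to the same stocks in the same order in both years.
diffs = p2011 - p2010
centered = diffs - diffs.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
scores = u[:, :2] * s[:2]                # first two principal component scores

# One (t statistic, p-value) pair per component.
results = [welch_t(scores[is_sscb, k], scores[~is_sscb, k]) for k in range(2)]
```

A p-value above 0.05 on a component means the test does not detect a difference in mean score between the groups. It does not show that the groups are identical.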