Now we come to the second test of whether the ordering of our data is important. In this test, we are looking for volatility clustering. Researchers have observed that in many financial markets, high-volatility days tend to be followed by more high-volatility days. They also see that low-volatility days tend to be followed by low-volatility days. This is the phenomenon of volatility clustering.
The statistical modeling of volatility clustering started with a paper by Robert Engle in 1982. In this paper, Professor Engle proposed a statistical model that he called ARCH, which stands for Autoregressive Conditional Heteroskedasticity. This is quite a mouthful, isn't it? All right, let me explain these terms. Autoregressive is a term in statistics that refers to past data affecting future data in a time series. Conditional heteroskedasticity is a technical term meaning that the variance changes over time depending on past data, which is what produces volatility clustering. Professor Engle's student, Tim Bollerslev, generalized the ARCH model and called it GARCH, which stands for Generalized Autoregressive Conditional Heteroskedasticity. We will be working with the GARCH model in a little while.
Professor Engle's idea has been so widely used in modeling financial time series that he was awarded half of the 2003 Nobel Prize in Economics. The other half of the prize went to Professor Clive Granger. Here is a quote from the Nobel Prize committee's website. It specifically recognizes ARCH as the reason for awarding the Nobel Prize to Professor Engle.
As a personal note, I was a graduate student in the economics department at MIT, and Professor Engle was a visiting professor there one year, and I took a course from him. That was before he published his famous paper. After I graduated from MIT in 1981, I read his paper on ARCH in 1982, and Professor Bollerslev's paper in 1986. I decided to use the ARCH and GARCH models in my own research, so I'm very familiar with these two models.
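For reference, here is how the ARCH and GARCH models named above are usually written. This is a common textbook form, not something taken from the lecture slides. The symbols are assumptions introduced for illustration: x_t is the return on day t (assumed to have zero mean), sigma_t^2 is its conditional variance, z_t is a shock with mean zero and variance one, and omega, alpha, and beta are non-negative parameters. The lecture does not say which order of GARCH it will use; GARCH(1,1) is shown because it is the most common choice.

```latex
% Return as a scaled shock (zero mean assumed for simplicity)
x_t = \sigma_t z_t, \qquad z_t \sim \text{i.i.d.}(0, 1)

% ARCH(q): today's variance depends on the last q squared returns
\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \, x_{t-i}^2

% GARCH(1,1): today's variance also depends on yesterday's variance
\sigma_t^2 = \omega + \alpha \, x_{t-1}^2 + \beta \, \sigma_{t-1}^2
```

In both equations, a large squared return yesterday raises the variance today, which is how these models produce clusters of volatility.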
Before getting too far ahead of ourselves, let us look at the evidence on volatility clustering. Do you remember how we tested for serial correlation in a time series? We're going to do something very similar here to test for volatility clustering. So let x sub t denote the value of a time series on day t. Think of this again as the daily log return of the Wilshire 5000 Index. Now, we take the absolute value of x sub t. Then we calculate the autocorrelation coefficients of the absolute value of x sub t, in the same way that we calculated the autocorrelation coefficients of the original series. Apply the ACF function to the absolute values of the log returns. Again, this will graph the autocorrelation function, this time for the absolute values of the log returns. And again, remember that the zeroth-order autocorrelation coefficient is always one.
Here is the graph of the autocorrelation coefficients of the absolute values of log returns. It looks very different from the corresponding ACF graph for the log returns themselves, without the absolute value. Every autocorrelation coefficient is positive and outside the 95 percent confidence band. So what does this tell us? This pattern of autocorrelation coefficients tells us that large returns, whether positive or negative, tend to be followed by large returns. This is what happens when the data have clusters of volatility.
Volatility clustering tells us that the ordering of the data is important. So let me show you why this is so. I'm going to take the absolute values of the log return series and randomly permute, or reorder, them. This is just like shuffling a deck of cards. The cards are the same, but the ordering of the cards is randomized. In the same way, I shuffled the ordering of the log return data. Now, I apply the ACF function to the shuffled data. As you can see in the graph, the autocorrelation coefficients of the absolute values are close to zero. Remember, I have the exact same log return series.
I just randomly reordered them, and the volatility clustering disappeared. This clearly demonstrates that the ordering of the data matters. To show this another way, here are two graphs. The original log return data is on the left, and it has volatility clustering. Now, look at the graph on the right. It is the same log returns, but they are randomly reordered. It shows no volatility clustering.
So here is the takeaway: the daily log returns of the Wilshire 5000 Index show strong volatility clustering. This means that the volatility of stock prices is predictable. Now, it is your turn to check for volatility clustering in the data you downloaded from FRED.
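If you want to see the shuffle test in code, here is a minimal sketch of the steps just described. It is written in Python with NumPy rather than with the ACF function used in the lecture, and it makes several assumptions: the `acf` and `simulate_garch` helpers are hypothetical names written for this illustration, the data are simulated from a GARCH(1,1) process with made-up parameters instead of the actual Wilshire 5000 log returns, and the 95 percent band uses the usual large-sample approximation of 1.96 divided by the square root of the sample size.

```python
import numpy as np


def simulate_garch(n, omega=1e-6, alpha=0.10, beta=0.85, seed=0):
    """Simulate n returns from a GARCH(1,1) process (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    x = np.empty(n)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        x[t] = np.sqrt(var) * z[t]
        # Tomorrow's variance depends on today's squared return and variance.
        var = omega + alpha * x[t] ** 2 + beta * var
    return x


def acf(series, max_lag=20):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    y = np.asarray(series, dtype=float)
    y = y - y.mean()
    denom = np.dot(y, y)
    n = len(y)
    return np.array([np.dot(y[: n - k], y[k:]) / denom for k in range(max_lag + 1)])


# Stand-in for the daily log returns (simulated, NOT real index data).
x = simulate_garch(5000)
abs_x = np.abs(x)

# Approximate 95 percent confidence band for a series with no autocorrelation.
band = 1.96 / np.sqrt(len(x))

# Step 1: ACF of the absolute values in their original order.
acf_abs = acf(abs_x)

# Step 2: shuffle the same values, like shuffling a deck of cards, and repeat.
shuffled = np.random.default_rng(1).permutation(abs_x)
acf_shuffled = acf(shuffled)

print("95% band: +/-", round(band, 4))
print("original order, lags 1-5:", np.round(acf_abs[1:6], 3))
print("shuffled order, lags 1-5:", np.round(acf_shuffled[1:6], 3))
```

With real data, you would replace the simulated series `x` with your own log returns and keep the rest the same. The shuffled series contains exactly the same values as the original, so any difference between the two sets of coefficients comes only from the ordering.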