In this part of the course, I will discuss the class of normal dynamic linear models for analyzing and forecasting non-stationary time series. We will talk about Bayesian inference and forecasting within this class of models, and describe model building as well.

I want to begin with a motivating example. Suppose you have a very simple model with no temporal structure. You have your time series y_t, and you are interested in the mean level of that time series. I'm going to call that mean level mu, and then I have some noise, so the model is y_t = mu + nu_t. The noise terms nu_t are independent, identically distributed normal random variables, N(0, v). If I plot y_t against time, I get a series that fluctuates around a constant level. In this model, mu captures the mean of the time series: the expected value of y_t is mu, and the variance of y_t is v.

In practice, since this model has no temporal structure, I may want to incorporate some by saying that the level, the expected value of the time series, should be changing over time. To do that, you write down a model where mu changes over time, so it is indexed in time: y_t = mu_t + nu_t, with the same noise as before, again assumed N(0, v). Now I have to decide how to incorporate temporal structure by modeling the changes over time in the parameter mu_t. You could consider different options. Probably the simplest one is a random walk type of structure, where you think of the expected value of mu_t given mu_(t-1) as mu_(t-1), and then you add some noise, omega_t: mu_t = mu_(t-1) + omega_t.
That error, omega_t, is again assumed to be a normally distributed random variable, centered at zero and with variance w. There is another assumption that we make here, which is that nu_t and omega_t are independent of each other. With this model, what I am assuming is that the mean level of the series is changing over time.

These types of models have a few characteristics. This is an example of a normal dynamic linear model, as we will see later. In these models, we usually have a few things. The first is that we have two equations. One is the so-called observation equation, which relates y_t, your observed process, to some parameters in the model that are changing over time. The other is the so-called system equation, or evolution equation, which tells me how that time-varying parameter changes over time.

The other thing you may notice is that we have a linear structure at both the observational level and the system level. At the observational level, the expected value of y_t is a linear function of mu_t; it happens to be mu_t itself in this particular case. At the system level, the expected value of mu_t given mu_(t-1) is a linear function of mu_(t-1). The last thing we have is that, at both levels, we assume normality for the noise terms in those equations. So this is an example of a Gaussian, or normal, dynamic linear model: a linear model with time-varying parameters.

This is just an example. We will be discussing the general class of models. We will also discuss how to build different structures into the model, as well as how to perform Bayesian inference and forecasting. The general class of dynamic linear models can be written as follows. Again, we are going to have two equations.
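Before writing down the general class, it may help to see the random walk plus noise example as a simulation. The following is a minimal Python sketch, not something shown in the lecture: the function name `simulate_local_level`, the use of NumPy, and the particular values of v, w, and the starting level `mu0` are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_local_level(n, v, w, mu0=0.0, seed=0):
    """Simulate y_t = mu_t + nu_t, mu_t = mu_(t-1) + omega_t, for t = 1..n.

    v is the observation variance, w is the evolution variance, and mu0 is
    a hypothetical starting level (the lecture does not fix one).
    """
    rng = np.random.default_rng(seed)
    nu = rng.normal(0.0, np.sqrt(v), size=n)     # observation noise, N(0, v)
    omega = rng.normal(0.0, np.sqrt(w), size=n)  # evolution noise, N(0, w)
    mu = mu0 + np.cumsum(omega)                  # random walk for the level
    y = mu + nu                                  # observed series
    return y, mu

# Illustrative values only: 300 observations, v = 1, w = 0.1.
y, mu = simulate_local_level(n=300, v=1.0, w=0.1)
```

Setting w to zero in this sketch makes the level constant, which recovers the first model with no temporal structure.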
One is the so-called observation equation, which relates the observations to the parameters in the model. The notation we are going to use is as follows. Here, my observations are univariate; we are discussing models for univariate time series. I have y_t related to a vector of parameters Theta_t, plus some noise: y_t = F_t' Theta_t + nu_t, where the prime denotes the transpose. The noise terms nu_t are assumed to be independent normal random variables, N(0, V_t). Then I have another equation, the system equation, which has this form: Theta_t = G_t Theta_(t-1) + omega_t. There is a general matrix G_t multiplying the vector Theta_(t-1), and the omega_t are independent and now multivariate normal, N(0, W_t). The first is the observation equation; the second is the system equation, or evolution equation. Together they define a normal dynamic linear model.

Here, F_t is a vector whose dimension is the same as the number of parameters in the model; let's say we have k. This is a vector of known values: for each t, we assume that we know what that vector is. The vector of parameters Theta_t is also of dimension k. The next thing we need to define is G_t, which is a k by k matrix that is also assumed to be known. Then V_t is the variance at the observational level, and W_t is the variance-covariance matrix at the system level. At the beginning, we are going to assume that these two quantities are also known for all values of t.

Again, with these two equations we have the model defined. There is one more piece that we need to consider if we are going to perform Bayesian inference for the model parameters. To fully specify the model, we need the prior distribution. In a normal dynamic linear model, the prior distribution is assumed to be conjugate.
In the case, again, in which V_t and W_t are known, we are going to assume that Theta_0, the parameter vector before observing any data, is normally distributed: multivariate normal, N(m_0, C_0). The mean m_0 is a vector of the same dimension as Theta_0, and C_0 is a k by k covariance matrix. These are also assumed to be given in order to move forward with the model.

In terms of inference, there are different kinds of densities and quantities that we are interested in. One of the distributions we are interested in finding is the so-called filtering distribution. Here we want the density of Theta_t given all the observations and all the information that I have up to time t. I'm going to call that information D_t, so the density is p(Theta_t | D_t). Usually, D_t is all the information I have at time zero, coupled with all the data points I have up to time t, provided there is no additional information. In some cases, I will just write down the observations up to time t instead. So I'm conditioning on all the observed quantities and the prior information up to time t, and I'm interested in the distribution of Theta_t. This is called filtering.

Another quantity that is very important in time series analysis is forecasting. I may be interested in the density of y_(t+h), where h is the number of steps ahead, given all the information up to time t: p(y_(t+h) | D_t). Here I'm interested in predictions, and we will be talking about forecasting.

Then another important set of distributions is what we call the smoothing distributions. Usually, when you get your data, you have a time series with, I don't know, 300 data points.
As you go with the filtering, you start from time zero and go all the way to 300, updating these filtering distributions as you move forward. But then you may want to revisit your parameter at time 10, for example, given that you have now observed all 300 observations. In that case, you are interested in densities of the form p(Theta_t | D_T). Let's say that you observe capital T data points in your process, and now you revisit the density for Theta_t, which is in the past; here we assume that t is smaller than capital T. This is called smoothing: you revisit the past once you have seen more observations. We will talk about how to perform Bayesian inference to obtain all these distributions under this model setting.

In addition to the structure we described before and all the densities we are interested in finding, we also have, as usual, the so-called forecast function. Instead of being a density, it is just the expected value of y_(t+h) given all the information I have up to time t. In the case of a general normal dynamic linear model, we get the structure of this function just by using the observation and system equations. We have F_(t+h) transpose, then G_(t+h), and we multiply all the G matrices all the way down to G_(t+1), and then we have the expected value of Theta_t given D_t: E(y_(t+h) | D_t) = F_(t+h)' G_(t+h) ... G_(t+1) E(Theta_t | D_t). This is the form of the forecast function.

There are particular models that we will be discussing in which F_t is equal to F, so it is constant for all t, and G_t is also constant, equal to G, for all t. In those cases, the forecast function simplifies to F transpose times G to the power of h times the expected value of Theta_t given D_t: F' G^h E(Theta_t | D_t). One thing that we will learn is that the eigenstructure of the matrix G is very important in determining the form of the forecast function, and it is very important for model building and for adding components to your model.
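To make the forecast function concrete, here is a small Python sketch of both forms described above. It is only an illustration under stated assumptions: the function names, the use of NumPy, and the numerical values are hypothetical, and `m_t` stands for the expected value of Theta_t given D_t, which is assumed to be already available from filtering. The second example, with a 2 by 2 matrix G, is a linear trend type of structure that I am using only to show how G shapes the forecasts; it is not taken from the lecture.

```python
import numpy as np

def forecast_function(F, G, m_t, h):
    """Constant F and G: return F' G^h m_t, with m_t = E(Theta_t | D_t)."""
    F = np.asarray(F, dtype=float)
    G = np.atleast_2d(np.asarray(G, dtype=float))
    m_t = np.asarray(m_t, dtype=float)
    return float(F @ np.linalg.matrix_power(G, h) @ m_t)

def forecast_function_general(F_future, G_future, m_t):
    """General case: F_future is F_(t+h) and G_future is the list
    [G_(t+1), ..., G_(t+h)], so the product is G_(t+h) ... G_(t+1) m_t."""
    a = np.asarray(m_t, dtype=float)
    for G in G_future:  # apply G_(t+1) first and G_(t+h) last
        a = np.atleast_2d(np.asarray(G, dtype=float)) @ a
    return float(np.asarray(F_future, dtype=float) @ a)

# Random walk plus noise: F = 1, G = 1, so the forecast is flat in h.
flat = [forecast_function([1.0], [[1.0]], [2.5], h) for h in (1, 2, 3)]

# Hypothetical two-parameter example: G^h = [[1, h], [0, 1]], so the
# forecast is linear in h, here 10 + 0.5 * h.
F = [1.0, 0.0]
G = [[1.0, 1.0], [0.0, 1.0]]
trend = [forecast_function(F, G, [10.0, 0.5], h) for h in (1, 2, 3)]
```

In the first case the forecasts stay at 2.5 for every h, and in the second they grow linearly in h, which is a simple instance of the forecast function being determined by the structure of G.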
Finally, in terms of short notation: when we are working with normal dynamic linear models, we may refer to the model without writing out the two equations, the system equation and the observation equation. I can just list the components that define my model, {F_t, G_t, V_t, W_t}. This fully specifies the model in terms of the two equations: I know what F_t is, what G_t is, what V_t is, and what W_t, the covariance at the system level, is. I will sometimes use just this short notation to define the model.
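For reference, the pieces described in this section can be collected in one display. This is a reconstruction from the spoken description rather than a copy of what was written during the lecture; in particular, the symbols nu_t and omega_t for the noise terms and the conditioning on D_0 in the prior are notational assumptions.

```latex
\begin{align*}
y_t &= F_t' \theta_t + \nu_t, & \nu_t &\sim N(0, V_t) && \text{(observation equation)} \\
\theta_t &= G_t \theta_{t-1} + \omega_t, & \omega_t &\sim N(0, W_t) && \text{(system equation)} \\
\theta_0 \mid D_0 &\sim N(m_0, C_0) &&&& \text{(prior)}
\end{align*}
```

In the short notation, the random walk plus noise example from the beginning of this section corresponds to {1, 1, v, w}: F_t and G_t are both the scalar 1, V_t is v, and W_t is w.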