Now we have an understanding of what a time series is, what types of applications, tasks, and operations we would like to perform with time series, and some understanding of the high-level behaviors, or high-level characterization, of time series. In the next few slides we will go deeper, and I will discuss whether we can actually build more fine-grained models of time series. That is, can we start developing closed-form formulas for a time series? After that, we will see whether we can use these closed-form formulas to make forecasts and decisions with time series. So, let's get into the details. So far, we have only two types of models for a time series: stationary and non-stationary. Now we will go into detail and define much more fine-grained models. The first model we will learn is called the random time series model. In the random time series model, the observation at a given point in time, say today, does not depend on the past; it only reflects the current random error. When we use the term random error, we refer to a stochastic process that does not depend on the past, one that depends only, for example, on an external input to the system. Mathematically, this is written as follows: the value of the time series at the current time instance is X_t, where X is the value of the time series and t is the time, and it is determined entirely by an error, or external input, at that time, which has its own statistical properties. A common technique is to assume that the error is Gaussian, with zero mean and some variance.
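To make this concrete, here is a minimal sketch in Python (NumPy) of the random model X_t = E_t, where each observation is just an independent Gaussian error. The seed, sample size, and unit variance are illustrative choices, not values from the lecture.

```python
import numpy as np

# Random time series model: X_t = E_t, where E_t is i.i.d. Gaussian
# with zero mean and variance sigma^2. Nothing depends on the past.
rng = np.random.default_rng(seed=0)   # illustrative seed
sigma = 1.0                           # illustrative variance choice
n = 1000
e = rng.normal(loc=0.0, scale=sigma, size=n)  # external random input E_t
x = e                                          # X_t = E_t

# The sample mean should be near 0 and the sample variance near sigma^2.
print(round(x.mean(), 2), round(x.var(), 2))
```

Because each X_t is drawn independently, knowing past values tells you nothing about the next one; the best forecast is simply the mean, zero.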
Making that assumption usually makes things simpler to analyze. To repeat: if we look at the current value of the time series at time t, the value it takes does not depend on the past. Whatever values the series took in the past tell us nothing about its current value. Now, for many things in the real world this is not true: there is some dependency on the past. What happened yesterday affects what will happen today. If the stock market had some value yesterday, you can predict it is not going to have an arbitrarily different value today unless there is some big external event. So most things in the real world are not random time series. But it is still important to understand the random model, because, as we will see, most real-world time series show a certain dependency on the past while also requiring us to account for random or external effects. The random time series model is what enables us to quantify and take into account random external effects on the system. Great. So now we know what a random time series model is. The second model we will be learning is called the autoregressive time series model. In an autoregressive time series model, the current value of the time series depends not only on a random component, as in the random model, but also on the past. In the simplest case, it depends on the immediate past with some contribution, or weight. In other words, if I look at the current value, it is a combination of the immediately preceding value, weighted by some contribution, plus some random component.
So essentially, the real-world value in this case depends on the past value, with some weight, plus an external random contribution. Models in which the current observation depends on past observations in this way are called autoregressive models, and their closed-form formulas usually have this shape. When the time series depends only on the immediately previous time instance, X_t = alpha * X_{t-1} + E_t, we call it the autoregressive model of order one, or AR(1) model. The "one" indicates that the current value depends only on the immediately preceding observation. There are more extensive autoregressive models; AR(1) is actually the simplest. A more complex model, for example the AR(2) model, says that the current observation is once again a combination of a random component, the observation immediately before, and, to some degree, the observation two time instances back. So, to predict what is going to happen today, I need to account for what happened yesterday and what happened the day before, plus any random event that may happen. This is called the AR(2) model. Again, it is an autoregressive model; however, it is more complex than AR(1), because what happens today depends not only on yesterday but also on the day before. Note that when we say "understanding the time series model," we mean the following: given a time series, we need to be able to determine, first, whether it is an autoregressive model. If it is, we need to determine whether it is an AR(1) model, an AR(2) model, or maybe an AR(3) or AR(5) model. For each of these, we also need to be able to discover the corresponding alpha values: how strongly does what happens today depend on yesterday?
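A short sketch of how an AR(2) series can be simulated; the alpha weights below are made-up illustrative values, not coefficients from the lecture, and E_t is zero-mean Gaussian noise as before.

```python
import numpy as np

# AR(2) model: X_t = a1 * X_{t-1} + a2 * X_{t-2} + E_t
rng = np.random.default_rng(seed=1)
a1, a2 = 0.6, 0.2          # illustrative weights: yesterday, day before
n = 500
x = np.zeros(n)
for t in range(2, n):
    x[t] = a1 * x[t - 1] + a2 * x[t - 2] + rng.normal()

# Setting a2 = 0 reduces this to the AR(1) model X_t = a1 * X_{t-1} + E_t.
# Unlike the random model, the series is correlated with its own past:
lag1_corr = np.corrcoef(x[:-1], x[1:])[0, 1]
print(round(lag1_corr, 2))
```

The strong lag-1 correlation is exactly the "dependency on the past" that separates autoregressive models from the random model, where the same correlation would be near zero.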
How strongly does what happens today depend on what happened the day before, and so on? That is really what we mean by discovering a model for a time series. Now, autoregressive models are not the only models available beyond the random model. A second type is called the moving average model. In the moving average model, what happens today once again depends on what is happening right now randomly, the external input to the system. It also depends on what happened yesterday; but it depends not on yesterday's actual observation, rather on the random event that happened yesterday. So, the difference, please pay attention: in the AR(1) model, X_t depends on X_{t-1}. In the moving average model, however, X_t depends on E_{t-1}. Remember, E is the external input. Moving average models essentially account for the contribution of past random external events to the current observation. Now, just as AR models have different versions, AR(1), AR(2), and so on, the moving average (MA) models also have MA(1), MA(2), and so on. In the MA(2) model, the current observation depends on the current random event, yesterday's random event, and the day before's random event. These are called moving average time series models. Of course, in the real world you usually don't have purely autoregressive models or purely moving average models; you have a mixture of these. Usually a time series shows some autoregressive behavior and some moving average behavior. Because of that, we usually talk about ARMA models. ARMA models show both autoregressive behavior and moving average behavior, on top of the current random component.
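The AR-versus-MA distinction shows up clearly in a simulation. Below is a sketch of an MA(1) series, X_t = E_t + b * E_{t-1}; the weight b is an illustrative choice, not a value from the lecture.

```python
import numpy as np

# MA(1) model: X_t = E_t + b1 * E_{t-1}. The current value depends on the
# current and previous *random inputs*, not on the previous observation.
rng = np.random.default_rng(seed=2)
b1 = 0.8                   # illustrative moving-average weight
n = 5000
e = rng.normal(size=n)     # the external inputs E_t
x = e.copy()
x[1:] += b1 * e[:-1]

# Signature of MA(1): correlation at lag 1 is nonzero, but correlation
# at lag 2 is essentially zero, since E_{t-2} never enters X_t.
lag1 = np.corrcoef(x[1:], x[:-1])[0, 1]
lag2 = np.corrcoef(x[2:], x[:-2])[0, 1]
print(round(lag1, 2), round(lag2, 2))
```

This abrupt cutoff in correlation after lag 1 contrasts with AR models, whose dependence on the past decays gradually; that difference is one practical clue for telling the two model families apart.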
When we talk about ARMA models, we usually give two parameters, writing ARMA(A, M): A is the number of autoregressive terms the model has, and M is the number of moving average terms. Obviously, the more terms we add, the more precise our model may become, but the model also becomes more difficult to discover, more complex, and less easy to use. Consequently, when we discover models, which we will discuss later, we usually seek models that fit and represent the data well but are also less complex. We usually do not want very large A and M values; such models become too difficult to, first, discover accurately and, second, properly use and interpret. We will discuss later how we discover these ARMA models. Now, one shortcoming of ARMA models is that they do not take into account the speed of change, or the degree of acceleration, of the time series, which, as we have seen before, may be important for characterizing certain types of time series. In many time series you need to account for the speed of change or the degree of acceleration of the events, and ARMA models do not do that effectively. To overcome this, to capture the speed of change or degree of acceleration, we introduce models with what we call differencing. Differencing means the following: instead of trying to characterize the current observation in terms of past observations, we take the difference between the current observation and the previous observation, in this case X_t^(1) = X_t - X_{t-1}, and we try to characterize that. This is called order-one differencing.
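A tiny sketch of differencing in NumPy; the five-point series is made up purely for illustration.

```python
import numpy as np

# Order-one differencing: X_t^(1) = X_t - X_{t-1}.
# For a series growing at a constant speed, the first difference
# exposes that speed directly as a constant.
x = np.array([1.0, 3.0, 5.0, 7.0, 9.0])   # made-up series, speed 2
d1 = np.diff(x)                            # order-one differencing
print(d1)   # → [2. 2. 2. 2.]  (the speed of change at each step)

# Order-two differencing (the difference of the differences) captures
# acceleration; for a constant-speed series it is all zeros.
d2 = np.diff(x, n=2)
print(d2)   # → [0. 0. 0.]
```

Note that each round of differencing shortens the series by one element, since the first point has no predecessor to subtract.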
Order-one differencing enables you to take the speed of change into account, because if you look at it, what X_t^(1) captures is the change between two time instances, the speed with which the value changed. That is order-one differencing. Order-two differencing takes the difference of the speed of change, which is essentially the degree of acceleration. You can think of it as the second derivative of the curve: it represents how fast the speed changes, which is the degree of acceleration. So if we do order-two differencing on the time series, the resulting series, in this case X_t^(2), represents how the acceleration of the time series changes over time. With these terms, we can talk about the speed of the time series and also about its acceleration. If we do further differencing, we can also talk about changes in acceleration, and so on. These behaviors of a time series cannot be taken into account unless you implement differencing; differencing essentially gives us the power to talk about things like speed and acceleration. This brings us to a more complex and more powerful model, known as the ARIMA model. The ARIMA model has three components. Once again, you have an AR component, the autoregressive component, that describes the current observation in terms of past observations, in this case in terms of A past observations. It also has an MA, moving average, component.
The moving average component describes the time series in terms of M moving average terms. But ARIMA models also include differencing terms; that is the third, newly introduced component. With the differencing terms, we describe what is happening right now in terms of the speed we were observing, or the acceleration we were observing, and so on. So, ARIMA models are very powerful: they take into account random external events, they take into account autoregressive behaviors, and they also take into account speed and acceleration when predicting or describing the current value.
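A sketch pulling the three components together: we simulate an ARMA(1,1) process for the order-one *differences*, then cumulatively sum it, which produces an ARIMA(1,1,1)-style series. All coefficients and the seed are illustrative choices, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
a1, b1 = 0.5, 0.3          # illustrative AR and MA weights
n = 300
e = rng.normal(size=n)     # external random inputs E_t
d = np.zeros(n)            # the differenced (order-one) series

for t in range(1, n):
    # AR term on the past difference + MA term on past noise + current noise
    d[t] = a1 * d[t - 1] + b1 * e[t - 1] + e[t]

x = np.cumsum(d)           # undo the differencing to recover X_t itself

# Differencing x once recovers the stationary ARMA(1,1) part exactly:
print(np.allclose(np.diff(x), d[1:]))   # → True
```

This is the ARIMA idea in miniature: the raw series x is non-stationary, but one round of differencing reduces it to an ARMA series, whose AR and MA terms can then be modeled as discussed above.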