Hi, everyone. Welcome to Operations Analytics. I'm very excited to have you in the course. I'm Senthil Veeraraghavan. I'm an associate professor in the Department of Operations, Information and Decisions at The Wharton School. I'll be introducing you to the Operations Analytics course. In week one, you're going to focus on descriptive analytics. Uncertainty about future events is a key feature of the world that we live in. Uncertainty plays a big role in many business decisions. Operations is often about making good decisions in such uncertain settings. First, I'm going to introduce you to one of the most central problems in operations: the problem of matching demand with supply in uncertain settings, called the newsvendor problem. We need to be able to describe uncertainty in our data in order to make better decisions. Hence we're going to go over methods to forecast future outcomes. We're going to learn how to think about forecasting when there's trend or seasonal variation. The descriptive analytics that we will cover this week, as you will see, will provide a strong conceptual basis for predictive and prescriptive analytics. In the next three weeks, you'll learn how to use the analytics toolkits to evaluate different courses of action, to optimize, and to choose the best possible action. Once again, I'm very excited to have you here. Hello, welcome to Operations Analytics. This is the first week of Operations Analytics, where we are going to be covering descriptive analytics. I'm going to cover descriptive analytics over four sessions, and here is how we're going to split the material. In the first session, I'm going to talk about an operational decision problem called the newsvendor problem. After I introduce the newsvendor problem I'm going to talk about random variables, and I'm going to talk about demand distributions. And that leads us to session two.
In session two I'm going to be talking about forecasting with past historical data. Then I will talk about moving averages and I will talk about exponential smoothing in the advanced material. In session three, I'm going to be talking about forecasting when the data shows trend or seasonality. Finally, in session four, I'm going to be talking about forecasting for new products and about fitting demand distributions. And these are the four sessions we will be covering in the first week of our course. And the first week, again, will focus on descriptive analytics. Now let's jump right into session one, where I'm going to talk about an interesting operational decision problem. Before we dive into analyzing data, let's take a look at a fundamental problem that firms face. It's an operations problem of how much to produce. To answer this question, we need to know, or estimate, the cost of the product, the price of the product, and some data on the demand for the product. Let's explore a problem to get started. Here's a fundamental problem in operations, I'll give you an example. Suppose that you're making operations decisions for a retailer who orders a product from a supplier and sells it to customers. The ordered product items are received and placed on a store shelf. Suppose there is a large customer population and each customer in the population may choose to buy or not buy the product. If the customer chooses to buy, he arrives at the store to buy the product. He buys it as long as it is available on the shelf. However, you have to order the product before you see the customer demand, since you have to have the items available on the shelf. Suppose that you get only one chance to order. That is, you cannot really change your purchase order after you make your decision. Let's take a look at the costs in that operations problem. You order the product from the supplier at some cost, 3 talers per item.
Talers are just some currency units that we're going to use for this example. After your order is received and placed on the shelves, some demand occurs. The product on the shelf sells at a price of 12 talers an item. All unsold items at the end of the season, or at the end of the day, are salvaged. You get no money from salvaging. That is, the salvage value is zero talers per item. You lose all the money if you buy an item and don't sell it. Now let's look at a timeline of events to understand the problem better. Here's the timeline of events. This is what happens in every period. You submit an order to your supplier and the cost of purchase is 3 talers an item. You receive all the ordered items. So, whatever you order, you will receive it. You receive these items almost immediately. A very small time window passes before you receive them. And you store them and you shelve them immediately. Once you shelve them, there's some uncertain demand. Customers choose to come to the store and as they come, they see the item on the shelf and they buy it as long as it's available. Let's say the selling price of the item is 12 talers an item. And the key factor here is, the demand is uncertain. So you really do not know exactly how many customers are going to turn up at the store and buy an item. If you sell all the items that you've bought, good. But sometimes, if you have leftover items that remain unsold, they have to be salvaged. In this example, we're going to assume the salvage value is zero talers an item. So whatever's left is given away and you lose all the money you paid for it. That's it. The problem ends, and then in the next period you have to make an order and meet the demand in that period, and so on and so forth. Let's focus on an important point I mentioned before, the demand is uncertain. That is, demand could be anything. It could be high, low, and so on. Suppose you bought ten items. And let's look at a high demand scenario. Let's suppose that the demand is 100.
You will sell all 10 items. Even though the demand was 100, you could sell only 10 because that's all you had. You sell all 10 items and make a profit on all those 10 items: 10 times (12 minus 3). You bought each item at 3 and you sold it at 12, so 10 times (12 minus 3) is 90 talers. That's your profit. Even though your demand was 100, you sold only 10 and that's the profit you made. There is also a low demand scenario. Suppose there is no demand. In this case, you bought the items and the demand is nothing. So you sell nothing and lose all the money you spent buying those items, because you bought the ten items at 3 talers each. And therefore, you lose 30 talers. This is because there is no salvage value for items that are left over. With this timeline in mind, let's recap the problem. You don't know what the demand is going to be and you have to decide on the number of units to order from the supplier before seeing the customer demand. In this case, what could help? Past demand could be helpful. Fortunately, we have demand data from the past 100 periods. Here's the past demand information. In the graph, you see the demands that were observed in the past 100 periods. As you can notice, there is a lot of variability in demand. In the first period in the observations, you see the demand was 29. In the last period in the observations, you see the demand was 41. Let's understand the demand patterns more. Here's some more information about the past demand data. From the observations over the past 100 such periods, you see that the maximum demand observed was 81 and the minimum demand observed was 15. You can even calculate the arithmetic average of those 100 observations, and that is 52.8. Based on the data, I'm going to ask you to go through an exercise, an exercise on deciding how much to order. Before you make a decision, let's go through the following points.
First, there is no penalty for a wrong answer. Or conversely, no extra course credit for the right answer. It's honor-based, but you get one attempt at making your decision. The objective of the exercise is not to test you or to grade you, but to set an initial baseline for how to think about these problems as we start the course. So, I ask you the following. Write down your answer on a sheet of paper or a Post-it note and keep the sheet or the note throughout the course. We will see the best answer in the course and you will then get a chance to compare your answers, and calibrate your learning progress. How much would you order? That is the question. Suppose you're a manager contemplating the question of how many items to order from the supplier. Choose the quantity Q that you will order. Once you select Q, the market will produce 50 random demand instances from a distribution of demand similar to the figure I showed you. Each random demand instance will correspond to a demand value you may face in the coming selling season. Your objective is to select Q to maximize the total profit that you will earn when faced with these 50 random demand values. Now, take a moment and write down your answer on a sheet of paper or a Post-it note. Once you have written down the answer, we are ready to move to the next slide. The problem you just saw is called a newsvendor problem. Its characteristics are the following. You have an objective, usually to maximize profit, minimize cost, improve market share, etc. You have to make one decision, usually how much to buy or how much to plan for. And this happens before you see the future demand. Then demand occurs, and profits and costs are realized. This is called a newsvendor problem because it is similar to the problem of a vendor who sells newspapers. You buy too much and you may be left with unsold newspapers, or you buy too little and you forgo revenue opportunity.
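The profit calculation from the ordering example (purchase cost of 3 talers, selling price of 12 talers, zero salvage value) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `newsvendor_profit` and `total_profit` and their parameters are assumptions made here, not something from the course material.

```python
def newsvendor_profit(order_qty, demand, price=12.0, cost=3.0, salvage=0.0):
    """Profit for one selling period in the lecture's ordering example."""
    units_sold = min(order_qty, demand)   # cannot sell more than is on the shelf
    leftover = order_qty - units_sold     # unsold items are salvaged
    return price * units_sold + salvage * leftover - cost * order_qty


def total_profit(order_qty, demands):
    """Total profit when the same order quantity faces several demand values."""
    return sum(newsvendor_profit(order_qty, d) for d in demands)


# The two scenarios from the lecture, with an order of 10 items.
high = newsvendor_profit(10, 100)   # 10 * (12 - 3) = 90 talers
low = newsvendor_profit(10, 0)      # lose 10 * 3 = 30 talers
print(high, low)                    # 90.0 -30.0
```

Passing a list of demand values to `total_profit` gives the quantity the exercise asks you to maximize: the total profit of one order quantity Q across many random demand instances.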
In this course, we will show you how to think about this problem and how to analyze this problem. Now, I'm going to show you an application of the newsvendor problem at Time Magazine. In the Time Magazine supply chain, they had the following problem. The stores were either selling out their inventories, which means they had too little inventory, or they sold only a small fraction of their allocation, which means they had too much inventory. So, this is a newsvendor problem. For every issue they printed, Time Magazine evaluated and adjusted the following. One, the national print order, which is the total number of copies printed and shipped. Two, the wholesale allotment structure, which is how those copies were allocated to different wholesalers. Three, the store distribution, which is the final distribution of the magazines to the stores. Note that the above three decisions are made before the actual demand for the weekly issue is realized. Therefore, they need to analyze past data and they have to be able to forecast future demand. In fact, Time Magazine reports saving about $3.5 million annually from tackling the newsvendor problem. This story is captured in a paper by Koschat in Interfaces magazine in the year 2003, volume 33. Other than the Time Magazine newsvendor problem, let's look at some broad applications of the newsvendor problem. Here are some of them. Every year, governments order flu vaccines before the flu season begins and they make this decision before the extent or the nature of the flu strain is known. One question is how many vaccines to order. This is a newsvendor problem, because you have to make your decision before the demand is known. Smartphone users buy mobile data plans before they know their actual future usage. In this case, what's the right plan for you? This is a newsvendor problem again, because you have to make a decision before future demand is known.
Consumers buy healthcare insurance plans before they know their actual health expenditures. Again, how should they think about the right plan? This is also a newsvendor problem. For all the above examples that we saw, some forecast of future demand is essential. Let's think about how to do this. It is essential to forecast future demand. So let's understand what forecasting is all about. What's forecasting? The primary function of forecasting is to predict the future. Why are we interested in predicting the future? Because it dictates the kind of decisions we make today. If we know something about the future, we can make better decisions today. Who uses forecasting in their jobs? A lot of jobs use forecasting. Generally speaking, we forecast demand for products, we forecast demand for services, we forecast inventory needs, we forecast capacity needs daily, and so on and so forth. But what makes a good forecast? First, a forecast should be timely, it should be reliable, and it should be as accurate as possible. It should be in meaningful units, and the forecasting method should be easy to use and be understood in practice. Let's look at characteristics of forecasts. Point forecasts are usually wrong. In fact, this is the first rule of forecasting. Why? Let me give you a couple of examples. I forecast that in December 2015 there will be 37 centimeters of snow. I forecast we will sell 314 umbrellas during the rains next week. Very likely these forecasts will turn out to be wrong. For example, you could have 37.5 centimeters of snow in December. Or you could sell 317 umbrellas during the rains next week. Or you could deviate even more. This happens because the demand is a random variable, so it could deviate from your forecast. Therefore a good forecast should be more than a single number. Typically we provide the mean and the standard deviation. You could also provide a range, high and low for example.
For example, TV weather forecasts provide the high temperature tomorrow and the low temperature tomorrow. That's a way of providing more than a single number. We have to think about modeling an uncertain future. Usually, we can model the future through probability distributions. Let's think about this further. We often do not control purchasing behavior, and as a result we cannot predict future demand with certainty. So, how do we describe uncertain future demand? We can try to decide what future demand scenarios are possible, and for each scenario, estimate the likelihood of its realization. So, where do scenarios come from? They could come from your past data or they could come from expert estimates. Let's look at an example of a model of future demand. Let's start by looking at a small number of scenarios. Let's say, three scenarios. Let's call them the "high demand" scenario, the "ordinary demand" scenario and the "low demand" scenario. Let's say that the "high demand" scenario corresponds to a demand value of 80, the "ordinary demand" scenario to a value of 50, and the "low demand" scenario to a value of 20. For each scenario, the likelihood of its occurring must be estimated. In our example of a model of future demand, we have to estimate how likely each scenario is. Where do the estimates of the likelihood come from? They come from statistical analysis of past data. Suppose that after analyzing the past data, and using some subjective inputs, we estimated that the scenarios have the following likelihoods of being realized in the next selling season. The likelihood of high demand is 20%. The likelihood of normal or ordinary demand is 70%, and the likelihood of low demand is 10%. In our scenario analysis, we project that the demand is not equal to a certain number with probability 1, but rather can take any of the 3 values with the corresponding probabilities. In essence, we have just created a probability distribution for the future demand. Demand could be 80 with probability P1 = 0.2.
Demand could be 50 with probability P2 = 0.7. Demand could be 20 with probability P3 = 0.1. Probability distributions like the one I just described are given by a number of distinct scenarios, each with an attached probability. Such probability distributions are called discrete probability distributions. Finally, note that the probabilities are all greater than zero, 0.2, 0.7 and 0.1, and they all add up to 1, that is, 0.2 plus 0.7 plus 0.1 is equal to 1. In this slide, I show you what the probability distribution looks like. The scenarios, 20, 50, and 80, and their corresponding probabilities are shown. Probability distributions are typically described by their mean and standard deviation. For any probability distribution, even a simple one reflecting the three demand scenarios which we just saw, two useful descriptive quantities are often calculated: the mean, which is also known as the expected value, and the standard deviation. Let's try and describe them. For a discrete probability distribution, the mean is defined as the sum of the products of the scenario values and their probabilities. For our demand distribution, the mean, represented by D bar, will be P1 * D1 + P2 * D2 + P3 * D3, or 0.2 * 80 + 0.7 * 50 + 0.1 * 20 = 53. How do we interpret 53? A mean of 53 reflects the demand value that we will get on average in a selling season if we keep observing the demand realizations over an infinite number of selling seasons. In other words, if you keep drawing observations over infinitely many selling seasons, the average value of your observations should be 53. I showed you the distribution before. Now, on that graph, let me show you the mean. The red vertical line represents the mean, or the expected value, of the distribution. Now we will look at the standard deviation. The standard deviation describes, roughly speaking, how far away the actual random variable values are from the mean, on average.
In other words, in a colloquial sense, it describes how spread out your distribution is around its mean. How can we calculate the standard deviation? The standard deviation is defined as the square root of the sum of the following things: the products of the scenario probabilities with the squares of the differences between the scenario value and the mean value. Let me say that again. You take the scenario value and the mean value, take the difference, square the difference, and multiply it by the scenario probability. Do this for every scenario. Then add them up, and then take the square root. For example, for the three-scenario demand distribution we considered, the standard deviation is calculated as follows. Take the difference between the scenario demand and the average, square it, multiply it by the corresponding probability, and do this for every scenario. You get 0.2 times (80 minus 53) squared, plus 0.7 times (50 minus 53) squared, plus 0.1 times (20 minus 53) squared, which adds up to 261. Taking the square root, we get about 16.16, which is the standard deviation. I showed you the graph of the probability distribution, in green, before. I also showed you the mean, which is represented by the vertical red line, a mean of 53. Now, I show you roughly what the standard deviation looks like. The standard deviation is a measure of how spread out your probability distribution is around the mean. The knowledge of the mean and standard deviation values helps us to support a general intuition about the nature of the random variable. What if we have more than three scenarios? It is fairly straightforward to do this, so let's think about the following n scenarios. Demand D1 with probability P1, demand D2 with probability P2, demand D3 with probability P3, and so on, up to demand Dn with probability Pn.
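The mean and standard deviation of a scenario-based demand distribution can be computed with a short Python sketch. The function name `mean_and_std` is an assumption made for illustration; the scenario values 80, 50, 20 and probabilities 0.2, 0.7, 0.1 are the ones from the lecture. The function is written to accept any number of scenarios, not just three.

```python
import math


def mean_and_std(values, probs):
    """Mean and standard deviation of a discrete probability distribution."""
    # Probabilities must be positive and add up to 1 (within rounding error).
    if any(p <= 0 for p in probs) or not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must be positive and sum to 1")
    # Mean: sum of scenario value times scenario probability.
    mean = sum(p * d for d, p in zip(values, probs))
    # Variance: probability-weighted squared deviations from the mean.
    variance = sum(p * (d - mean) ** 2 for d, p in zip(values, probs))
    return mean, math.sqrt(variance)


demand = [80, 50, 20]    # high, ordinary, low demand scenarios
prob = [0.2, 0.7, 0.1]   # their estimated likelihoods
mean, std = mean_and_std(demand, prob)
print(round(mean, 2), round(std, 2))  # 53.0 16.16
```

Evaluated directly, the weighted squared deviations add up to 261, so the standard deviation is the square root of 261, roughly 16.16.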
Now, all these probabilities are positive, and they all add up to 1, that is, P1 + P2 + P3 and so on and so forth, up to Pn, equals 1. Now, how do we calculate the mean and standard deviation of this demand distribution with the n scenarios? It's again a straightforward extension of the three-scenario case. For the mean, or the expected value, D bar, we calculate P1 times D1 + P2 times D2 + P3 times D3, and so on, up to Pn times Dn. For the standard deviation, we calculate the difference between the scenario and the average, (D1 minus D bar), square it, and multiply it by the corresponding probability P1. Do the same for the second scenario, P2 times (D2 minus D bar) squared, and so on and so forth up to the last scenario, Pn times (Dn minus D bar) squared. Once you have the sum of all these terms, take the square root of the entire sum and that gives you the standard deviation. So far, we have looked at a discrete probability distribution with a number of future scenarios, each scenario with some attached probability. But what will happen to the discrete probability picture when the random variable being modeled has a really large number of scenarios on any small interval of values, and the probability that any one scenario is realized is really small? Think of examples such as stock prices or the amount of rainfall in a region. For example, there are very many possibilities, and very many scenarios, of rainfall being between 37 centimeters and 39 centimeters. And the probability of the rainfall being exactly in one scenario, let's say 37.1 centimeters, is really low. In cases like this, it makes sense to describe such probability distributions using groups of scenarios rather than focusing on each individual scenario. Distributions like this are called continuous distributions. In the picture below, I show you the distribution of a random variable X. The values of X are on the X axis and the corresponding probability densities are on the Y axis.
In the case of continuous distributions, we're going to look at groups of scenarios rather than a single scenario. Again, the light green area shows you the probability that the random variable X takes values between a minimum of x1 and a maximum of x2. And the area under the entire curve is equal to 1. If I ask you what's the probability that the random variable X takes any value between the lowest possible point and the highest possible point, the probability must be equal to 1. And therefore, the area under the entire curve is equal to 1. One of the most popular examples of a continuous probability distribution is the normal distribution. The normal distribution allows for the random variable X to take any value from negative infinity to positive infinity, as you see in the graph. The nice thing about the normal distribution is that it is completely characterized by two parameters, the mean mu and the standard deviation sigma. The normal distribution looks like a bell curve and it's likely the most commonly encountered distribution. There exist statistical formulas (also implemented in Excel) that calculate the probability that a normal random variable with a given mean and a given standard deviation will produce a value in the interval between X minimum and X maximum. And this can be calculated in Excel. Other than the normal distribution, there exist a large number of other popular continuous probability distributions, such as the exponential distribution, the beta distribution, etc., with easily computable mean and standard deviation or variance. Each of those distributions is often used to describe a specific uncertain setting or quantity. For example, the normal distribution is often used to describe the distribution of future percentage changes in the values of stocks or in FX rates.
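The interval probability for a normal random variable, which the lecture computes in Excel, can also be sketched in Python using the standard library's `statistics.NormalDist`. The helper name `prob_between` and the parameter values below (mean 50, standard deviation 10) are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def prob_between(mean, std, x_min, x_max):
    """P(x_min <= X <= x_max) for a normal X, via the cumulative distribution."""
    dist = NormalDist(mu=mean, sigma=std)
    # Area under the bell curve between x_min and x_max.
    return dist.cdf(x_max) - dist.cdf(x_min)


# Hypothetical parameters: probability of landing within one standard
# deviation of the mean.
p = prob_between(mean=50.0, std=10.0, x_min=40.0, x_max=60.0)
print(round(p, 4))  # 0.6827
```

The result is the area under the density curve between the two endpoints, which is the same quantity as the light green area described for a general continuous distribution.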
As another example, the exponential distribution can be used to characterize the time between successive arrivals of customers in a service system, such as a call center. Let's return to the characteristics of forecasts. Point forecasts are usually wrong. Why? This is because the demand is a random variable. In the past few slides, we've been looking at how to describe demand distributions and how to characterize random variables. Using that information, a good forecast should be more than a single number. Forecasts should include some distribution information, typically the mean and standard deviation. It is also worth remembering that aggregate forecasts are usually more accurate. And the accuracy of forecasts erodes as we go further into the future. Therefore, long-term forecasts are less accurate than short-term forecasts. Finally, don't exclude known information from your forecasting process unless you have a good reason to do so. Let's examine some Subjective Forecasting Methods. First, Composites. Composites are a way of aggregating forecasts that come from different locations, or different people, or different geographies. For example, there are Sales Force Composites. Sales Force Composites are formed by aggregating sales personnel's estimates of demand. There are Election Polling Composites. There are websites that aggregate polling data and put them together. There are Customer Surveys, where customers give their subjective evaluations of different services, or of demand, and those are put together. There's the Jury of Executive Opinion, which collects informed opinions from executives and puts them together. Then finally, there is the Delphi Method. In it, the individual opinions are compiled and reconsidered. You compile and reconsider, and you repeat the process until a group consensus is hopefully reached. We'll return to Subjective Forecasting Methods near the end of week one, during our last session.
Let's look at some Subjective Forecasting Methods. The methods we're going to look at are Composites. Composites could be an aggregation of data. For example, there are Sales Force Composites, where the estimates from sales personnel are collected and aggregated together. Sales personnel are commonly in touch with customers. They collect customer data and forecast information, and that's aggregated to come up with a Subjective Forecast. Similarly, Election Polling Composites exist, and these are done by websites that aggregate polling data collected from respondents. Similarly, there are Customer Surveys and the Jury of Executive Opinion, where a limited number of experts are put together in a single location, come up with forecasts, and then combine the forecasts together. Then there is the Delphi Method. In the Delphi Method, the individual opinions are compiled and reconsidered. The collected opinions are put together and examined to see whether there is a group consensus. If there is no consensus, then the individual opinions are recompiled and reconsidered. And this process is repeated until consensus is reached. There are many ways of doing Subjective Forecasting. We will return to Subjective Forecasting Methods at the end of week one in our last session, when we have limited demand data and see how we can do better. For now, we're going to focus on Objective Forecasting Methods. How do we forecast objectively using past data? We can leverage past data to come up with forecasts. The two primary methods that are used for forecasting are Causal Models and Time Series Methods. Causal Models are models that explain the outcome through causal analysis. Let's see what that means. Let D be the demand or future outcome that you need to predict, and assume that there are n variables, or n root causes, that will influence the demand. A Causal Model is one in which the demand D is formulated as a function of those n root causes.
As you can imagine, Causal Models are generally intricate and complex and therefore need advanced tools in addition to domain expertise. In this course, we will focus mainly on time series based models. What's a Time Series Method? A Time Series is just a collection of past values of the variable that's being predicted. In fact, it can be considered a naive method. The goal is to isolate patterns in the past data to come up with good predictions about the future. The past data might have characteristics such as trend, seasonality or cycles, or just randomness. We use these patterns to come up with descriptive statistics, which are then useful in coming up with a prediction or a forecast. And that's what we will do in the next few sessions. Continuing in the same vein, we're going to be doing forecasting by leveraging past historical data. In particular, I'm going to be looking at two methods, the Moving Averages method and Exponential Smoothing, which I will cover in the advanced slides. In this session, we saw one of the fundamental problems in operations, called the Newsvendor Problem. I showed you an example of the Newsvendor Problem in which you have to make an operational decision under uncertainty. I also showed you an application of the Newsvendor Problem at a well-known magazine firm. I've emphasized the importance of making good decisions in the face of uncertainty. In this course, we hope to guide you to making better decisions. To make better decisions, first we need to be able to describe the uncertainty in the data we collect. And we also need to use these data to forecast future events. These are exciting concepts. You'll see more of these concepts in the next session.