In our previous exercise, we looked at using specific mass points to characterize different potential levels of demand. For example, we had said with 30% probability one level of demand is going to occur, with 50% another level of demand may occur, with 20% another level of demand may occur. When there are only a few possible outcomes, we can take that approach. But what about when there are many possible outcomes? What if we're trying to predict what sales are going to be? What if we're trying to predict what cost of goods might be? Well, that's where we can use probability distributions to characterize the extent of uncertainty that we have. So that's what we're going to talk about today. As far as the plan goes, we're going to talk about how we can use different probability distributions, some that you may be familiar with, some that might be new to you, to characterize the extent of randomness in the outcomes that we're interested in. So we'll take a look at how we can use these different distributions and where they're going to be appropriate. And of course, what are the Excel commands associated with these different distributions? And then, of course, how do we tie this back to decision making? We'll tee up another lab exercise where we have another aspect of the profit equation that is uncertain to us. In this case, we're going to use a probability distribution to characterize that uncertainty, incorporate it into our calculations, and use that to make the best business decision. All right, so going back to this framework that we had identified earlier, of course, we're going to build an evaluation model, and that's what we're using to make the best possible decision. We just need to make sure that we're incorporating uncertainty in an appropriate way into that evaluation model.
Right, so, here is one particular example where we have fluctuating levels of demand over time. And what we would like to do is come up with a way of predicting demand in the next period: what's my best guess? Well, one of the measures that we can use to summarize this data is going to be the average. Let me take the average of the historic data that's available to me. And that average might be over a small time period because I want to put more emphasis on the more recent data. But the problem is that's only my best guess. It doesn't tell me how much uncertainty I have in that guess. So in this case, if I were to look at it, it seems like demand tends to hover around 30 units. But sometimes it's as high as 60, sometimes it's as low as 5 units. So I might say, well, my best guess is 30, but how likely is that to happen? We also have to take into account that I might not get 30 units. I might get 31 units, I might get 32 units, I might get 47 units of demand. Well, what we want to be able to do is say, here is my best guess, and here's how much uncertainty there is around it, so that when we're making our decision, we take that into account. We take into account that our prediction of demand isn't necessarily going to be perfect. In fact, chances are we're not going to predict demand with 100% accuracy. So what's important for us is to recognize how much uncertainty there is. Is it a very tight distribution around a prediction of 30, or is it a prediction of 30 with a lot of uncertainty? So, a little bit later on, we're going to talk about ways to come up with that prediction for demand based on factors that we observe: time of year, marketing activity. But not only do we predict demand, we also predict how much error we might have in that prediction, and we incorporate both of those factors into our decision making.
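To make the "best guess plus uncertainty" idea concrete, here is a minimal sketch in Python. The demand history and the six-period window are invented for illustration; they are not the data from the lecture's chart.

```python
from statistics import mean, stdev

# Hypothetical demand history (units per period), invented for illustration.
# It hovers around 30, with a low of 5 and a high of 60.
demand_history = [28, 35, 5, 42, 30, 60, 25, 31, 33, 18, 29, 38]

# Assumed window: emphasize only the six most recent periods.
window = 6
recent = demand_history[-window:]

best_guess = mean(recent)     # point prediction for the next period
uncertainty = stdev(recent)   # sample standard deviation around that guess

print(f"best guess: {best_guess:.1f} units, spread: {uncertainty:.1f} units")
```

A small spread would mean a tight distribution around the guess; a large one would mean the same guess with a lot more uncertainty.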
All right, so some terminology that we're going to be using, and this goes back to what we talked about last time: we're going to be thinking about outcomes in terms of random variables. So if I roll a pair of dice, the outcome that I get each time I roll those dice can be different. That's going to be my random variable. If I'm looking at sales, every day that I'm observing sales, that's a different possible outcome that I observe. Some days sales are going to be high, some days sales are going to be low, so we might treat sales as a random variable. We're going to build a classification for different types of random variables. If there are a fixed number of possible outcomes, as was the case in the exercise that we had gone through last time with our inventory planning, where there were only, let's say, three or four levels of demand, we'd refer to that as a discrete random variable. In a lot of cases, there are an infinite number of possible outcomes. Or the outcomes could be granular, as with sales, but spread over a very wide range. Well, if within a particular range it can take on any value, that's what we're going to refer to as a continuous random variable. And this classification is going to come in handy when we're thinking about what the appropriate probability distributions are for us to be using. When is it appropriate for me, for example, to use a normal distribution? When is it appropriate for me to use a binomial distribution versus a Poisson distribution, or an exponential distribution? If you're familiar with those different distributions, great. If not, a little bit later on we're going to take a look at each of them and talk about when they're appropriate and why we might prefer one versus the other.
Right, so, again, if we think about this in terms of risk and return, risk is characterizing that extent of uncertainty, where we have calculations such as the standard deviation and the variance. The return, that's our expected value or the average. So in these formulas that we're using, the Greek letter mu is often used to symbolize our expected value or the average. Well, the way that we would go about calculating that is, if I can enumerate all possible levels of the variable X, if I can list out all of these different levels, what I need to know is how likely I am to observe each of those different levels, and that's what this probability piece is saying. So if I'm rolling the dice, the possible outcomes range from 2, if I roll snake eyes, to 12, if I roll two 6s. Well, how likely is each of those to occur? So I'm going to weigh the number of the roll by how likely I am to observe that outcome. And I'm going to add all of those products together, and that's going to be how I calculate my expected outcome. Standard deviation, again, is a measure of dispersion. Well, that's going to be focused on, for each outcome, how far am I away from the average? How far am I away from that central tendency? And I don't care if I'm above or below it, I just care how far away I am. That's why we're going to be squaring that term and weighing that difference by the probability of a particular outcome. So, for both of these calculations, the expected value and the variance or the standard deviation, think of these as weighted averages. We're not just adding up all possible outcomes and dividing by the number of outcomes. We're weighing the average based on how likely a potential outcome is to be observed. And that's going to be important when we get into things like customer analytics, where we say some customers are worth a lot, some are worth a little. But how likely am I to get a customer that's worth a lot versus a customer that's worth a little?
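The dice example can be worked through as a probability-weighted sum. This is a minimal sketch in Python, assuming two fair six-sided dice; the variable names are chosen here for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Assumption: two fair six-sided dice, so all 36 ordered rolls are equally likely.
# Count how many of those rolls produce each total from 2 to 12.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
probs = {total: n / 36 for total, n in counts.items()}

# Expected value: each total weighted by how likely it is.
mu = sum(x * p for x, p in probs.items())

# Variance: squared distance from the mean, weighted by probability.
variance = sum((x - mu) ** 2 * p for x, p in probs.items())
sigma = sqrt(variance)

print(f"mu = {mu:.2f}, variance = {variance:.4f}, sigma = {sigma:.4f}")
```

Note that neither calculation divides by the number of outcomes; the probabilities do the weighting.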
We're going to have to take those probabilities into account. Right, so this is returning to the example we worked through in our previous lab, where we had said there are five specific levels of demand ranging from 100 all the way up to 300. And we have the probabilities associated with each of those levels of demand. Well, suppose I wanted to calculate my expected level of demand using that weighted sum approach. The first term in this equation says I've got a 30% chance of having a demand of 100. I have a 20% chance of a demand of 150. My chance of getting 200 units of demand is 30%. There's a 15% chance of 250 units and a 5% chance of getting 300 units. Adding all of those products together, my expected demand is going to be 172.5 units. Now, we can do these calculations by hand. Fortunately, if you have the data arranged in a tabular format such as this, there's an Excel command, SUMPRODUCT. There are two inputs into that SUMPRODUCT formula. First you're going to highlight the outcome column, so in this case you would highlight the column that has our demand values ranging from 100 to 300. Enter a comma and then highlight the column that corresponds to our probabilities, and that's going to give us that 172.5. So that's one way for us to get my average level of demand. The same approach would be taken if we wanted to calculate the variance or the standard deviation. So we said the average was 172.5 units. So what are we going to do to calculate the variance? We're going to calculate, for each level of demand, how far away we are from the average. So 100 - 172.5, that's how far away this particular observation is from the average. And we're going to square that and multiply it by the probability of that observation happening. We do that for each of the different levels of demand that we have. So 300 minus the 172.5 average, squared, only gets a weight of 5%. Adding up all of those weighted squared differences gives us a variance of just over 3,600.
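The same weighted sums can be checked outside Excel. Here is a minimal sketch in Python using the five demand levels and probabilities quoted above; the `sumproduct` helper is a name chosen here to mirror the Excel function, not part of any library.

```python
from math import sqrt

# Demand levels and probabilities from the lab example.
demand = [100, 150, 200, 250, 300]
prob = [0.30, 0.20, 0.30, 0.15, 0.05]

def sumproduct(values, weights):
    """Multiply the two columns element by element and add up the products."""
    return sum(v * w for v, w in zip(values, weights))

# Expected demand: probability-weighted average of the demand levels.
expected_demand = sumproduct(demand, prob)

# Variance: probability-weighted average of squared distances from the mean.
squared_gaps = [(d - expected_demand) ** 2 for d in demand]
variance = sumproduct(squared_gaps, prob)
std_dev = sqrt(variance)

print(f"expected demand = {expected_demand:.1f}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.1f}")
```

In a spreadsheet, the first calculation would look something like `=SUMPRODUCT(A2:A6, B2:B6)`, where the cell ranges are placeholders for wherever the demand and probability columns sit.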
Taking the square root of that gives us the standard deviation of about 60 units. So, using the variance, using the standard deviation, it's a way for us to characterize a degree of dispersion. Just how much dispersion do we have in the data? Now if we have, as a particular case, the normal distribution, the standard deviation has a very simple interpretation for us. If we think about a bell curve, it tells us how much of the likelihood is contained within one standard deviation, within two standard deviations, within three standard deviations: roughly 68%, 95%, and 99.7%. What you're going to see today is that the normal distribution, that's going to be the workhorse for us. In a lot of cases, using the normal distribution is going to be appropriate, it's familiar to us, and a lot of our intuition is based around that. But we're also going to look at alternative distributions, because that normal distribution isn't always going to be what we're observing in marketing data. All right, so let's take a look back at the exercise that we completed in inventory planning, which said our average is 172.5 units of demand. But if we take a look at our expected profit, it turns out that if we're only able to order in increments of 25 units, our best bet is to order only 150 calendars. And even if I could order exactly 173 calendars, I'd actually get a lower profit. So, in this case, it turns out I'm better off ordering fewer calendars. And that comes from the business context that we're looking at, because when I order calendars, if I don't sell them, I end up losing money. I can return them for some of what they cost me, but I end up losing money when I order too much. It turns out we're better off ordering fewer calendars than our expected level of demand. And so we don't just want to calculate my expected level of demand. We need to take into account the cost structure in the business problems that we're dealing with.
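To see how the cost structure can pull the best order below expected demand, here is a minimal sketch in Python. The demand table is the one from the lab, but the price, unit cost, and refund values are hypothetical, chosen only for illustration; the lab's actual cost figures are not restated in this session.

```python
# Demand levels and probabilities from the lab example.
demand = [100, 150, 200, 250, 300]
prob = [0.30, 0.20, 0.30, 0.15, 0.05]

# Hypothetical cost structure (NOT the lab's numbers): selling price,
# purchase cost, and refund per unsold calendar returned.
PRICE = 10.0
COST = 6.0
REFUND = 1.0

def profit(order, d):
    """Profit for one demand outcome: sales plus refunds minus purchase cost."""
    sold = min(order, d)
    unsold = order - sold
    return PRICE * sold + REFUND * unsold - COST * order

def expected_profit(order):
    """Probability-weighted average profit across all demand levels."""
    return sum(p * profit(order, d) for d, p in zip(demand, prob))

# Candidate order sizes in increments of 25 units.
orders = range(100, 301, 25)
best_order = max(orders, key=expected_profit)

for q in orders:
    print(f"order {q}: expected profit {expected_profit(q):.2f}")
print(f"best order: {best_order}")
```

Under these assumed numbers an unsold calendar loses more than a missed sale forgoes, so the expected profit peaks below the 172.5-unit average; a different cost structure could move the peak above it.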
So, the approach that we're generally going to take, and we'll do that in the exercise that we work through in this session, is to specify the profit equation. That's our evaluation model. If I knew, for example, the level of demand that I'm getting, what would profit be? Let's write out that profit equation. And then let's characterize the extent of uncertainty that we have using the appropriate distributions. And we're just going to conduct simulations that take into account how likely different levels of demand are. And what we may see is, as we did in this inventory planning exercise, I'm better off ordering fewer calendars under this particular cost structure. There may be other cases where I may be better off actually ordering more than the expected level of demand, or exactly the expected level of demand. So we don't just want to stop with calculating statistics. What we want to do is incorporate those into that evaluation model, and then use that tool that we've developed to identify the appropriate decision for us to make.
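The approach of writing the profit equation first and then simulating demand can be sketched in Python as follows. The demand table comes from the inventory planning exercise; the price, cost, and refund values, the seed, and the number of draws are all assumptions made for illustration.

```python
import random

# Demand levels and probabilities from the inventory planning exercise.
demand_levels = [100, 150, 200, 250, 300]
probabilities = [0.30, 0.20, 0.30, 0.15, 0.05]

# Hypothetical cost structure (illustrative assumptions, not the lab's values).
PRICE = 10.0
COST = 6.0
REFUND = 1.0

def profit(order, d):
    """Evaluation model: profit if we order `order` units and demand is `d`."""
    sold = min(order, d)
    return PRICE * sold + REFUND * (order - sold) - COST * order

# Draw simulated demand outcomes according to their probabilities.
rng = random.Random(42)  # fixed seed so the run is repeatable
simulated_demand = rng.choices(demand_levels, weights=probabilities, k=20_000)

def simulated_profit(order):
    """Average profit for this order size across all simulated outcomes."""
    total = sum(profit(order, d) for d in simulated_demand)
    return total / len(simulated_demand)

# Evaluate every candidate order against the same simulated outcomes.
candidate_orders = range(100, 301, 25)
best_order = max(candidate_orders, key=simulated_profit)
print(f"best order under simulation: {best_order}")
```

Only the line that draws `simulated_demand` depends on the distribution, so a continuous distribution (for instance draws from `rng.gauss`, rounded and floored at zero) could be swapped in without touching the profit equation.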