Welcome to the last module of An Intuitive Introduction to Probability: Decision Making in an Uncertain World. This last module covers the normal distribution, perhaps the most famous and most important probability distribution in everyday applications. However, before we can really talk about the normal distribution and the famous bell curve, we have to talk about the concept of a continuous random variable and a continuous probability distribution. And that's what this first lecture of this module is all about. Let's dive straight in with the definition. A continuous random variable X can take on a continuum of possible values. Some people would say an uncountably infinite number of values. For example, many variables in everyday life are not discrete; they don't jump from value to value, as when you roll a fair die: one, two, three, four, five, six. Or in a casino, at a roulette table: zero, one, two, three, four, five and so on. Instead, they take on continuous values. Here are some examples: the time to finish a task, or, if I produce an item, the length of the item, the thickness, the width. Now you may say, wait a minute. But time, I count in hours, in minutes, in seconds. Yeah, but wait a minute. This is only our human limitation in measuring time. We can think then of hundredths of a second, thousandths of a second, microseconds, nanoseconds. Time doesn't jump, time goes continuously. If we think of it as discrete, it's only because of our limitation of measurement. So, continuous variables do exist. In addition, believe it or not, sometimes it's easier to think of a variable as being continuous than as having discrete values. Stock prices are certainly discrete. We go in cents, or here in Switzerland in rappen. Nevertheless, it's sometimes easier to model these random processes with a continuous random variable. So, continuous, that's really a tricky concept. 
And now I want to spend the rest of this lecture thinking about what a continuous variable is. What does this really mean? So, let's think of a random variable that can take on any real number between zero and one. And then I can ask a question, and let's start with a little in-class question. What is the probability that this random variable will take on exactly the number 0.123456789? Think about this, what is this probability? Before I give you the answer to this in-class question, I want you to have a look at a little spreadsheet that I prepared. Because Excel has this beautiful random number function, RAND, that allows us to simulate the random variable on the interval from zero to one. Let's do that. So, here now, I prepared a little spreadsheet for you using the random number function in Excel. Here, in every version of Excel, you have this beautiful function RAND(). This particular function gives you a random number between zero and one. And so, now look at these numbers. They are all different. Now, what do you think is the chance I get a 0.123456789? Let me try this again. We have these random numbers, I click Enter, we get ten new numbers. Look at this, they change all the time. However, if you want to bet on a particular number, this is hopeless. If you think there's any chance that 0.5, 0.55 or, from the in-class quiz question, 0.123456789 shows up, it's not going to happen. The probability is zero. As a little aside, for the techies among you, here in Excel, of course, these are not truly random numbers. There are limitations to the computer. They're only so-called pseudorandom numbers. And I only get a limited number of digits here, so technically, we only have finitely many numbers. However, this is meant as a representation of the true continuum. And in that sense, the probability of every number is zero, and we cannot hit any number that I give you ahead of time. 
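For those of you who would rather try this outside of Excel, here is a minimal sketch of the same experiment in Python. It assumes that Python's random.random() is an acceptable stand-in for RAND(); the function name count_exact_hits, the seed, and the number of draws are hypothetical choices made only for this illustration.

```python
import random


def count_exact_hits(target, draws, seed=0):
    """Count how many pseudorandom draws from [0, 1) equal `target` exactly."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    return sum(1 for _ in range(draws) if rng.random() == target)


# The number from the in-class question, bet on ahead of time.
hits = count_exact_hits(0.123456789, 1_000_000)
print("exact hits in one million draws:", hits)
```

Just as with the spreadsheet, the computer only has finitely many numbers to choose from, so an exact hit is not strictly impossible here. There are so many of them, though, that in practice you should expect to see zero hits.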
So, let's now wrap up this idea, move back to the slides, and continue with continuous random variables. There's really an infinite number of numbers between zero and one, and we cannot count them. If you think you can count them, give me the number that comes after zero. If you say it's 0.001, I say no. There's another number that comes first, 0.000001. And even then there are many numbers, infinitely many, that come before that. Similarly, you can't give me the next number after 0.5 or give me the last number before 0.5. So, we see this continuum doesn't allow us to give positive probabilities to individual numbers, because there are so many numbers that the total sum of probabilities, in the end, would be larger than one, and we know that cannot be. Think back to the axioms we had way back in the beginning of the course. So, the probability of every individual number must be zero, or, said differently, no individual number can have a positive probability. This is so key that I made a little theorem out of it here. Every continuous random variable in the universe has the property that the probability of this random variable taking on any particular value x is zero. So, we can no longer talk about individual probabilities, that's just nonsensical. So, what can we still do at this point? Let's think about another in-class question here. What is the probability that this random variable that I showed you in the Excel sheet comes up with a number between zero and a quarter? Think about that. So, between zero and a quarter, there are also infinitely many numbers. And if you play with the random number sheet, with those random numbers I showed you, you will notice that in about 25% of all cases, you get a number between zero and a quarter. Similarly, between a quarter and a half, 25% of the time you fall into that interval. And a quarter of all numbers are between 0.5 and 0.75, and similarly, 0.75 to one. This looks awfully uniform. 
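If you want to check the 25% claim yourself without the spreadsheet, a small simulation along the following lines should do. This is only a sketch: random.random() again stands in for RAND(), and the helper name interval_fraction, the seed, and the sample size are assumptions made for this example.

```python
import random


def interval_fraction(lo, hi, draws=100_000, seed=1):
    """Fraction of pseudorandom draws from [0, 1) that land in [lo, hi)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    inside = sum(1 for _ in range(draws) if lo <= rng.random() < hi)
    return inside / draws


# The four quarters of the unit interval from the lecture.
quarters = [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)]
fractions = [interval_fraction(lo, hi) for lo, hi in quarters]

for (lo, hi), frac in zip(quarters, fractions):
    print(f"between {lo} and {hi}: {frac:.3f}")
```

Each printed fraction should come out close to 0.25, though not exactly 0.25, because a simulation only approximates the true probability.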
And that's the name of this particular distribution that we simulated in the Excel spreadsheet. It's a uniform distribution on the interval [0, 1]. And there's nothing special about the width of the interval being a quarter. Here, I show you intervals of length 0.1: between 0 and 0.1, 0.1 and 0.2, all the way to 0.9 and 1, they all show up with 10% probability. And there's nothing special about them all sitting next to each other. If you think about how likely it is to get a number between 0.18 and 0.28, the probability is also 0.1. So, we have a uniform distribution here, and it doesn't make sense to talk about probabilities of points. But as soon as we have a little range within our larger range, we can talk about positive probabilities. How can you represent that? And this is now the big change from discrete random variables to continuous random variables. With discrete random variables, we have single numbers, and they have probabilities. We can't do that here. Here, we represent probabilities through areas underneath a curve. Let's look here at this illustration. Any number between zero and one is possible. Each has probability zero, but we get a number between zero and one. We don't get negative numbers. We don't get numbers larger than one. Now, look at this rectangle here. It's a square, and the whole area is really one. It has a width of 1, it has a height of 1, so the whole area is 1 times 1. Now, here we look at the probability between 0.18 and 0.28. This has a width of 0.1 and a height of 1. So this has an area of 0.1, within the whole area of 1. So, this is now 10% of the whole area. And this is how this green rectangle within the larger box represents the probability, and this is how we do it in general. As a quick aside, for those of you who remember integration from high school, there, we also look at areas underneath a curve. Those are integrals. 
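For the techies, the green rectangle can be written as a short calculation. The symbol f for the height of the curve is notation introduced here for illustration and is not taken from the slides; for the uniform distribution on the interval from zero to one, f is 1 between 0 and 1 and 0 everywhere else.

```latex
\[
  f(x) =
  \begin{cases}
    1, & 0 \le x \le 1, \\
    0, & \text{otherwise,}
  \end{cases}
  \qquad
  P(0.18 \le X \le 0.28)
  = \int_{0.18}^{0.28} f(x)\,dx
  = (0.28 - 0.18) \times 1
  = 0.1 .
\]
```

The width times the height of the rectangle is exactly what the integral computes, and the whole square gives the integral of f from 0 to 1, which is 1.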
And if you look at an integral from a point a to the same point a, that integral, for any function, is always equal to zero. So, here, the representation of probabilities as areas underneath curves corresponds to what you learned in high school about integration. That's just a little aside for those of you who know this. Now, this function that we need to describe the curve is called a probability density function, or PDF for short. It has the following properties. This function needs to be greater than or equal to zero everywhere. It cannot take negative values. It's either equal to zero or positive. The entire area underneath the curve has to be one. Why? Think back to the very beginning, the basic rules of probability. The probability of the sample space, everything that's possible, always must be one. The same thing is true here. In technical terms, it means the integral over the entire range has to be equal to one; that's for the techies among you. You can just think of it as the total area under the graph of this curve being equal to one. Now, since it doesn't make sense to talk about individual probabilities, the only probabilities we can talk about are for areas, for larger ranges. And therefore, for continuous random variables, the key function is the cumulative distribution function, the CDF, typically denoted by F, which gives the probability that the random variable is at most a given value. And that allows us to then talk about probabilities. And then I can use this function now to also calculate the probabilities for fixed ranges, as a probability over an interval, as I showed you in the earlier examples. The probability that a continuous random variable falls between a lower number a and a larger number b is then just the difference of the cumulative distribution at b minus the cumulative distribution at a. And so here, I show you the technical calculation that goes along with this very intuitive picture. Finally, you can also draw the cumulative distribution function, which I did on this graph for you. 
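The interval rule just described, the cumulative distribution at b minus the cumulative distribution at a, can be sketched in a few lines of Python for the uniform distribution on the interval from zero to one. The function names uniform_cdf and interval_probability are hypothetical, chosen only for this illustration.

```python
def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]: the probability of a value at most x."""
    if x <= 0.0:
        return 0.0  # nothing lies below zero
    if x >= 1.0:
        return 1.0  # everything lies at or below one
    return x        # in between, the area to the left of x is x times a height of 1


def interval_probability(a, b):
    """Probability of landing between a and b, computed as F(b) - F(a)."""
    return uniform_cdf(b) - uniform_cdf(a)


# Rounded for display, because floating-point subtraction is not exact.
print(round(interval_probability(0.18, 0.28), 10))  # the green rectangle
print(round(interval_probability(0.0, 0.25), 10))   # the first quarter
print(round(interval_probability(0.5, 0.5), 10))    # a single point
```

The last call uses the same number for a and b, so it returns zero, which matches the fact that a single point has probability zero.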
And so, the cumulative distribution function starts out at zero, moves up to one, and then stays at one forever. In my experience, students appreciate this function less. They prefer graphs with areas underneath the density function. So, for the remainder of this course, I will also focus on that, areas underneath the density. So, to wrap up this very long, but very important lecture: when we talk about the normal distribution, we need to get a good feel for continuity, what it means for a distribution to be continuous. Therefore, we briefly talked about continuous random variables and then looked at the simplest continuous distribution, namely the uniform on [0, 1]. And the key takeaway that I need you to understand is the representation of probabilities as areas underneath a curve. That's a key concept which we will need to calculate probabilities for some cool applications as we go forward. So, please come back for more lectures. In the next lecture, we start with the normal distribution, and then you will see continuous distributions in action. Thanks, and see you soon.