Hi, and welcome back. In this module, we're going to begin studying continuous random variables. We'll begin with their definition and some examples, and we'll also study the uniform random variable. In subsequent modules, we'll look at the exponential random variable and the normal, or Gaussian, random variable.

Let's begin with a definition. A random variable is called continuous if its values comprise either a single interval or possibly a union of disjoint intervals. For example, suppose you're studying the ecology of a lake; a random variable X could be the depth measured at randomly chosen locations. In that situation, X is going to take on values from zero up to the maximum depth of the lake. Or maybe you're studying a chemical reaction, and Y is the concentration level of a particular chemical in the solution. Another important example of a continuous random variable comes from customer service: you might let W be the time a customer waits for service. In all of these situations, and in fact with all continuous random variables, the probability that X equals any specific value is always zero. We're going to see why that's true over the next couple of slides.

I want to start with a discrete random variable. Suppose a train is scheduled to arrive at 1:00 PM. The train never arrives early, and it never arrives more than five minutes late. Let X be the minutes past the hour that it arrives, so X takes the values 0 through 5. This is a discrete random variable, and we have its probability mass function listed here. I've also put in a histogram. In this situation, the area of each rectangle also represents the probability of that value: for X equal to 2, the probability is 0.3, and the area of that rectangle is 0.3. Now suppose that instead of minutes, we measure in seconds. What happens to the histogram then? It gets much finer, like that. Suppose that instead of seconds, we measure in milliseconds.
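This refinement can be sketched numerically. In the Python sketch below, the only value taken from the lecture is P(X = 2) = 0.3; the rest of the probability mass function is hypothetical, chosen just so the probabilities sum to one.

```python
# Hypothetical PMF for the train's arrival time X (minutes past 1:00 PM).
# Only P(X = 2) = 0.3 comes from the lecture; the other values are made up.
pmf = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.20, 4: 0.15, 5: 0.05}
assert abs(sum(pmf.values()) - 1.0) < 1e-12   # a PMF must sum to 1

# With bars of width 1 (one minute), the area of each rectangle equals the
# probability of that value.  Now refine minutes -> seconds -> milliseconds,
# spreading each minute's probability evenly over thinner and thinner bars.
for splits in (1, 60, 60_000):        # minutes, seconds, milliseconds
    width = 1 / splits                # bar width, in minutes
    bar_prob = pmf[2] / splits        # probability carried by one thin bar
    height = bar_prob / width         # density = probability / width
    print(splits, height)             # the height stays at 0.3 each time
```

The probability of any single thin bar shrinks toward zero, which is exactly why P(X = x) = 0 for a continuous random variable, while the height (probability per unit width) stabilizes into the density curve f(x).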
Suppose we go down to nanoseconds; what happens to the histogram? It gets finer and finer, and in the limit, we get some curve, which we're going to call f of x.

Now I'm switching X from a discrete random variable to a continuous one: it can take any value in the interval from 0 to 5, maybe even a little less than zero or a little more than five. The probability that X is between 1.5 and 2.5 is the area under the curve y equals f of x over that interval; that's the integral from 1.5 to 2.5 of f of x, dx. The area under the curve y equals f of x corresponds to what was, for the discrete random variable, the area of the rectangles. We call y equals f of x the probability density function for the continuous random variable X. More generally, the probability that X is between a and b, for any a and b, is the integral from a to b of f of x, dx.

Probability density functions have certain properties. What do we expect those properties to be? The first thing we expect is that a probability density function is always greater than or equal to 0. If f were ever negative, then the integral from a to b of that negative density would give a negative probability, and that doesn't make sense; f has to be greater than or equal to 0. Also, the integral from minus infinity to infinity of f of x, dx has to equal 1: we have to accumulate all the probability between minus infinity and infinity. This is equivalent to the probability of the sample space being equal to 1. Furthermore, as we noted on the previous slide, the probability that X is between a and b is the area under the curve of f of x between a and b. Now it should be clear why the probability that X equals a is zero: it's the integral from a to a of f of x, dx, which equals 0, for every real number a.
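These properties can be checked numerically. Here's a minimal Python sketch using a midpoint Riemann sum and, as an illustrative density of my own choosing (not from the slides), the uniform density f(x) = 1/5 on [0, 5]:

```python
def f(x):
    # Illustrative density: uniform on [0, 5], so f(x) = 1/5 = 0.2 there.
    return 0.2 if 0 <= x <= 5 else 0.0

def integrate(g, a, b, n=100_000):
    # Midpoint Riemann sum approximating the integral of g from a to b.
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, -1, 6)        # all the probability: should be ~1
p_mid = integrate(f, 1.5, 2.5)     # P(1.5 <= X <= 2.5) = 0.2 for this density
p_point = integrate(f, 2, 2)       # P(X = 2): an integral over [2, 2] is 0
print(total, p_mid, p_point)
```

The last line is the point made above: integrating any density over the single point a gives exactly zero.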
This is a little bit of a paradox for continuous random variables, because when you actually measure a continuous random variable, you get a number, yet the probability that it equals any single number is 0.

Let's look at the cumulative distribution function. The definition is exactly the same as it was for discrete random variables: capital F of x is the probability that X is less than or equal to some value little x. We're accumulating all of the probability up to that value of x. For continuous random variables, that's the integral from minus infinity up to x of f of t, dt.

What do we expect to be true of cumulative distribution functions? The first thing is that a cumulative distribution function is always between 0 and 1. The second thing we expect is that the limit as x goes to minus infinity of F of x is 0: the random variable carries no probability down at minus infinity. Also, the limit as x goes to positive infinity is 1, which again says that once we've accumulated all the probability out to infinity, we have to have probability one. The third thing is that F prime of x is equal to f of x, and that's true by the fundamental theorem of calculus. The fourth thing — so I lied, there are four things — is that F of x is always increasing: as our little x gets bigger, we accumulate more probability, so the cumulative distribution function is an increasing function.

Now let's look at a uniform random variable. A random variable X has the uniform distribution on the interval from a to b if its density function is given by little f of x equals 1 over b minus a for all x between a and b, and 0 everywhere else. Our notation is that X has the distribution of a uniform random variable from a to b. Let's see what the graph looks like.
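Before looking at the graph, the four CDF properties above can be verified numerically. This Python sketch builds F by integrating an illustrative density (again my choice, the uniform density on [0, 5]) and checks each property in turn:

```python
def f(x):
    # Illustrative density (my choice): uniform on [0, 5].
    return 0.2 if 0 <= x <= 5 else 0.0

def F(x, n=50_000):
    # CDF by midpoint Riemann sum; f is zero below 0, so start the sum there.
    if x <= 0:
        return 0.0
    h = x / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h

assert 0.0 <= F(2.5) <= 1.0              # property 1: F stays in [0, 1]
assert F(-10) == 0.0                      # property 2: F -> 0 at minus infinity
assert abs(F(100) - 1.0) < 1e-6          # property 2: F -> 1 at plus infinity
assert F(1.0) <= F(4.0)                  # property 4: F is increasing
eps = 1e-6
deriv = (F(2 + eps) - F(2 - eps)) / (2 * eps)   # property 3: F'(2) = f(2)
print(F(2.5), deriv)                             # 0.5 and approximately 0.2
```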
Maybe a is here and b is here, and the height of this line is 1 over b minus a. The density function is zero out here on both sides: if x is less than a, the density is zero, and if x is greater than b, it's also zero.

What does the cumulative distribution function look like? Let's write this down. Capital F of x is the probability that X is less than or equal to x, which is the integral from minus infinity up to x of the density. Notice that the density is zero all the way up to a, so there's no contribution to the integral before a; for x between a and b, we just integrate 1 over b minus a, starting at a. When we integrate, what do we get? We get zero if x is less than a; we get x minus a over b minus a if x is between a and b; and we get one if x is bigger than b. Once x hits b, we've accumulated all the probability, so we have one. You can also see that from the formula: when x equals b, we get b minus a over b minus a, which is one. If we graph the cumulative distribution function, it's zero all the way up to a, then a straight line rising until it reaches one at b, and then it stays at one afterwards.

There are lots of uses for uniform random variables. Random number generators select numbers uniformly from a specific interval, usually the interval zero to one. Here's an example: suppose the diameter of aerosol particles in a particular application is uniformly distributed between two and six nanometers. Find the probability that a randomly measured particle has diameter greater than three nanometers. In this situation, X is uniform on two to six, and the density function is one-fourth for x between two and six, and zero everywhere else.
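The piecewise CDF we just derived can be coded directly. Here's a small Python sketch using the aerosol numbers a = 2, b = 6; the last line computes P(X > 3) as 1 minus F(3), the probability we work out next.

```python
def uniform_cdf(x, a, b):
    # F(x) = 0 for x < a, (x - a)/(b - a) on [a, b], and 1 for x > b.
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 2.0, 6.0                      # aerosol diameters: uniform on [2, 6] nm
print(uniform_cdf(1.0, a, b))        # 0.0: no probability below a
print(uniform_cdf(6.0, a, b))        # 1.0: all probability accumulated by b
print(1 - uniform_cdf(3.0, a, b))    # P(X > 3) = 1 - 1/4 = 0.75
```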
The probability that our random variable is bigger than or equal to three is the same as 1 minus the probability that X is less than three, and that probability is the integral from two to three of one-fourth dt, which is one-fourth. Alternatively, we can compute it directly as the integral from three to six of one-fourth dt. Either way, we end up with three-fourths.

We can also use a uniform random variable to model where a dart lands on a dartboard. Think of the x-axis here and wherever our dart ends up: the angle the dart makes with that axis is a random variable, and if we measure it in degrees, Y would be uniform on 0 to 360 degrees.

Now let's calculate the expected value and the variance for a continuous random variable X. First, recall the discrete case. If Y is discrete, the expected value of Y is the sum over all k of k times the probability that Y equals k, and the variance of Y is the sum over all possible k of k minus the mean of Y, quantity squared, times the probability that Y equals k. How does this change when X is continuous with probability density function f of x? A summation in the discrete situation becomes an integral in the continuous situation: we integrate from minus infinity to infinity, the k becomes an x, and the probability that Y equals k becomes f of x, dx. Likewise, the variance takes a similar form: the integral of x minus the mean, quantity squared, times the density f of x, dx.

Just as in the discrete case, we can expand this, and we get the integral from minus infinity to infinity of x squared minus 2 mu sub X times x plus mu sub X squared, all times f of x, dx. This becomes three separate integrals: the integral of x squared f of x, dx, minus 2 mu sub X times the integral of x f of x, dx, plus mu sub X squared times the integral from minus infinity to infinity of f of x, dx. The last integral is one, because we're integrating over all the probability; the middle integral is the expected value of X; and the first integral is the expected value of X squared, the second moment.
We end up with the same computational formula as before: the variance is the expected value of X squared, minus the expected value of X, quantity squared.

Now suppose we want to compute the expected value and variance for a uniform random variable on the interval from a to b. In this case, f of x is the same as before: 1 over b minus a for x between a and b, and zero everywhere else. The expected value of X is the integral from a to b of x times 1 over b minus a, dx. When we integrate, we get 1 over b minus a, times x squared over 2, evaluated from a to b. That's b squared minus a squared, over 2 times b minus a; you can factor the numerator, cancel the b minus a's, and we end up with b plus a, over 2. Think about this: it's exactly what we expect, since it's midway between a and b. We saw the same result for discrete uniform random variables.

What about the variance? First we compute the second moment: the integral from a to b of x squared times 1 over b minus a, dx. That gives 1 over b minus a, times x cubed over 3, evaluated from a to b, which is b cubed minus a cubed, over 3 times b minus a. Factoring the numerator and canceling the b minus a terms gives b squared plus ab plus a squared, divided by 3. That's the second moment. Putting it together, the variance is the second moment minus the mean squared: b squared plus ab plus a squared, over 3, minus b plus a over 2, quantity squared. A little bit of algebra and some rearranging shows that we end up with b minus a, quantity squared, over 12 as our variance.

In the next video, we'll study the exponential random variable and some examples related to it. See you.
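As a sanity check on these two formulas, here's a short Python sketch comparing the closed forms (a + b)/2 and (b − a) squared over 12 against a Monte Carlo estimate; the interval [2, 6] is chosen to match the earlier aerosol example.

```python
import random

def uniform_mean_var(a, b):
    # Closed forms derived above: mean (a + b)/2, variance (b - a)^2 / 12.
    return (a + b) / 2, (b - a) ** 2 / 12

a, b = 2.0, 6.0
mean, var = uniform_mean_var(a, b)      # 4.0 and 16/12 = 1.333...

random.seed(1)                          # fixed seed for reproducibility
xs = [random.uniform(a, b) for _ in range(200_000)]
m = sum(xs) / len(xs)                   # sample mean
v = sum((x - m) ** 2 for x in xs) / len(xs)   # sample variance
print(mean, var)
print(m, v)                             # should land close to the exact values
```

With 200,000 draws, the sample mean and variance agree with the closed forms to about two decimal places, which is the behavior the derivation predicts.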