Let's talk a little now about Bayes' theorem. Let A and B be two events for which the probability of B is non-zero. Then the probability of A given B — and we'll use this notation throughout the course; the vertical line denotes a conditional probability, so P(A | B) is the probability of A given that B has occurred — is equal to the probability of A intersection B divided by the probability of B: P(A | B) = P(A ∩ B) / P(B). Alternatively, we can write the numerator, P(A ∩ B), as P(B | A) times P(A), which gives us another way to write Bayes' theorem. Finally, if we like, we can expand the denominator P(B) and write it as the sum over j of P(B | A_j) P(A_j), where the A_j's form a partition of the sample space. What do I mean by a partition? I mean the following: A_i ∩ A_j is the null set for i not equal to j, and at least one A_i must occur. In fact, because A_i ∩ A_j is empty for i not equal to j, I can replace that second condition with this one: exactly one A_i must occur. Okay, so that's Bayes' theorem. Let's look at an example where we toss two fair six-sided dice. Y_1 is the outcome of the first toss, Y_2 is the outcome of the second toss, and X = Y_1 + Y_2 is their sum; that's what we've plotted in the table here. For example, this 9 comes from a 5 on the first toss and a 4 on the second toss: 5 + 4 = 9. The question we're interested in answering is the following: what is the probability that Y_1 is greater than or equal to 4, given that X is greater than or equal to 8? We can answer this using the conditional probability formula from the previous slide.
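Before working through the two-dice example, here is a minimal sketch (not from the lecture) that checks the two forms of Bayes' theorem against each other on a single fair die, using the partition A_j = {j} for j = 1, ..., 6. The events A and B below are my own choices for illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die, all outcomes equally likely.
omega = [1, 2, 3, 4, 5, 6]

def prob(event):
    """P(event) under equal weights: (# favourable outcomes) / 6."""
    return Fraction(sum(1 for w in omega if w in event), len(omega))

def cond(X, Y):
    """P(X | Y) = P(X intersect Y) / P(Y)."""
    return prob(X & Y) / prob(Y)

A = {1, 2, 3}   # event: outcome is at most 3 (illustrative choice)
B = {2, 4, 6}   # event: outcome is even (illustrative choice)

# Direct definition: P(A | B) = P(A ∩ B) / P(B).
direct = cond(A, B)

# Bayes with the expanded denominator over the partition A_j = {j}:
# P(A | B) = P(B | A) P(A) / sum_j P(B | A_j) P(A_j).
bayes = cond(B, A) * prob(A) / sum(cond(B, {j}) * Fraction(1, 6) for j in omega)

print(direct, bayes)   # both equal 1/3
```

Using exact fractions avoids any floating-point question about whether the two forms really agree.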
So this is equal to the probability that Y_1 is greater than or equal to 4 and X is greater than or equal to 8, divided by the probability that X is greater than or equal to 8. Okay, so how do we calculate these two quantities? Let's look at the numerator first. We need the intersection of two events: Y_1 ≥ 4 and X ≥ 8. The first event is captured inside this box here, because this corresponds to Y_1 being greater than or equal to 4; all of these outcomes correspond to that event. The event that X is greater than or equal to 8 corresponds to these outcomes here. So the intersection of the two events — Y_1 ≥ 4 and X ≥ 8 — is this area here, which is very light, so let me shade it a little darker. Now, each of these cells is equally probable and occurs with probability 1/36, and there are 3 + 4 + 5 = 12 cells in the intersection. So the numerator is 12/36. For the denominator, the probability that X is greater than or equal to 8 is what we highlighted in red here: the 12 cells plus these 3 additional outcomes, for 15 outcomes in total, so that's 15/36. The ratio is (12/36)/(15/36) = 12/15 = 4/5. So that's our application of Bayes' theorem. Okay. Now let me talk a little about continuous random variables. We say a continuous random variable X has a probability density function, or PDF, f if f(x) is greater than or equal to zero and, for all events A, the probability that X is in A — the probability that A has occurred — is the integral of the density over A: P(X ∈ A) = the integral over A of f(y) dy.
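The counting argument above can be checked by brute-force enumeration of all 36 outcomes; this short sketch (my own, not from the lecture) reproduces the 12, the 15, and the final 4/5.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (Y1, Y2) outcomes for two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Numerator cells: Y1 >= 4 and X = Y1 + Y2 >= 8.
num = sum(1 for y1, y2 in outcomes if y1 >= 4 and y1 + y2 >= 8)

# Denominator cells: X >= 8.
den = sum(1 for y1, y2 in outcomes if y1 + y2 >= 8)

answer = Fraction(num, den)   # the 1/36 factors cancel in the ratio
print(num, den, answer)       # 12 15 4/5
```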
The CDF (cumulative distribution function) and the PDF are related as follows: F(x) is the integral from minus infinity to x of f(y) dy. That's because, by definition, F(x) is the probability that X is less than or equal to x, which of course equals the probability that minus infinity < X ≤ x. So the event A here is {minus infinity < X ≤ x}, and applying the definition of the density to that event gives us the integral. It's often convenient to recognize the following: the probability that X lies in the little interval from x − ε/2 to x + ε/2 is the integral from x − ε/2 to x + ε/2 of f(y) dy. If we like, we can draw a picture: here is the density f, here is some point x, and here are x − ε/2 and x + ε/2. What we're saying is that the probability is this shaded area under the curve, which is roughly equal to f(x) times ε — the height of the density at x times the width of the interval. Of course, the approximation works much better as ε gets very small. Okay, so those are continuous random variables; let me talk briefly about the normal distribution. We say that X has a normal distribution, and write X ~ N(μ, σ²), if it has this density function: f(x) = 1/√(2πσ²) times the exponential of −(x − μ)² / (2σ²). The mean and variance are given by μ and σ² respectively. The normal is a very important distribution.
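The f(x)·ε approximation is easy to check numerically for a normal density, since its CDF has a standard closed form via the error function. A quick sketch (my own illustration, with an arbitrary choice of x and ε):

```python
import math

mu, sigma = 0.0, 1.0   # standard normal, for illustration

def pdf(x):
    """Normal density f(x) = exp(-(x-mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def cdf(x):
    """Normal CDF F(x), expressed via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

x, eps = 0.7, 1e-3
exact = cdf(x + eps / 2) - cdf(x - eps / 2)   # P(x - eps/2 < X < x + eps/2)
approx = pdf(x) * eps                          # f(x) times the interval width
print(exact, approx)                           # nearly identical for small eps
```

Shrinking eps makes the two numbers agree to more and more digits, which is exactly the statement that the shaded area is approximately f(x)·ε.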
In practice, its mean is μ, its mode — the highest point of the density — is also at μ, and approximately 95 percent of the probability lies within plus or minus two standard deviations of the mean: P(μ − 2σ ≤ X ≤ μ + 2σ) ≈ 0.95 for a normal distribution. Okay. So this is a very famous distribution, and it arises an awful lot in finance. It certainly has its weaknesses, and we'll discuss some of them later in the course. A related distribution is the log-normal distribution: we write that X has a log-normal distribution with parameters μ and σ² if log(X) is normally distributed with mean μ and variance σ². The mean and variance of the log-normal distribution are given by these two quantities here: E[X] = exp(μ + σ²/2) and Var(X) = (exp(σ²) − 1) exp(2μ + σ²). Again, the log-normal distribution plays a very important role in financial applications.
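Both facts can be sanity-checked numerically. The sketch below (my own, with illustrative parameter values) computes the exact two-standard-deviation coverage via the error function, and compares the closed-form log-normal mean against a quick Monte Carlo estimate built by exponentiating normal draws.

```python
import math
import random

# Probability within two standard deviations of the mean, for any normal:
# P(mu - 2 sigma <= X <= mu + 2 sigma) = erf(2 / sqrt(2)) ≈ 0.9545.
within_2sd = math.erf(2 / math.sqrt(2))
print(round(within_2sd, 4))        # the "approximately 95%" rule

# Log-normal: if log(X) ~ N(mu, sigma^2), the closed-form moments are
# E[X] = exp(mu + sigma^2 / 2), Var(X) = (exp(sigma^2) - 1) exp(2 mu + sigma^2).
mu, sigma = 0.1, 0.4               # illustrative parameters
mean_closed = math.exp(mu + sigma ** 2 / 2)
var_closed = (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)

# Monte Carlo sanity check of the mean: exponentiate normal samples.
random.seed(0)
samples = [math.exp(random.gauss(mu, sigma)) for _ in range(200_000)]
mean_mc = sum(samples) / len(samples)
print(mean_closed, mean_mc)        # the two means should agree closely
```

The 0.9545 figure shows why "approximately 95 percent" is the usual rule of thumb; an exact 95 percent would use 1.96 standard deviations rather than 2.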