We're about to start session two of the second week of our course. The subject of this session is correlation: a relation between random variables that can be used to reduce the risk associated with a combination of those variables. In session one, we looked at the scenario approach to modeling future uncertainty. We introduced this approach using an example involving a single random quantity: the daily percentage change, or return, on the price of a share of a hypothetical stock. In this session, we will continue to use the scenario approach, but our focus will be on scenarios involving multiple random variables. In particular, we will look at the notion of correlation between random variables.
Let's quickly remind ourselves what we talked about in the previous session. We used 20 historical returns on stock A to model the future realization of this return the next day. All possible values of the return were considered equally probable. For example, here's one scenario of the future return: the value of -0.024 percentage points, occurring with probability 1 over 20, or 5%. We were using 20 such possible values, and these 20 return values, along with the 20 attached probability values, form the complete set of possibilities for the random return R.
We then proceeded to reduce this relatively large set of numbers to two quantities that provide a broad-brush characterization of that random return. The first one was the expected value, which basically tells us where the return will be on average. We associated a notion of reward with this expected value, since we're likely to prefer higher expected return values, all other things being equal. The second quantity that we used to broadly describe the distribution of the return values was the standard deviation of the returns. The standard deviation informs us about how wide, roughly speaking, the distribution of returns is.
If a decision maker is averse to uncertainty of returns in general, then he or she may prefer the value of the standard deviation to be as small as possible. The standard deviation may serve as one of the risk measures associated with the distribution of returns. We've also discussed other possible choices for risk measures, such as the possibility of a loss. The key importance of reward and risk measures is that they can be used in uncertain settings as objectives or constraining factors in a search for the optimal solution.
Now, let's consider an example with two equally likely scenarios that describe simultaneous returns on two stocks, X and Y. We consider only two scenarios not because it is a realistic thing to do, but because it is easier to understand the mechanism of correlation on a small example. There's also a reason why we look at two stocks: the notion of correlation we're about to discuss describes the relation between two random variables. We can come up with scenarios like these by looking at historical returns on any number of stocks recorded on the same trading day.
So in scenario 1, stock X returns 0.4% and stock Y at the same time returns 0.3%. In scenario 2, stock X returns -0.2% and stock Y at the same time returns -0.1%. In other words, in this example, either scenario 1 occurs with probability 50% and both stocks make positive returns, or scenario 2 occurs with probability 50% and both stocks lose value. The fact that the scenarios here include simultaneous returns on two stocks helps us to understand whether the random returns on those stocks are more likely to, colloquially speaking, move in unison or move in opposite directions.
Let's first look at how these two stocks do in isolation, one by one. Let's calculate the expected value and the standard deviation for the return on stock X. For the expected value, we get 0.5 times 0.004 plus 0.5 times -0.002, and that's 0.001, or 0.1%.
And the standard deviation is the square root of 0.5 times the square of the difference between 0.004 and 0.001, plus 0.5 times the square of the difference between -0.002 and 0.001. The result is 0.003, or 0.3%. So stock X on average returns 0.1%, and the standard deviation of its return is 0.3%.
Let's do the same calculations for the return on stock Y. For the expected value, we get 0.5 times 0.003 plus 0.5 times -0.001, which is 0.001, or 0.1%. For the standard deviation, we get 0.002, or 0.2%. So stock Y on average returns 0.1%, and the standard deviation of its return is 0.2%. Stocks X and Y have identical expected returns, and stock X has the higher standard deviation of returns.
Now imagine a company putting $50,000 into each of the stocks today, for a total of $100,000. What is going to be the fate of those $100,000 tomorrow? In other words, how much profit will the company see tomorrow as a result? Let's go through those calculations scenario by scenario. If scenario 1 happens, the company will observe a profit of $50,000 times 0.004, or $200, from stock X, and $50,000 times 0.003, or $150, from stock Y. So the total profit in this scenario is $350. On the other hand, if scenario 2 happens, then the company will lose $50,000 times 0.002, or $100, on stock X, and it will also lose $50,000 times 0.001, or $50, on stock Y, so that the total loss in this scenario will be $150.
So the profit is random, and it can take two values: $350 and -$150, each with probability 50%. We can calculate the expected profit and the standard deviation of the profit. The expected profit is 0.5 times $350 plus 0.5 times -$150, and that's $100. And the standard deviation of the profit is the square root of 0.5 times the square of the difference between 350 and 100, plus 0.5 times the square of the difference between -150 and 100, so it is $250. Now let's bring a third stock, stock Z, into the mix.
Suppose that there is a stock, stock Z, such that its returns have the same values as the returns for stock Y, but they happen in different scenarios. In scenario 1, stock Y returns 0.3% and stock Z returns -0.1%. And in scenario 2, stock Y returns -0.1% and stock Z returns 0.3%. Again, assuming that such a stock actually exists is not realistic, but this example will help us to understand a concept of correlation that we can certainly use in any real setting.
As we did before for stocks X and Y, let us calculate the expected return and the standard deviation of the return for stock Z. These values for stock Z are the same as those for stock Y. This is not surprising, since we have two equally probable scenarios, and scenario 1 for stock Y is the same as scenario 2 for stock Z and vice versa.
Now what if our company puts $50,000 into each of the stocks X and Z today, for a total of $100,000? The same question as before: how much profit will be earned on those $100,000 tomorrow? As before, let's go through this calculation one scenario at a time. If scenario 1 happens, the company will get a profit of $50,000 times 0.004, or $200, from stock X, and a loss of $50,000 times 0.001, or $50, from stock Z. So the total profit in this scenario is $150. If scenario 2 happens, then the company will lose $50,000 times 0.002, or $100, on stock X, but it will gain $50,000 times 0.003, or $150, on stock Z, so that the total profit in this scenario will be $50. Now the expected profit is 0.5 times $150 plus 0.5 times $50, and that's $100, and the standard deviation of the profit is $50.
So let's compare these two portfolios side by side. If the company splits $100,000 equally between stocks X and Y, the expected profit will be $100 and the standard deviation of the profit will be $250. If, on the other hand, the company puts $50,000 each into stocks X and Z, then the expected profit will be $100, but the standard deviation of the profit will be $50.
This is a fivefold reduction of risk as measured by the value of the standard deviation. We can look at this from a different perspective. Imagine that the company puts the entire sum of $100,000 into stock X. Such a single-stock portfolio will have an expected profit of $100 and a standard deviation of profit of $300, as this calculation shows. On the other hand, if the company puts the entire sum of $100,000 into stock Y, the resulting single-stock portfolio will have an expected profit of $100 and a standard deviation of $200.
So, comparing the three choices (putting the entire sum into stock X, putting the entire sum into stock Y, or splitting it equally between X and Y), we see that all three choices in this case produce the same expected profit. This is, of course, because the expected returns on stocks X and Y are the same. And the standard deviation of the mixed option, $250, lies between the standard deviation values for stock X and stock Y.
Let's now make the same comparison using stocks X and Z. If the entire amount of $100,000 is allocated to stock X, we get, as we saw before, an expected profit of $100 and a standard deviation of profit of $300. If the entire sum is allocated to stock Z, the resulting profit has an expected value of $100 and a standard deviation of $200. When the company allocates $50,000 to each of stocks X and Z, the resulting profit again has an expected value of $100, since the expected values of returns on stock X and stock Z are the same. But the standard deviation of the profit for this mixed portfolio, $50, is less than the standard deviation of profit on either single-stock portfolio. In simple terms, stocks X and Z, when put together, partially cancel each other's "fluctuations" around their respective expected values, producing a portfolio with a much reduced risk.
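The comparison above can be reproduced with a short sketch. It is only an illustration, not part of the lecture: the names `PROBS`, `RETURNS`, and `profit_stats` are hypothetical, and the code assumes the two equally likely scenarios and the return values quoted in this session.

```python
from math import sqrt

# Two equally likely scenarios, as in the example.
PROBS = [0.5, 0.5]

# Scenario returns as fractions (0.4% is written as 0.004).
RETURNS = {
    "X": [0.004, -0.002],
    "Y": [0.003, -0.001],
    "Z": [-0.001, 0.003],
}


def profit_stats(allocation):
    """Return (expected profit, standard deviation of profit) in dollars.

    `allocation` maps a stock name to the dollars invested in it.
    """
    # Profit in each scenario: sum of (dollars invested * return) over stocks.
    profits = [
        sum(amount * RETURNS[stock][s] for stock, amount in allocation.items())
        for s in range(len(PROBS))
    ]
    mean = sum(p * x for p, x in zip(PROBS, profits))
    variance = sum(p * (x - mean) ** 2 for p, x in zip(PROBS, profits))
    return mean, sqrt(variance)


portfolios = {
    "all in X": {"X": 100_000},
    "all in Y": {"Y": 100_000},
    "all in Z": {"Z": 100_000},
    "half X, half Y": {"X": 50_000, "Y": 50_000},
    "half X, half Z": {"X": 50_000, "Z": 50_000},
}

for name, allocation in portfolios.items():
    mean, sd = profit_stats(allocation)
    print(f"{name:<15} expected profit ${mean:8.2f}   std dev ${sd:8.2f}")
```

Every portfolio in the table has the same expected profit, so the only thing that changes between the rows is the standard deviation, which is the quantity the session uses as its risk measure.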
So returns on stocks X and Z are related in some special way, which is quite different from the relation between the returns on stock X and stock Y. Note that the returns on stocks Y and Z are also related in a special way. If a company puts $50,000 in stock Y and $50,000 in stock Z, the resulting portfolio will have a profit of $100 under both scenario 1 and scenario 2. In other words, such a portfolio will have zero risk associated with it. Its profit is not a random number, but a guaranteed value of $100.
So why is it that combining stocks X and Y in the same portfolio is not as beneficial for risk reduction as combining, for example, stocks X and Z? Let's have a look. Comparing stocks X and Y, we see that under scenario 1, the returns on X and Y simultaneously go above their respective expected values. On the other hand, in scenario 2, returns on both stocks go below their respective expected values. Random variables that do that are called positively correlated. If we look now at X and Z, we see that in scenario 1 and in scenario 2, they go, so to speak, in opposite directions with respect to their expected values. Random variables like that are called negatively correlated.
Now let's look at a formal definition of correlation between random variables A and B. Correlation is the ratio of two quantities. The first quantity is the difference between the expected value of the product of the two random variables and the product of their expected values. The second quantity is the product of the standard deviations of A and B. Whether we calculate the correlation between A and B or between B and A, we're going to get the same result.
Let's use this formula to calculate the value of the correlation between returns on stocks X and Y. We need to first calculate the expected value of the product of those two returns. So for each scenario, we calculate the product of the two returns, multiply that product by the probability of that scenario, and add the results.
The answer is 7 millionths, or 7 times 10 to the power of -6. Plugging the required values into the formula for the correlation, we get exactly 1. Random variables with a correlation of plus 1 are called perfectly correlated.
Now let's repeat the same calculation for the returns on stocks X and Z. The expected value of the product of the two returns is -5 millionths, or -5 times 10 to the power of -6. This results in a correlation value of -1. Random variables with such a correlation are called perfectly anti-correlated. Just to compare, if we do the same calculation for the returns on stocks Y and Z, we get the expected value of the product of the two returns as -3 millionths, or -3 times 10 to the power of -6, and again a correlation value of -1. So these stock returns are also perfectly anti-correlated.
So, a few points to keep in mind. Our examples are extreme and were created to illustrate the notion of correlation. Any correlation value will fall between -1 and 1. As our examples show, if the goal is to reduce risk, one should consider combining negatively correlated random variables.
In this session, we have used an example of two stocks to illustrate an approach that relies on negative correlation between two random variables to reduce the risk associated with their combination. In the final session of week 2, we'll employ an optimization approach to illustrate a trade-off between risk and reward in a portfolio of two stocks.
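The correlation calculations in this session can be checked with a minimal sketch that follows the verbal definition step by step: expected value of the product, minus the product of the expected values, divided by the product of the standard deviations. The function names (`expected`, `std_dev`, `correlation`) and the data layout are assumptions made for illustration; they do not come from the lecture.

```python
from math import sqrt

# Two equally likely scenarios, as in the example.
PROBS = [0.5, 0.5]

# Scenario returns as fractions (0.4% is written as 0.004).
RETURNS = {
    "X": [0.004, -0.002],
    "Y": [0.003, -0.001],
    "Z": [-0.001, 0.003],
}


def expected(values, probs=PROBS):
    """Probability-weighted average of the scenario values."""
    return sum(p * v for p, v in zip(probs, values))


def std_dev(values, probs=PROBS):
    """Square root of the expected squared deviation from the mean."""
    mean = expected(values, probs)
    return sqrt(expected([(v - mean) ** 2 for v in values], probs))


def correlation(a, b, probs=PROBS):
    """Correlation of two random variables given scenario by scenario."""
    # First quantity: E[A*B] - E[A]*E[B].
    expected_product = expected([x * y for x, y in zip(a, b)], probs)
    numerator = expected_product - expected(a, probs) * expected(b, probs)
    # Second quantity: product of the two standard deviations.
    denominator = std_dev(a, probs) * std_dev(b, probs)
    return numerator / denominator


for first, second in [("X", "Y"), ("X", "Z"), ("Y", "Z")]:
    rho = correlation(RETURNS[first], RETURNS[second])
    print(f"correlation({first}, {second}) = {rho:+.3f}")
```

Because the arithmetic is done in floating point, the printed values may differ from exactly +1 or -1 in the last digits, which is why the sketch rounds to three decimals. The function is undefined when either standard deviation is zero, as it would be for a return that is the same in every scenario.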