After the mean, the second summary statistic you would like to know for a random variable is its variance, as a measure of spread. In this video, I'll explain how the variance of a random variable is calculated and also what happens to that variance when you adjust the variable via addition or multiplication, or when you combine different random variables. The variance of a random variable X is defined as the expected value of the squared deviation of X from its mean mu. If you want to calculate it on the basis of a probability distribution, it's the sum, or integral, of the squared difference between the values that the variable may take and its mean, times their probabilities. These are the equations for that calculation, for a continuous and a discrete random variable. Calculating the variance for a continuous random variable is difficult, because you would need to integrate this function. For a discrete random variable, the calculation is less complex than it may seem.

Let's take an example. This discrete distribution gives the yearly risk that you'll get involved in a traffic accident. The mean risk is 0.04, once in 25 years. First, you calculate the difference between the mean and each number of accidents. Then you square these differences, multiply them by the corresponding probabilities and, finally, sum the results. The variance of the accident risk appears to be approximately 0.06. If you'd rather use the standard deviation to express the spread in a distribution, you can take the square root of the resulting variance.

Now let's see what happens with the variance of a random variable when that variable is adjusted by adding a value a and multiplying by a value b. When you enter this transformation into the equation defining the variance, the constant a disappears, but the factor b comes out squared. Hence, by adding or subtracting a value a to a random variable, its variance doesn't change. But when you multiply it by a value b, its variance becomes the original variance times b squared. The standard deviation, the square root of the variance, then changes by the factor b.

Time for an example. Did it ever occur to you that on a bright sunny day, people you encounter are more likely to greet you than on a rainy, gray day? This is the distribution of nods or smiles you can expect per minute when walking in a busy city on a gray day: a meager average of 1.4 nods per minute, with a variance of 0.84. Now, at the same time of day and location, but in sunny weather, everyone seems to have become friendlier. Two times as friendly, to be specific. Here's the nod-and-greet distribution under sunny conditions. There is the same category of grumpy people who never nod, but for the rest, you expect up to six smiles or nods per minute. Theory tells us that the average number of nods should become 2 times 1.4, so 2.8, and the variance should go from 0.84 to four times that value, which is 3.36. Let's check this by calculating the variance for the new distribution. This table shows the steps: we take the difference between the mean and each number of smiles or nods, square the difference, multiply it by the probability, and then sum it. Indeed, 3.36. By the way, do you know the unit that goes with this variance value? It's the unit of the random variable squared, so we're talking about smiles squared here. Let's now consider what happens with the variance when two random variables are added or subtracted.
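To make these steps concrete, here is a minimal Python sketch of the calculation just described. The probability tables themselves are only shown in the video, so the gray-day distribution below is a hypothetical stand-in chosen purely to reproduce the quoted mean of 1.4 and variance of 0.84; the helper name mean_var is likewise just illustrative.

```python
def mean_var(dist):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

# Hypothetical gray-day distribution of nods per minute, built to match the
# quoted mean (1.4) and variance (0.84); the real table appears only in the video.
gray = {0: 0.2, 1: 0.3, 2: 0.4, 3: 0.1}

# Sunny day: every value is doubled (b = 2); the probabilities stay the same.
sunny = {2 * x: p for x, p in gray.items()}

# Shifting by a constant (a = 5, arbitrary) should leave the variance untouched.
shifted = {x + 5: p for x, p in gray.items()}

print(mean_var(gray))     # approx. (1.4, 0.84)
print(mean_var(sunny))    # approx. (2.8, 3.36): mean times b, variance times b squared
print(mean_var(shifted))  # approx. (6.4, 0.84): variance unchanged
```

Taking the square root of the second returned value gives the corresponding standard deviation.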
For random variables X and Y, the variance of the sum is the sum of the separate variances plus two times the covariance between X and Y. And the variance of the difference is an even more curious equation: it is the sum of the variances minus two times the covariance between X and Y. These are more complete equations, which include multiplication factors a and b for X and Y respectively. These equations apply to the situation where any two random variables are added or subtracted, and apparently they require that you know the covariance between these two variables. Covariance information is often not available, and therefore we will not consider the general situation here, but rather a more restricted case, where the variables are not correlated. That's a lot simpler, because the covariance between uncorrelated random variables is 0, and these terms disappear from the equation. So in this case, it doesn't matter anymore whether you add or subtract variables: the resulting variance is always the sum of the separate variances. And you can moreover generalize the equation to any sum of random variables. Another noteworthy aspect is that the standard deviation of the resulting sum of random variables is always smaller than the sum of the standard deviations of the separate random variables. It makes sense: some variability will cancel out when random variables are combined.

Let me summarize what I've explained in this video. The variance of a random variable is the sum, or integral, of the squared difference between the values that the variable may take and its mean, times their probabilities. Adding a constant to a random variable doesn't change its variance, but multiplication by a constant leads to multiplication of the variance by the squared constant. The variance of several uncorrelated random variables that are added or subtracted is the sum of the variances. Mind you, this only applies to uncorrelated random variables. The standard deviation is the square root of the variance, so to get the standard deviation after manipulating a random variable, you first determine the adjusted variance and then take the square root.
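As a closing check on the addition rule, here is a small Python sketch in the same style as the earlier one. It builds the exact distribution of the sum of two independent, and therefore uncorrelated, random variables: a fair die and a 0/1 coin-like variable. Both variables and the helper mean_var are illustrative choices, not taken from the video.

```python
from itertools import product

def mean_var(dist):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

die = {x: 1 / 6 for x in range(1, 7)}   # fair six-sided die
coin = {0: 0.5, 1: 0.5}                 # 0/1 variable, independent of the die

# Distribution of X + Y under independence: P(x + y) accumulates P(x) * P(y).
total = {}
for (x, px), (y, py) in product(die.items(), coin.items()):
    total[x + y] = total.get(x + y, 0.0) + px * py

mu_x, var_x = mean_var(die)
mu_y, var_y = mean_var(coin)
mu_s, var_s = mean_var(total)

print(var_s, var_x + var_y)                        # equal up to rounding: variance of the sum is the sum of variances
print(var_s ** 0.5, var_x ** 0.5 + var_y ** 0.5)   # SD of the sum is smaller than the sum of the SDs
```

For the difference X minus Y, the same code works with x - y in place of x + y; because the covariance is zero, the resulting variance is again var_x + var_y.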