So if we have pairs of data, and we see that the scatter follows a linear form, then we can summarize these data by the average of the x's, the standard deviation of the x's, the average of the y's, the standard deviation of the y's, and finally the correlation coefficient r, which tells us something about the relationship between x and y. When we plot these pairs, x and y, we always use the convention that whatever goes on the horizontal axis is called the explanatory variable or predictor, and the one which goes on the vertical axis is called the response variable.
It turns out that the correlation coefficient r is always between -1 and 1. The idea is that the sign of r gives the direction of the association, whether it slopes up or down, and the absolute value of r gives its strength. Here are a number of examples. In the leftmost case, we have r equal to -0.9. Because r is negative, the whole scatter slopes downward. The absolute value, 0.9, is close to 1, which means that the scatter is tightly clustered around a line. In the second example, we have r = -0.6. Again, the scatter is sloping downward, but 0.6 is closer to 0 than 0.9 is, so we see that the scatter is much more spread out. If r = 0, then there's no perceptible upward or downward trend. Now let's go to positive correlation coefficients. If r = 0.2, we get an upward-sloping scatter, which is rather loose. Finally, if r = 1, the scatter slopes upward, and everything falls perfectly on a line. These examples give you an idea of how to think about the correlation coefficient.
Keep in mind that the correlation coefficient r comes without units, and that's because both x and y were standardized when we computed r. It also turns out that r is not affected by changing the center or the scale of either variable. For example, if you compute the correlation coefficient between height and weight, then it doesn't matter whether you measure weight in pounds or in kilograms.
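
To make the point about units concrete, here is a minimal sketch in Python, assuming NumPy is available. It computes r as the average of the products of the standardized x's and y's, which is one common way to write the computation and may differ in form from the one used earlier in the course. The function name `correlation` and the height and weight numbers are made up purely for illustration.

```python
import numpy as np

def correlation(x, y):
    """Correlation coefficient r as the average product of standardized values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zx = (x - x.mean()) / x.std()  # x in standard units (no units left)
    zy = (y - y.mean()) / y.std()  # y in standard units (no units left)
    return float(np.mean(zx * zy))

# Hypothetical data, invented for this example only.
height_in = np.array([62.0, 64.0, 66.0, 68.0, 70.0, 72.0, 74.0])
weight_lb = np.array([120.0, 136.0, 140.0, 155.0, 160.0, 178.0, 190.0])

# Same pairs, weight measured in pounds versus kilograms (change of scale).
r_pounds = correlation(height_in, weight_lb)
r_kilograms = correlation(height_in, weight_lb * 0.45359237)

# Same pairs, height measured as inches above 60 (change of center).
r_shifted = correlation(height_in - 60.0, weight_lb)

print(r_pounds, r_kilograms, r_shifted)
```

The three printed values should agree up to floating-point rounding, because the standardization step removes both the center and the scale before the products are averaged.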
It's important to keep in mind, however, that the correlation coefficient is only useful for measuring linear association. Look at this example here. Clearly, there's a very strong association between the two variables. You see a scatter that is tightly clustered around a curve. However, if you compute the correlation coefficient, you get r = 0. r = 0 suggests that there's no linear association between these two variables. That's true but, of course, it misses the point in this case. The bottom line is that r is really only useful when we look at linear scatters.
Finally, it's tempting to see a large correlation coefficient and conclude that there must be some type of causal relationship between the two variables. But the example on the bottom left shows that this is just not true. That scatter plot shows shoe sizes as well as the scores on a reading test for 100 schoolchildren. Clearly, there is a very strong association between these two variables. But we would agree that shoe size has no causal effect on reading ability whatsoever. Rather, what's going on here is that there's a third variable, namely the age of the schoolchildren, which determines both the shoe size and the reading ability. So the lesson is that correlation does not mean causation, and we've talked about that before.
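
Both warnings can be illustrated with a short Python sketch, assuming NumPy. Everything in it is synthetic: the curve y = x², the age range, the slopes, and the noise levels are assumptions chosen for illustration and are not the data behind the plots in the lecture. The helper `residuals` is likewise a hypothetical name. The second part removes the straight-line effect of age from both variables and then looks at the correlation of what is left over.

```python
import numpy as np

# Part 1: a perfect but nonlinear association.
# y is completely determined by x, yet the scatter follows a curve, not a line.
x = np.linspace(-3.0, 3.0, 61)
y = x ** 2
r_curve = float(np.corrcoef(x, y)[0, 1])

# Part 2: a third variable (age) drives both shoe size and reading score.
# All numbers below are invented; shoe size never enters the reading score.
rng = np.random.default_rng(0)
n = 100
age = rng.uniform(6.0, 12.0, size=n)
shoe_size = 0.5 * age + rng.normal(0.0, 0.5, size=n)
reading = 10.0 * age + rng.normal(0.0, 5.0, size=n)
r_raw = float(np.corrcoef(shoe_size, reading)[0, 1])

def residuals(v, z):
    """What is left of v after subtracting its best straight-line fit on z."""
    slope, intercept = np.polyfit(z, v, 1)
    return v - (slope * z + intercept)

# Correlation between shoe size and reading once age is accounted for.
r_given_age = float(
    np.corrcoef(residuals(shoe_size, age), residuals(reading, age))[0, 1]
)

print(r_curve, r_raw, r_given_age)
```

In this sketch r for the curve should come out essentially 0 even though y depends entirely on x, and the shoe size and reading correlation should be large on its own but shrink toward 0 once age is taken out of both variables.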