Knowledge by Association. We often hear people tell us things about associations between two variables. In so doing, they seem to be telling us something really profound, or something important. Let me give you a surprising one that I heard once, and it really had me scratching my head. I heard on the radio that the more cheeseburgers one eats, the lower the rate of dying of cancer. What? Eating cheeseburgers protects you from death by cancer? It turns out it's true. And that sounds really informative, right? It might make you think, wow, I have to change my diet. I should go out there and start eating a bunch of cheeseburgers. Now, as a vegetarian, I have to tell you that's a very hard thing for me to say. But in fact, it's not the right course of action either, because the reason that correlation exists is that eating a lot of cheeseburgers increases, by a significant amount, your likelihood of dying of a heart attack. If you die of a heart attack, you can't die of cancer. So these things can be very tricky. Eating cheeseburgers is not reducing your likelihood of dying; you just die of a heart attack before you can die of cancer. So you really shouldn't go and eat a bunch of cheeseburgers. Veggie burgers, maybe, not cheeseburgers. This is the trick with association.
So let's consider association in detail, because there is a lot of value in these studies, but it's important that the consumer of this information understands the limits. So let's do it. Week one, Lecture seven, knowledge by association. I want to start by giving you a real concrete feel for this. This is a really cool study in psychology that gets a lot of people scratching their heads, like, what is this about? It's all about marshmallows, and you would never guess the power that a marshmallow could have as a diagnostic instrument. But it turns out it does.
So what I'd like you to do, before I talk any more, is check out each of these links in succession. First of all, click on the Stanford marshmallow experiment link, watch the video about that, and then check out the article about it that appeared in The New Yorker. Think about all of that stuff and then come on back. Okay? Cool.
All right. Welcome back. Pretty cool, eh? Wow. Just being able to wait, to do what's called delaying gratification: those who could delay gratification longer seem to have more success in life, measured in all sorts of ways, including something as concrete as an SAT score. What does that study tell you? I mean, it's cool, it's fascinating. Does it give you answers? Or does it raise questions? In my opinion, it raises more questions than it answers. And the answers it gives you are really kind of tenuous. It does tell you there's something there, there's some sort of link. There's a good reason to investigate this relationship further, but it kind of stops there.
All right, let me hold that point for a moment, because we're going to have to spend a little bit of time getting you used to some of the terminology people use when they talk about correlations, and some of the graphical depictions they use. Specifically, these things are called scatter plots, so this is six different scatter plots. Each one of these scatter plots shows you a correlation coefficient within it, that thing that says r. I want these to make sense for you, so let's go to a different example for now, an example I've depicted over here. Imagine we asked a bunch of women to rate two things about themselves. The first is just their height. Tell us how tall you are. The second is, on let's say a 1 to 10 scale, how attractive do you feel? How attractive a woman do you think you are? And we want to know if there's a relationship between these, between height and attractiveness. But here's the twist.
Let's say we ask 18-year-old women this question, but we also ask 13-year-old girls this question. Why 13? Let me get to that. Let's start with the 18-year-olds. What would we expect for 18-year-old women? Well, our society seems to value tallness for some reason. We tend to see tall people as being more attractive. And so what we might expect is that the taller women will consider themselves more attractive. On this scatter plot, each one of these points represents a single person, and it represents two things about that person. How tall were they? You can figure that out by following this down to here. That tells you how tall they were. And how attractive did they feel they were? You figure that out by going over here. So this is somebody that's sort of shortish, you know, short to medium, and doesn't think they're very attractive. Whereas this person is relatively tall and does think they're attractive. This other person, by the way, is actually a little shorter than the one I was just telling you about, just a smidge shorter, but they really think they're attractive. Okay. So every one of these points is just that, a single person.
And when we now lay these points out, we can get a sense of the shape of what's sometimes called the cloud. Let me give you a sense of that. You can kind of see it with your eyes if I just trace my mouse like this. This is the cloud of points, and we want to know a couple of things about this cloud. We want to know which direction it goes. Does it go like this one, which is sort of up and to the right? When something goes up and to the right like that, as this line shows you, that's what we call a positive correlation. That means these two variables have a very specific kind of relationship. As one variable gets bigger, the other one also tends to get bigger. Okay? They go together. They're positively related, so the taller somebody is, the more physically attractive they feel.
Both of those things kind of grow together or shrink together. The shorter somebody is, the less physically attractive they feel. Okay, that's what we call a positive correlation. This number represents how strong that correlation is, and it can range from zero to one. Actually, as you'll find out in a moment, it can range from minus 1 to 1. But let's stick to 0 to 1 for now. So in this case, the correlation is pretty close to 1. It's 0.7. So this is what we would call a strong correlative relation. These two variables seem strongly interconnected. Now it could be the case that there is still a positive correlation, but it's not so strong. So in this case, notice how the cloud is a lot bigger, and that tendency to go up and to the right is less extreme. There's a lower slope on this line, and what that suggests is we still have a positive correlation. If this was height and attractiveness again, it's still the case that as people are taller they think they're more attractive, but that relationship isn't as strong as it is over here. There are some not-tall people who think they're pretty attractive, and there are some tall people who don't think they're very attractive. And so it's a weaker relationship between them. In fact, if it was zero, the line would be completely flat, and this would suggest there's no relation. You can look at tall people: some think they're attractive, some don't. You look at short people: some of them think they're attractive, some don't. So this would be no relationship, this a weak relationship, and this a much stronger relationship.
Now, this sort of pattern can happen in this way, up and to the right. But it can also happen in this way, what we're going to call a negative correlation. In fact, this should have a negative sign in front of it, and I see that it doesn't right now. The negative correlation is a different kind of relationship.
So now let's go back to our 13-year-olds to give you a feel for what this means. We've asked the same question: how tall are you, and how physically attractive do you feel? But around 12 or 13 is an interesting time in children's lives, because girls go through puberty first. So they tend to grow and be quite tall, and the boys are lagging behind. And that leads to a phenomenon that we all know, where the girls are dancing with boys that seem much smaller and younger than they are, less mature. Now, I know from some of the girls that I went to school with, they used to call this the ogre time, because they felt like ogres, they felt like they were big and hulking over the boys. And they were very self-conscious about their height. And so at this age, what you might actually find is that the taller a girl is, the less attractive she feels. So as height goes up, ratings of physical attractiveness go down. Now these variables are what we call negatively correlated, because they work in opposite directions. They don't change together; they go in different directions.
So when that happens, you see this slope that's sort of downward and to the right, but again everything else holds. You can have that sort of relation where the points are very close to the line, where you have a very small cloud of points like this, and that's a strong one. But you can also have a weaker correlation that's still negative, and again this should say negative 0.3 here. Or you can have this lack of correlation. But this is a good example of how, for the same two variables, the relation you see between them may actually change over the course of a lifespan. There may be a time when tall girls feel less attractive, and then later in their lives, they may feel more attractive.
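To make the sign and size of r a little more concrete, here is a minimal sketch in Python. It assumes the r on those scatter plots is the usual Pearson correlation coefficient, which the lecture does not state outright. The heights and ratings are invented purely for illustration; they are not data from any study, and `pearson_r` is just a helper name chosen for this sketch.

```python
def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y, scaled to lie in [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y move together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / (spread_x * spread_y) ** 0.5


# Hypothetical data: six heights in cm and self-ratings on a 1 to 10 scale.
heights = [155, 160, 165, 170, 175, 180]
ratings_18 = [3, 5, 4, 6, 8, 7]   # made up: taller tends to go with higher ratings
ratings_13 = [8, 7, 7, 5, 4, 3]   # made up: taller tends to go with lower ratings

r_18 = pearson_r(heights, ratings_18)
r_13 = pearson_r(heights, ratings_13)

print(f"18-year-olds: r = {r_18:.2f}")  # positive: cloud goes up and to the right
print(f"13-year-olds: r = {r_13:.2f}")  # negative: cloud goes down and to the right
```

The sign of the result gives the direction of the cloud, and how close it is to 1 or to minus 1 gives how tightly the points hug the line.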
Okay, again, the idea of that is to give you a feel for what these correlations mean and the importance of the direction, the positive versus the negative. And to introduce you to scatter plots, because you're going to see those on occasion.
All right, now I want to end this section on association by being clear about the limits. You can find that two variables are related, but that really does not tell you any deep truth. You have to be very careful. It does bring to mind potential questions, but it doesn't give you really solid answers. So let me give you this little gem, and this is true. Taller people earn more money. Studies that have compared a person's height with their earnings show that the taller you are, the more money you tend to make. Okay, cool, interesting. Why? Well, we get some obvious images in our mind: that probably means that when somebody's interviewing people, for some reason they are biased towards tall people. So the tall people end up getting the job. That's kind of what you want to think is true from this, but is that true? Is that, in reality, what's going on? Well, maybe not. There can be all sorts of things that can cause two variables to be related. So for example, a lot of digging into this little tidbit has suggested the following. What's really going on here is what we call a third variable. Let me explain. That third variable would be socioeconomic status, or how rich or poor a person is. And the claim is the following. If you grow up in a rich household, you're more likely to get good nutrition in your food. You're likely to get better foods as you're growing up, and those better foods are going to maximize your potential for height, so rich people tend to be taller because of the food they eat. Now, rich people also tend to be better connected in the job market.
If your parents were well off, they probably hung around with other people who were well off, people who could potentially provide employment. And so the social network that you grow up in is one of affluence. It's one where you're around other people who are making more money and who can potentially give you well-paying jobs. So that seems to be what's going on. People who grow up rich become tall and end up in better-paying jobs. But they didn't get the better-paying job because they were tall; they got it because their family knew the right people. And they were tall because their family was rich. Okay, so it's that third variable that's causing both of these things, and making them look like they're related to one another.
There can also be other constraints. This is like the cheeseburger example that I led off with. Sometimes it looks like there's a relationship, as though eating cheeseburgers causes reduced cancer deaths. But in reality there's some constraint that's making that relationship true, which is that cheeseburgers actually increase the likelihood of a heart attack death. So this is the problem with correlations. They're very enticing, they're very interesting, they get our minds thinking, and all of that is good. But they also sometimes seem to suggest a truth, and very often that truth should not be trusted on the basis of a correlation alone. We want to go the next step, which I'll talk to you about in the next lecture.
But just as a little extra for this lecture, this one's just silly, okay? I have a link here where a bunch of adults have recreated that child marshmallow test you've seen, and it's funny. That's the only reason I include it. It was funny when the kids did it, and it's just kind of funnier when the adults do it. So I just threw that in there to give you a smile, I guess.
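Coming back to the third-variable story about height and earnings, a small simulation can show how that works. This is only a sketch under made-up assumptions: the numbers (average height, income scale, noise levels) are invented, `ses` is a hypothetical socioeconomic score, and in this toy world height never influences income at all. Both are driven by `ses` plus independent noise.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / (spread_x * spread_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

people = []
for _ in range(20000):
    ses = rng.gauss(0, 1)                       # hypothetical third variable
    height = 165 + 5 * ses + rng.gauss(0, 5)    # richer childhood -> taller (assumed)
    income = 40 + 10 * ses + rng.gauss(0, 10)   # richer childhood -> higher pay (assumed)
    people.append((ses, height, income))        # note: income never looks at height

# Correlation between height and income across everybody.
r_all = pearson_r([p[1] for p in people], [p[2] for p in people])

# Same correlation, but only among people with nearly the same ses.
band = [p for p in people if abs(p[0]) < 0.1]
r_band = pearson_r([p[1] for p in band], [p[2] for p in band])

print(f"everyone:         r = {r_all:.2f}")
print(f"similar ses only: r = {r_band:.2f}")
```

Across everybody, height and income come out clearly correlated, even though neither one affects the other in the simulation. Among people with about the same `ses`, the correlation mostly goes away, which is the pattern you would expect if the third variable is doing the work.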
But here's a video about correlation and causation, covering some of the issues about correlations to think about. And then I have a couple of readings that are going to show you the mathematical guts behind those correlation coefficients I was just talking about, how you compute them. I don't expect you to know the math, but I would suggest you give it a look, because one of the things I think you'll see is that the math is not very complex. It's actually quite simple. And so, to the extent that you can understand the math, it can make you feel a little bit better, like you understand what these correlation coefficients are. So I leave that up to you to investigate as deeply or not as you'd like. And then come on back, and we'll talk about how contrasting two situations can finally get us to a much more solid truth about what's going on in some situation.