Hi. In this lecture, we're going to introduce a measure of segregation called the index of dissimilarity. We're going to use it to look at different cities and different regions and ask how segregated they are. Remember, in the last lecture I went through Schelling's segregation model, and what we saw is that even when people had fairly tolerant thresholds for living with people from different income groups or different races, they still ended up segregated. And so when we look at cities across the United States, we see substantial segregation by income and substantial segregation by race. What we want to do here is figure out whether we can construct some sort of measure, some way of categorizing numerically how segregated a particular city is along a particular dimension. Because when we have those measures, that allows us to make better sense of data, to use and understand data better, and that's one reason why you model. So, to get started, let's remind ourselves of what these patterns look like. This is the city of New York, and remember that regions depicted in red are predominantly Caucasian, regions in blue are predominantly African American, yellow predominantly Latino, and green predominantly Asian. New York is interesting because it has these big chunks of different racial groups spread out all over the city. Not all cities look the same way. Here's Los Angeles. Los Angeles has this area called the Valley, which is mostly white, South Central, which is mostly African American, and then over by Monterey Park it's mostly Asian. If you look at Houston, again you see all of these interesting patterns in how people are racially distributed across the city. And if we look at DC, it's almost like there's a dividing line: to the east, most people are African American; to the west, most people are Caucasian.
So, different cities look different ways, and what we'd like is to have some sort of number for representing this racial disparity. Okay? Now, remember, the same is true for income, so we can use the same measure for income disparity. If you look at a city like Chicago, red represents wealthy people here. So there are wealthy people along this area known as the Gold Coast. In the center of the city, it's mostly poor people. And then, to the north and to the west, in the suburbs, there's something called the Collar Counties, and it actually looks like a collar. These, again, are wealthy people. Now New York: remember, the red dots here represent rich people, people who make more than $200 thousand a year. All around Central Park here, you see wealthy people, and as soon as you move further out from the center, you see poor people. So it's interesting: New York is a little bit different from Chicago, in that right in the center of New York there's a lot of wealth, and then as you move out, it gets poorer. Chicago sort of looks the other way. So what we want is some measure of how segregated a city is. We're going to construct a very simple measure called the index of dissimilarity. And we're going to do it with just two types of people, rich people and poor people. I'm going to represent rich people with blue dots and poor people with yellow dots. Now, I'm going to place these people on a grid. So here's a 24 city block area, and in each block I'm going to put ten people, all right? So, let's start out. Let's suppose that in 12 of these blocks right here I put all rich people. And in six of these blocks I put all poor people. And in six of these blocks I put half poor, half rich. So what does that give me in total? Well, remember I said we've got 12 blocks here and I've got ten people per block, so that's 120 rich people here.
And I've got six blocks here, but there are only five rich people per block, and five poor, so that's 30 rich. So 120 plus 30 equals 150. So we've got 150 rich people. And then I've got 90 poor people. So, 240 people total: 150 rich, 90 poor. And I want some way of representing how segregated these city blocks are. Now, the interesting thing is, these districts here, these green ones, are less segregated than these blue ones and these yellow ones. So I want some measure that will capture that fact. All right? So, how do I do it? I'm going to take this a step at a time. Let little b be the number of blue people who live in a block, and let big B be the total number of blue people. Then, if I take little b over big B, that's going to tell me the share of all blue people who live in that block. So it's just the proportion of the total number of blue people who are in that block. And similarly, little y over big Y is the proportion of all yellow people who are in that block. Now why do I want to do that? Why do I want to look at those two numbers? Because if I take the difference between little b over big B and little y over big Y, that's going to tell me how distorted the distribution is in that particular block. Let me be more precise. Suppose I have a district that has five blue and three yellow, and I want to know whether it's a perfectly representative district. There are 150 blue people, and five of those blue people live in this particular block. So 5 over 150 equals 1 over 30. So one out of every 30 blue people lives inside that block. Now there are 90 yellow people, and three out of the 90 yellow people live in that block, so one out of every 30 yellow people, or poor people, lives inside that block. So, 1 over 30 minus 1 over 30 equals 0. So what we get is that if you had a perfectly representative block between rich and poor, what I'm calling blue and yellow, we'd have a difference of 0.
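As a quick check on that arithmetic, here is a minimal Python sketch of the per-block term. The function name block_term is made up for illustration and is not from the lecture. It uses exact fractions so that 5 over 150 and 3 over 90 compare equal without rounding, and it takes the absolute value so the number comes out non-negative whichever group is over-represented.

```python
from fractions import Fraction

def block_term(b, y, B, Y):
    """Absolute difference between a block's share of all blue people
    and its share of all yellow people (hypothetical helper)."""
    return abs(Fraction(b, B) - Fraction(y, Y))

# The representative block from the example: 5 of 150 blue, 3 of 90 yellow.
print(block_term(5, 3, 150, 90))  # prints 0
```

Any block whose blue and yellow shares match will give 0 here, no matter how many people it holds.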
But if we've got relatively more blue, or relatively more yellow, then since I'm taking the absolute value, that's what these two lines mean right here, I'm going to get a positive number. So a bigger number is just going to represent more segregation. So, let's look at our particular example. This block right here is all blue, right? So there are ten blue people in there. Now, there are 150 blue people total. So ten out of the 150 blue people live in that block. There are no yellow people, no poor people, in that block, so I have 10 over 150 minus 0 over 90. That equals 10 over 150; I can cancel the zeros, and it equals one fifteenth. So for every one of these blue blocks, my index is going to be one fifteenth. Now in these yellow blocks, right here, there are no blue people, no rich people, so that's 0 over 150. But there are 10 yellow people, poor people, so that's 10 over 90. So there are way more yellow people than there should be proportionally, and if I take 0 minus 10 over 90 I get one ninth, right? I've got these absolute value signs here, so everything becomes positive. So these blocks are one ninth. And finally I've got these green districts. Remember, these have 5 blue, so that's 5 over 150, and they've got 5 yellow, so that's 5 over 90. And I take the absolute value of the difference. What do I get there? Well, that's 1 over 30 minus 1 over 18. Put both over 90: that's 3 over 90 minus 5 over 90, which in absolute value is 2 over 90, so it's equal to 1 over 45. Okay? So what we get then is that for every one of those twelve blue districts, the index of dissimilarity is 1 over 15. For every one of the yellow districts, the index of dissimilarity is one ninth, and for every one of the districts that's 5 blue and 5 yellow, it's 1 over 45. Okay. So, how do we figure out how segregated this whole region is? What we do is we say, we've got 6 districts, or blocks, here that have a dissimilarity of 1 over 45, so we get 6 times 1 over 45.
And we've got 6 here that have a dissimilarity of one ninth, so we're going to add 6 times one ninth, and then we've got 12 that have a dissimilarity of one fifteenth, so we add 12 times 1 over 15. And if we add all that up, that's 6 over 45 plus 30 over 45 plus 36 over 45, so we get 72 over 45. Now, 72 over 45 is a tentative measure. We're going to change this a little bit, because what does 72 over 45 mean? Is that bad, is that good? So let's go through and put our measure through its paces. Whenever you construct a measure, what you try to do is work through some extreme cases to see how well it works. So let's start out with a simpler case, to see if this measure makes sense. Here I've got eight blocks, where four are all blue and four are all yellow, and here's another case, where all eight of them are 50-50. Let's compute our index of dissimilarity in each of these cases. So, let's start with the 50-50 one. Well, each one of these blocks is going to be five blue, right? And five yellow. What's the total number of blue and yellow? Since I've got 8 blocks, I've got 80 people. So that means there are going to be 40 blue and 40 yellow. So, for each one of these blocks, I get 5 over 40 minus 5 over 40, which is 0. So every single block contributes zero, and my total index of dissimilarity is 0. So that's great, right, because that means that if everyone is perfectly mixed, my index is 0. So it seems like it's a pretty good index. But let's go back and look at this other case. Now I've got this case where I've got 4 blocks that are all yellow and 4 that are all blue. So, once again, I've got 40 yellow and 40 blue. But now we've got to think: for each one of these yellow districts, what do I have? I've got 0 over 40 blue minus 10 over 40 yellow, since all ten people in the block are yellow. In absolute value, that's equal to one fourth.
And since all these are the same, I'm going to get a fourth, a fourth, a fourth, a fourth for the yellows, and the same for the blues, right? By the same logic. So every single one of these is going to give me a value of one fourth. When I add all eight of those up I get two. I don't get one, I get two. So now I've got a bit of an issue: if people are perfectly segregated I get two, and if they're perfectly mixed I get zero. This suggests I've got a pretty good measure here, but what I probably want to do is divide it by two, right? If I divide it by two, then if you're perfectly mixed, you get a score of zero, and if you're perfectly segregated, you get a score of one. All right? So, if I go to this case where there are 40 rich and 40 poor and they're perfectly mixed, my score is going to be zero, because I get five blue and five yellow in each district (this number on the slide should be a five, right?). So I get a score of zero. And when I do the other one I get a score of one. Now when I look at my earlier example, where I had 72 over 45, which didn't make any sense, dividing by two makes that 72 over 90. And if I divide top and bottom by nine, that's 8 over 10, which is 0.8, so it's 80%. So you can think of this as 80% segregation, which seems pretty segregated. Now we can go back and look at our cities, and at our census data. We can look at a city like Philadelphia and ask how segregated it is. And notice that it comes out at 0.8, exactly the same as our example. So this tells us the score in Philadelphia is 0.8. Now, if we look at this map and ask how segregated it is, we have a score. And now we can do things like compare Philadelphia to Detroit. Remember, Detroit also looks segregated, but even though it looks segregated, its score is only 0.6. So Detroit is actually substantially less segregated than Philadelphia. Yet if you look at these two pictures, here's Philadelphia and here's Detroit, it's very hard to tell the difference between the two.
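To make the whole calculation concrete, here is a minimal Python sketch of the index as it's described in this lecture: add up the absolute differences over all the blocks, then divide by two. The function name dissimilarity_index and the list-of-pairs layout for the blocks are assumptions made for illustration, and the sketch assumes each group has at least one person so the totals aren't zero.

```python
from fractions import Fraction

def dissimilarity_index(blocks):
    """blocks: list of (blue, yellow) counts, one pair per block.
    Returns half the sum of |b/B - y/Y| over blocks, as an exact Fraction."""
    B = sum(b for b, _ in blocks)  # total blue people
    Y = sum(y for _, y in blocks)  # total yellow people
    total = sum(abs(Fraction(b, B) - Fraction(y, Y)) for b, y in blocks)
    return total / 2

# The 24-block grid: 12 all blue, 6 all yellow, 6 half and half.
grid = [(10, 0)] * 12 + [(0, 10)] * 6 + [(5, 5)] * 6
# The two extreme cases with 8 blocks each.
mixed = [(5, 5)] * 8
split = [(10, 0)] * 4 + [(0, 10)] * 4

print(dissimilarity_index(grid))   # prints 4/5, that is 0.8
print(dissimilarity_index(mixed))  # prints 0
print(dissimilarity_index(split))  # prints 1
```

Using exact fractions rather than floating point keeps the results identical to the hand calculation, so 72 over 90 reduces cleanly to 4 over 5.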
Okay, so what have we learned in this lecture? We've learned that we can construct a very simple measure called the index of dissimilarity, and by using that measure we can compare how segregated different cities are. And now that we've got this measure in our pocket, we can use it to measure segregation by race, segregation by income, and all sorts of segregation. It's a really useful tool to help us take data and understand the world. All right. Thank you.