Okay, so we've seen a bunch of different centrality measures, so let's take a look at an application which can begin to distinguish between them. And let me emphasize from the start that this application is not designed to show that one centrality measure is always better than another, but just to show that in a particular context we can say something systematic about which ones seem to work better than others in making some predictions. The question we're going to be looking at is diffusion, and in particular the first contact points in a process. So in this case there was a diffusion process that was started, and we have some idea of which points in the network were first contacted. Then we see what the diffusion looks like, and we have a bunch of different networks. So we can compare across the different networks and ask: how does the centrality of the first-contacted nodes predict how successful the eventual diffusion would be?
So let me put this in context. This is part of a joint project that I've been involved with for a number of years with Abhijit Banerjee, Arun G. Chandrasekhar, and Esther Duflo. What we were looking at in particular was the diffusion of microfinance in 75 rural villages in Karnataka, which is in southern India. These were villages that were fairly remote and initially isolated from outside loan availability, and a particular bank, BSS, entered 43 of these villages and offered microfinance to them. We went in and surveyed the villages and mapped out social networks before the lending agency went into these villages, and then we tracked microfinance participation over time. So we've got diffusion over time, and we know the initial points that they touched, who they first told about microfinance.
So what the bank did in each village was identify a particular set of people that they should talk to first: shopkeepers, teachers, self-help group leaders, people that they thought might be well connected in the village. And they told those people, look, we're going to come in and we're going to offer loans. Tell your friends about it and have them spread the news, and in a couple of weeks we'll come back and tell you more about it. Then over time they kept coming back every two weeks, and people could join the loan program and so forth. Across these different villages, the eventual participation rate in the loan program varied a lot. About 44% was the highest of any village, and the lowest was about 7%.
And one thing we can ask is, did it matter which people they talked to in a village first? It might be that in village number 1 the teacher is a very central individual, but in village number 12 the teacher is not very central. So if you talk to the teacher in both villages, then in one village you're talking to a very central individual, and in the other you're talking to a not very central individual. Does that make a difference in terms of the eventual microfinance participation rate? Does it make a difference in how much news got out? So we have 43 different villages, and we can look at how central those first-contacted nodes are, using the different notions of centrality that we've looked at, and see which ones work well and which ones don't.
So, just to picture Karnataka here: the slide got a little distorted, but this is the area of Karnataka. It's all within a couple of hundred kilometers of Bangalore, in southwestern India.
And when we look at the different villages, in each village we mapped out a full series of networks. So this one is: if you had to borrow 50 rupees for a day, who would you borrow them from? So we've got a borrowing question. If we blow this up a little bit so you get a better picture, each little collection of dots here is a household, and the arrows indicate who they said they would borrow from. So somebody in this household said they would borrow from somebody in this household, and so forth, and we end up with a borrowing network. We asked a series of different questions, and we actually have 13 different networks in total. Who do you go to temple with? Who would you go to for advice? Who comes to you to borrow kerosene? Who would you go to in an emergency for medical help? So we have a whole series of different questions.
We can then aggregate these up and say that two households are connected, meaning they could talk to each other, if they answered yes to any of these questions. We can work with the networks in different ways, but let's take an undirected version, where we aggregate things at the household level and say that two households are connected if they borrow kerosene from each other, or would go to each other for medical help, or would borrow rupees from each other, et cetera.
Okay, so we've got networks. We've also got a lot of other information: microfinance participation over time, the number of households and their composition, age, gender, subcaste, religion, profession, education levels, and a bunch of other things we can control for, such as caste information, wealth variables, participation rates in self-help groups, ration cards, voting behavior, and a whole series of other things. Okay? So now we want to see whether centrality makes a difference in the diffusion of this loan program. And what we can begin to do is start with, say, degree centrality.
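To make the aggregation step concrete, here is a minimal sketch in Python of how several survey networks could be combined into one undirected household network, with degree then read off as the number of distinct neighbors. The function names, the question labels, and the household IDs are all made up for illustration; this is not the project's actual data pipeline.

```python
from collections import defaultdict

def aggregate_households(layers):
    """Union several directed survey networks into one undirected network.

    `layers` maps a question name to a list of (household_a, household_b)
    pairs, meaning someone in household_a named someone in household_b.
    Two households are linked if they are named together in ANY layer.
    """
    neighbors = defaultdict(set)
    for pairs in layers.values():
        for a, b in pairs:
            if a == b:
                continue  # ignore ties inside a single household
            neighbors[a].add(b)  # direction is dropped: store both ways
            neighbors[b].add(a)
    return dict(neighbors)

def degree(neighbors):
    """Degree = number of distinct households a household is linked to."""
    return {h: len(ns) for h, ns in neighbors.items()}

# Hypothetical toy survey answers (not data from the study).
layers = {
    "borrow_money": [("H1", "H2"), ("H3", "H2")],
    "borrow_kerosene": [("H2", "H1"), ("H4", "H3")],
    "medical_help": [("H1", "H3")],
}
network = aggregate_households(layers)
deg = degree(network)
print(deg)
```

Note that H1 and H2 name each other in two different questions but still count as a single link, which is what the "connected if they answered yes to any of these questions" rule implies.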
So here, if this were what we saw in a village, then this individual and this individual would be the most central individuals in the village. And if you hit those individuals, you would expect to reach more people, just because they have higher degree. So one hypothesis is that in villages where the first-contacted individuals have more connections, so higher degree centrality, there should be a better spread of information about microfinance, and more people knowing should lead to higher participation. So basically, high degree centrality of the first nodes should go along with high microfinance participation.
Okay, so what do we see in the data? Here is the average degree of the first-contacted individuals, which we call leaders here. So this is the degree of the first-contacted teachers, self-help group leaders, and shopkeepers in the village. And here, on this axis, is the eventual participation rate of the village. Each one of these dots is a village. So for instance, this village had a 7% participation rate, so fairly low participation, and the average degree of the leaders was about 17. This village over here had an average leader degree of about 21 and a participation rate of 44%. And so we've got a bunch of villages. If you fit a best-fit line through this, it doesn't look like there's any relationship, and if anything, the slope is actually negative. So it doesn't appear as if degree centrality really captures what's going on.
Okay, so maybe we need another centrality measure. When we talked about eigenvector centrality, we realized that degree misses a lot of the story, because it doesn't capture how well positioned you are in the network.
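The degree comparison described above boils down to two steps: average the degree of each village's leaders, then fit a line of participation against that average across villages. Here is a small sketch of both steps. The helper names and every number in it are hypothetical, chosen only to show the mechanics, and are not the values from the villages in the study.

```python
def average_leader_degree(neighbors, leaders):
    """Mean number of neighbors among a village's first-contacted households."""
    return sum(len(neighbors[h]) for h in leaders) / len(leaders)

def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x for one regressor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical toy village: household -> set of linked households.
toy_village = {"H1": {"H2", "H3"}, "H2": {"H1"}, "H3": {"H1"}}
toy_leader_degree = average_leader_degree(toy_village, ["H1", "H2"])

# Hypothetical (average leader degree, participation rate), one per village.
leader_degree = [10.0, 12.0, 14.0, 16.0]
participation = [0.30, 0.20, 0.25, 0.15]
slope, intercept = fit_line(leader_degree, participation)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

With these made-up numbers the fitted slope comes out negative, which is the kind of pattern that would lead you to doubt that degree is the right measure.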
And so if we look at eigenvector centrality, where your centrality is proportional to the sum of the centralities of your neighbors, then we are getting something which reflects how well connected your connections are, as we talked about in the last lecture. Okay, so let's have a look and see if eigenvector centrality does a better job. So, revisit our hypothesis: in villages where the first-contacted people have higher eigenvector centrality, there should be a better spread of information about microfinance, and more people knowing should lead to higher participation.
So let's have a look. When we now take the average eigenvector centrality of the leaders and plot that against the participation rate on this other axis, we get a significantly positive and strong relationship. So having better placed leaders in terms of eigenvector centrality does a reasonably good job of predicting the eventual microfinance participation, whereas degree centrality didn't seem to pick things up. And why is eigenvector centrality working better? Because this communication is a repeated process. You tell your friends, they have to tell their friends, and so forth. So if you have well-positioned friends, and they have well-positioned friends, that is good for diffusion. And eigenvector centrality is measuring that, whereas degree centrality is not.
You can also do the regression. If we regress microfinance participation on the eigenvector centrality of the leaders and the degree of the leaders, we get a positive and significant relationship between the eigenvector centrality of the leaders and microfinance participation, and a slightly negative and insignificant relationship for degree centrality. So indeed, eigenvector centrality seems to be doing a better job.
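As a reminder of what "proportional to the sum of the centralities of your neighbors" means computationally, here is a minimal power-iteration sketch. The toy network and its node names are invented for illustration. Each node's own score is added in at every step, which leaves the eigenvectors unchanged but keeps the iteration from oscillating on bipartite networks such as trees; that shift is a choice made for this sketch, not something taken from the lecture.

```python
def eigenvector_centrality(neighbors, max_iter=1000, tol=1e-12):
    """Approximate eigenvector centrality of an undirected, connected network.

    `neighbors` maps each node to the set of nodes it is linked to.
    Scores are normalized to unit Euclidean length.
    """
    nodes = sorted(neighbors)
    score = {v: 1.0 for v in nodes}
    for _ in range(max_iter):
        # One step of (A + I) applied to the current scores.
        new = {v: score[v] + sum(score[u] for u in neighbors[v]) for v in nodes}
        norm = sum(s * s for s in new.values()) ** 0.5
        new = {v: s / norm for v, s in new.items()}
        change = max(abs(new[v] - score[v]) for v in nodes)
        score = new
        if change < tol:
            break
    return score

# Hypothetical toy network: a hub with three neighbors, plus a short tail.
toy = {
    "hub": {"x1", "x2", "x3"},
    "x1": {"hub", "p"},
    "x2": {"hub"},
    "x3": {"hub"},
    "p": {"x1", "q"},
    "q": {"p"},
}
centrality = eigenvector_centrality(toy)
for node in sorted(toy, key=centrality.get, reverse=True):
    print(node, len(toy[node]), round(centrality[node], 3))
```

In this toy network x2 and q both have degree 1, but x2 hangs off the hub while q hangs off the end of the tail, so x2 gets the higher eigenvector centrality. That is the sense in which degree alone misses how well positioned your neighbors are.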
We can also look at a bunch of different notions. So here we regress microfinance participation on different notions of centrality: eigenvector centrality, degree, closeness, Bonacich, betweenness. What I've done here as well is correct not only for the centrality but also for other things we want to keep track of. Some villages are going to be larger, so they have larger numbers of people. Some might have more people who participate in self-help groups, which means they're already more prone to be borrowing and lending from each other. We have variables on savings, we have caste variables, and we can look at a whole series of different variables and control for those, which takes some of those things out. Degree now turns out to be positive once we control for these variables, but still insignificant relative to its standard error. And again, eigenvector centrality is the one which turns out to be positive and significant, and the other ones turn out not to be significant.
So this is just one application, but it's one where we have a very particular question in mind, and when we look at which of the centrality measures correlates with the eventual outcome, eigenvector centrality is the one that's correlating in a positive way, and the other ones are not correlating significantly once we've controlled for a bunch of other variables. So this just gives us an idea that these things are measuring different aspects of the network, and sometimes one can be a better predictor than another. Now, exactly what the causation is here, we can tell stories. I can explain that it probably has to do with communication, that better connected friends lead to better communication and so forth, and that eigenvector centrality is picking that up. But this is observational data, so we're not sure what the causation is. What we do see is that different measures are picking up different things in the data, and that's going to be important.
Now again, I want to emphasize that this does not mean eigenvector centrality should be your only centrality measure. It just means that in this particular application, where we were looking at a very specific type of diffusion, it seemed to be a better correlate than these other standard measures of centrality. Betweenness, for instance, seemed to do a little better, possibly, at explaining what was going on in the Florentine marriage data. So depending on which application you're looking at, it might demand a different centrality measure.
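For readers who want to see what "regress participation on centrality while controlling for other variables, then compare each coefficient to its standard error" looks like mechanically, here is a small ordinary least squares sketch using only the Python standard library. The function names are made up, the village rows are hypothetical toy numbers rather than the study's data, and the set of regressors is far smaller than the one described above.

```python
def invert(m):
    """Invert a small square matrix by Gauss-Jordan with partial pivoting."""
    k = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(m)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def solve_ols(rows, y):
    """OLS via the normal equations; returns (coefficients, standard errors).

    Each row of `rows` lists the regressors for one village, with a
    leading 1.0 for the intercept.
    """
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se

# Hypothetical villages: (leader eigenvector centrality, leader degree,
# households in hundreds, participation rate). Not data from the study.
villages = [
    (0.05, 17.0, 2.1, 0.10), (0.08, 21.0, 1.5, 0.22),
    (0.03, 19.0, 3.0, 0.08), (0.10, 15.0, 2.4, 0.30),
    (0.07, 22.0, 1.8, 0.18), (0.04, 16.0, 2.7, 0.12),
    (0.09, 18.0, 1.2, 0.27), (0.06, 20.0, 2.2, 0.15),
]
rows = [[1.0, eig, deg, hh] for eig, deg, hh, _ in villages]
y = [part for *_, part in villages]
beta, se = solve_ols(rows, y)
for name, b, s in zip(["intercept", "eigenvector", "degree", "households"], beta, se):
    print(f"{name:12s} coef={b:8.4f}  se={s:7.4f}  t={b / s:6.2f}")
```

A coefficient is usually called significant when it is large relative to its standard error, which is what the t column reports. In practice you would reach for a statistics package rather than hand-rolled linear algebra; the point here is only to show where the coefficient and the standard error come from.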