Okay, so let's have a look at fitting and estimating a model of diffusion. What I'm going to do now is work with diffusion, actually bring it to some data, and see what we can learn from that. I'm just going to do one application, one I've been involved in, to show you how this might work and how we can build models. It's something that I've been thinking about quite a bit, so I'll take you through that. This is a paper with Abhijit Banerjee, Arun Chandrasekhar and Esther Duflo. What we're going to do is map a social network through surveys. So we have a series of surveys; we've mapped out the villages you've seen before. We then observe behavior over time, and we're going to model diffusion and fit a model based on that to try to understand exactly what was going on. In particular, the kinds of questions we could think about are, you know, what determines behavior generally. So when we think about a diffusion process where individuals are making some choice: do I adopt a technology? Do I buy some new product? In this case, do I end up taking out a loan from a new microfinance opportunity? When people don't do it, is it because they don't have information, just basic information, they don't even know the opportunity is out there? Or are there complementarities between individuals, so that I'm more likely to do it when a friend ends up doing it, for various reasons: because I learn from that, or I feel peer pressure, or maybe there are just benefits from both of us taking out loans, and then we could end up learning from each other and having useful interactions from that. So that's one kind of question. And another question that we'll actually look at here is the role of non-participants in diffusion processes. Sometimes you see this in the epidemiology literature as well.
So when we think about something like the flu, generally to pass the flu on you have to have the flu. But there could be people who are asymptomatic, who don't actually come down with the disease, who could still catch something and transmit it. And in this case, we could ask: is it possible that somebody finds out about the availability of microfinance loans, they hear about it from friends, they end up not taking out a loan, but nonetheless they still pass information along and are useful in the process of diffusion? So what we're going to do is take our network model seriously and then fit that to the data. Okay? In this particular instance, we know the set of initially informed nodes. I'll tell you a little more about that. And then we see how things are going to work, and we are going to model this as follows: the initially informed nodes are going to pass information to their friends at random, and then their friends can pass information on, and so forth. So we'll just do a very simple diffusion model that we can estimate directly, and then once you're informed, people will decide whether to participate. Okay? So the background here. There were 75 villages in Karnataka, relatively isolated from having availability of loans before. A bank went into 43 of these villages, and those are the ones we'll look at here, and offered microfinance. Okay, so they started offering loans, and these are relatively poor villages, on the order of $1 or so a day per capita income. So, fairly poor villages, and the way in which the information got out was by word of mouth. The bank would come in, identify a few people in the village, tell them about microfinance, and say, inform your friends about this. And we know who the first people they talked to are.
We've surveyed the villages and got network information, and then we can track the microfinance participation over time. Okay, so now how are we going to go about modeling this and bringing diffusion explicitly into the picture? Okay, so here's Karnataka, and here's the kind of networks we collected. We've seen a little bit of these. You know: if you had to borrow 50 rupees for a day, who would you go to borrow from? So in the borrowing network here, these households are groups of individuals. In fact this is a directed network, so you can see little arrows here: somebody in one household would say they would borrow from somebody else. So households are going to be linked based on this. We work at a household level. So I collapse these individuals into households, because you are allowed only one loan per household, and so we think of the household as the relevant decision unit. And then we can link up the households based on whether they borrow money, kerosene and so forth. So we have these different questions: Who do you go to temple with? Who do you ask for advice? Who comes to you to borrow kerosene? Who do you go to for medical help? So what we're going to do is just build a full network and say that two households can communicate if they have any of these relationships in common. And then we also have microfinance participation and demographics: age, gender, sub-caste, religion, wealth variables. Does the house have a latrine? How many rooms does it have? What kind of roof does it have? Do people work, do they participate in self-help groups, do they have ration cards? There's a whole series of things, and then we have the caste information that we talked about earlier. And so, before we get into the diffusion model, let's just do a benchmark of the standard way that we might think of doing a peer-effects kind of analysis.
So what we can do is ask: what's the probability that a given individual takes out a loan? And what we might do in that situation, without modeling diffusion, is just say, okay, we'll do this in a standard logistic form, so the log odds, the log of the probability that they take up a loan compared to not, is going to depend on their characteristics. So it might be higher depending on which profession they are in, or which religion they have, or whether or not they are in a certain caste group, or of a certain age. And so we have a whole series of characteristics we can put in there that are going to affect their choice. And then for the standard peer effect, we'll ask: does it also depend on how many of their friends participate, or the fraction of their friends? So am I more likely to participate, all else equal, if 80% of my friends participate compared to 20%, okay? And we do have to worry about homophily here. That's going to be an issue behind the scenes, because it could be that part of the reason that my participation lines up with my friends' is that there are things we have in common that we're not seeing, and it's not that my friends influenced me, it's just that I am friends with people who are very similar to me. Okay, so homophily could be behind the scenes. It's actually not going to be so much of an issue for us, because eventually, when we properly do a diffusion model, we are not going to find these peer effects, but it is something that we have to keep in mind. So let's run that standard regression. If you run that with a whole series of characteristics (you can find all the details of these regressions in the paper), effectively what you're going to end up with is a parameter here of 2.5, highly significant. That would seem to indicate that the higher the fraction of my friends who participate, the more likely I am to participate.
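As a sketch, the benchmark regression just described can be written out as below. The symbols $p_i$, $X_i$, $\beta$ and $F_i$ are notation assumed here for illustration; $b_{\text{peer}}$ is the peer coefficient, the one estimated at 2.5.

```latex
% p_i  : probability that household i takes out a loan
% X_i  : household characteristics (profession, religion, caste group, age, ...)
% F_i  : fraction of household i's friends who participate
\log\!\left(\frac{p_i}{1 - p_i}\right) = X_i \beta + b_{\text{peer}}\, F_i
```

Read this way, a one-unit increase in $F_i$ multiplies the odds $p_i / (1 - p_i)$ by $e^{b_{\text{peer}}}$, holding $X_i$ fixed.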
We can't say it's causal, but it seems there's a high correlation there. And in particular, how do we make sense of 2.5, given that we're looking at log odds ratios? What does that mean? If you do some calculations, if you took the fraction of my friends participating from zero to one, holding all the other characteristics at their average, you would increase the odds ratio by a factor of about 12. That makes the relative likelihood of me participating compared to not 12 times higher, okay? So that's a huge impact. And if you took it from just 0.1 to 0.3, which is closer to within one standard deviation of what the actual numbers are, you'd still go up by a factor of about 1.65. So you still get a substantial impact: just moving one standard deviation in the fraction of friends participating comes to roughly a 65% increase in the relative likelihood of participation, okay? So we see a very strong effect if we just do the regression, and the only network information we're using is just in terms of who my friends are, right? So now we're going to try to bring a diffusion model into this and get a little more understanding of what that 2.5 really represents, or what's going on behind the scenes there. So we're going to use network information, not just who my friends are. We'll keep track of people who hear about microfinance and repeatedly pass information to friends, and then once I hear, I'll make a decision of whether or not to participate. Okay, so we're going to bring diffusion officially into the picture now. Let's stick first of all with the participation decision. Once I'm informed, I'm going to make a decision of whether or not to participate. And we'll allow the choice to basically vary the same way it did before. Okay, so exactly the same logistic kind of regression we had before.
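Going back to the 2.5 coefficient from the benchmark regression, the odds arithmetic quoted above can be checked in a couple of lines. This is only a sanity check on the numbers in the lecture; the function name is made up for illustration.

```python
import math

B_PEER = 2.5  # benchmark coefficient on the fraction of friends participating

def odds_multiplier(frac_from, frac_to, coef=B_PEER):
    """Factor by which the odds of participating change when the
    fraction of participating friends moves from frac_from to frac_to."""
    return math.exp(coef * (frac_to - frac_from))

full_swing = odds_multiplier(0.0, 1.0)  # fraction of friends from 0 to 1
one_sd = odds_multiplier(0.1, 0.3)      # roughly a one-standard-deviation move

print(round(full_swing, 2), round(one_sd, 2))  # prints 12.18 1.65
```

So the "factor of 12" is e^2.5, and the "factor of about 1.65" is e^0.5.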
The log odds ratio that I'll participate once I'm informed will look like something which depends on my characteristics, and depends on the fraction of friends I have who are also informed and who are participating. Okay, so now we'll keep track of who's informed, and keep track of whether or not I participate as a function of my characteristics and the fraction of friends participating. But what we're doing differently is we're actually going to map out the information flow, and so whether or not I choose to participate will be conditional on whether or not I'm informed. Okay. So how are we going to do that? We're going to have just a very simple model of passing information. If I become informed, then I will pass information randomly to my friends. And in particular, we'll allow for two different probabilities. If I chose not to participate, so if I'm a person who thinks that I don't want to take up microfinance, I'm going to pass information along with some probability q superscript N, so N for not participate. If I chose to participate, I'm going to be allowed to pass it with a different probability, q superscript P. Okay? So we're going to try to estimate what these probabilities are from the diffusion process. If I didn't participate, I pass with one probability. If I participate, I pass with a different probability. So let's look at what a typical thing would look like. Let's call the leaders the first people who are informed in the village. So the bank comes in and tries to find the village leaders. It looks for people they think are important. They tell a few of them. They start with those people and then they say, tell your friends about it. Okay, so let's suppose that we have this; this is just a snapshot of part of a network, and here we see two different people.
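The two ingredients just described, a participation decision that only happens once a household is informed and a passing probability that depends on whether the household participated, can be sketched as two small functions. This is a minimal illustration, not the paper's code: the function names are hypothetical, and the characteristics term is collapsed into a single number `x_beta`.

```python
import math
import random

def participation_prob(x_beta, b_peer, frac_friends_participating):
    """Logistic participation probability for an INFORMED household.
    x_beta stands in for the characteristics term (an assumption here)."""
    log_odds = x_beta + b_peer * frac_friends_participating
    return 1.0 / (1.0 + math.exp(-log_odds))

def pass_information(participated, friends, q_n, q_p, rng):
    """Return the friends this household tells in one round.
    Participants pass with probability q_p, non-participants with q_n,
    independently to each friend."""
    q = q_p if participated else q_n
    return [f for f in friends if rng.random() < q]

rng = random.Random(0)
# A participant and a non-participant, each with the same four friends.
told_by_participant = pass_information(True, ["a", "b", "c", "d"], 0.05, 0.55, rng)
told_by_nonparticipant = pass_information(False, ["a", "b", "c", "d"], 0.05, 0.55, rng)
print(told_by_participant, told_by_nonparticipant)
```

Keeping the two steps separate is the point of the model: who hears is governed by the passing probabilities, and what they do once they hear is governed by the logistic decision.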
So one of the leaders chose not to participate, and one decides to participate in the loan program. So we've got these two people. Now what are they going to do? Well, they can randomly tell some of their friends, right? Word of mouth is the way that information's flowing in these villages, so now they can randomly talk to some of their friends. Allowing for different probabilities of passing, we might say, okay, maybe if somebody's not excited about it, they didn't take out the loan, they're going to talk about it less than somebody who just took out a loan and is more excited about it. So, if those probabilities differ, this person might tell three friends, and this person ends up telling only one friend. So now we've got some friends who are informed. Now they can look around, and this person's going to make a decision, this person's going to make a decision, and this person. So we've got four people that are now making decisions. They're going to make those decisions based on their characteristics, but these people all have a friend who's taken up. This person has a friend who hasn't taken up, right? So now we'll get some variation, and we can begin to see, once they're informed, does it still matter whether or not their friend has taken up? Okay? So we'll still be able to condition on that. And so some of these nodes decide to participate, some decide not to, and the information keeps spreading and so forth, okay? So it goes on, and new people become informed. And the idea here, with these new people becoming informed: this person has half of their friends who've had a chance to take up microfinance participating. This person has 100% of the friends who haven't had a chance to participate. And so we can begin to see, after we account for the fact that information is flowing through this network.
And I'm more likely to become informed because I'm next to somebody who knows about it, who then has another chance to participate, whether we still get a peer effect after that: that I'm more likely to participate if more of my friends do. Okay? So that's the idea. And we just keep iterating on this model. So what's the estimation technique? First of all, we're going to estimate some of the parameters through this logistic regression just from the initially informed, and that saves on computation. The three things we are really interested in are: what's the probability that non-participants pass information? What's the probability that participants pass information? And what does this peer effect, or endorsement effect, parameter look like once we've corrected for the passing? We're going to allow the number of times that people can pass information to be proportional to the number of trimesters that the bank was in each village. Depending on the village, that goes from three to eight trimesters, so information is passed anywhere from three to eight times. You can actually estimate this: we also did this by estimating that endogenously, which adds another parameter. It doesn't really help that much; often it comes out between four and seven or so. So this seems to be a fairly good estimate in any case. And then how are we going to do this? We search on a grid of these, so we'll look at various parameters, you know, start at a probability of 0.0, then 0.05, 0.1, and so forth. So we'll march across a grid of different possible parameters. And for each one we can simulate the model, see what comes out, and then try to match that to the observed moments. So this is a form of generalized method of moments, in fact simulated method of moments here. So for instance, let's suppose that we set qN to be 0.15, qP to be 0.3 and b-peer to be 0.5.
So if we did that, we just simulate the model now. We start with the actual data of the network, and we know who the initially informed ones are. So then some of them randomly choose to participate, and they pass information, depending on whether they participate or not, based on these probabilities. And so we just simulate that. And that gives rise to a certain pattern of people who end up being informed and participating. So if we ran that out, we'd get some participation rate in the village. Okay? So we end up with some number. Now we redo it. So instead, let's set the non-participants' rate to 0.05, the participants' rate to 0.5, and then b-peer to 0.1. What that would do is mean that there would be less spreading from people who didn't participate and more spreading in the neighborhood of people who did participate. So this person participates, it spreads more, and then people react differently to that. So we're going to get a different pattern of data as we vary those parameters. And so what we'll do, as we vary the parameters, is look for the parameters that produce the most accurate participation data. So what do we match? The variance in participation, the mean participation, and so on and so forth. So we can have a series of moments, and then try to match those to the actual data. We're just going to search across the grid, run simulations for each point, and figure out which one best matches what actually happens. So if you go ahead and do that, then what do you end up with? The diffusion parameters are 0.05 and 0.55 in terms of these probabilities. So you're about ten times more likely to pass information along if you're a participant than not, according to this. The difference is statistically significant.
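The simulate-and-match loop described above can be sketched on a toy village. Everything here is a simplified stand-in: the ring network, the single intercept replacing household characteristics, the one moment being matched (mean participation), and all function names are assumptions for illustration. The actual estimation uses the surveyed networks and several moments across villages, and a single moment like this would not pin down three parameters.

```python
import itertools
import math
import random

def simulate_village(neighbors, leaders, q_n, q_p, b_peer, intercept, rounds, rng):
    """Run the diffusion once and return the village participation rate."""
    informed = set()
    participates = {}  # household -> True/False, decided when first informed

    def decide(h):
        # Fraction of friends who have already had a chance to decide and took up.
        decided = [f for f in neighbors[h] if f in participates]
        frac = sum(participates[f] for f in decided) / len(decided) if decided else 0.0
        p = 1.0 / (1.0 + math.exp(-(intercept + b_peer * frac)))
        participates[h] = rng.random() < p
        informed.add(h)

    for h in leaders:
        decide(h)
    for _ in range(rounds):
        newly = []
        for h in sorted(informed):
            q = q_p if participates[h] else q_n  # passing rate depends on take-up
            for f in neighbors[h]:
                if f not in informed and f not in newly and rng.random() < q:
                    newly.append(f)
        for f in newly:
            decide(f)
    return sum(participates.values()) / len(neighbors)

def grid_search(neighbors, leaders, observed_rate, rounds, intercept,
                q_grid, b_grid, n_sims=20, seed=0):
    """Return (gap, q_n, q_p, b_peer) minimizing the squared gap between the
    simulated and observed mean participation rate."""
    best = None
    for q_n, q_p, b_peer in itertools.product(q_grid, q_grid, b_grid):
        rng = random.Random(seed)  # same random draws at every grid point
        sims = [simulate_village(neighbors, leaders, q_n, q_p, b_peer,
                                 intercept, rounds, rng) for _ in range(n_sims)]
        gap = (sum(sims) / n_sims - observed_rate) ** 2
        if best is None or gap < best[0]:
            best = (gap, q_n, q_p, b_peer)
    return best

# Toy village: 12 households on a ring, two leaders (purely illustrative).
neighbors = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
best = grid_search(neighbors, leaders=[0, 6], observed_rate=0.25, rounds=3,
                   intercept=-1.0, q_grid=[0.05, 0.3, 0.55], b_grid=[0.0, 0.5])
print(best)
```

Reusing the same seed at each grid point is a design choice in this sketch: it keeps the comparison between parameter values from being driven by simulation noise alone.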
So you can do standard errors by a bootstrap method of randomizing and choosing from the simulations, and then seeing what would happen if the parameters were different from the ones that we actually observed. And so here we end up with things being highly significant. When you actually look at the peer parameter, it's no longer significant, and in fact it's actually slightly negative. So the point estimate is negative and it's insignificant. So it looks like there is really not that much influence going on, and if anything it's slightly negative. You can tell a story that maybe I'm less likely to take up a loan if my friend does, because now I can borrow from them, but this is statistically insignificant, so it's hard to tell whether it's even different from zero. Okay, so what do we see? We see that adding this diffusion model gives us a very different picture of what's going on. It's saying that the reason we are seeing a lot of correlated take-up between people and their friends is not because they're paying attention to what their friends are doing and being influenced by that once they're informed. It's that I'm much more likely to hear about it if I have friends who participate. So if my friends participate, I'm getting passed information at a much higher rate, and that's what seems to account for this correlation. So when you fit this diffusion model, we get a different picture that comes out. Okay? Now, again, we can't make causal inferences here. We have to be careful. All we know is we fit a model, and we've got parameters which seem to come out, and to the extent that this model captures important features of reality, we're picking something up. It seems to be highly significant, and it seems to recreate the data fairly well. Whether or not it's the right causal story is something that we might never know.
Okay, so network effects: significant information passing. Information passing depends on whether you participate or not. We see some slight complementarities, in this case negative in fact, but insignificant. One question we can ask is, okay, the non-participants passed at a rate of only 0.05, so one in 20. Are they still important? Do they matter at all in terms of spreading information in these villages? So what we can do now, and this sort of makes the point of why it's important to have models we can work with, is counterfactuals. So we can say, okay, suppose now we rerun the model, but we muzzle all the non-participants. So we don't allow the non-participants to talk any more. We keep everything else the same, but we just say, okay, if you're a non-participant, don't tell your friends. And let's see what happens. That allows us to figure out how much they actually contributed, because we kept everything else equal and we only zeroed them out. So then the change in the participation rate is going to be entirely due to their passing at 0.05 compared to passing at zero, and that's the marginal effect. Okay, so we can do that next. So in the model as fit, we had 86% of the people informed, and the participation rate ended up around 21%. If you rerun it, with everything else held constant, and just set the non-participants to not passing information any longer, the informedness drops to about 59%. Participation drops to about 14%. So roughly a third drop in each one of these things, just by changing that 0.05 to a zero. And what's going on here? Well, even though they are passing at a fairly small rate, there are a lot of people who are non-participants. Most people choose not to participate; about 80% of the people end up not participating. That gives you an idea that they can still be very important in passing information even if they are passing at a lower rate.
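The "roughly a third" figure can be checked directly from the numbers quoted above. The second half of the snippet is only a back-of-envelope illustration, not a calculation from the paper: it weights each group's passing rate by its rough share of the village (about 80% non-participants) to show why a low rate applied to many people still adds up. It ignores network structure and who is actually informed.

```python
def relative_drop(before, after):
    """Fractional drop from the fitted model to the muzzled counterfactual."""
    return (before - after) / before

informed_drop = relative_drop(0.86, 0.59)       # share of households informed
participation_drop = relative_drop(0.21, 0.14)  # participation rate
print(round(informed_drop, 3), round(participation_drop, 3))

# Back-of-envelope (an assumption-laden illustration): passing volume per round
# is roughly population share times passing probability.
share_non, q_n, q_p = 0.80, 0.05, 0.55
non_flow = share_non * q_n            # contribution from non-participants
part_flow = (1.0 - share_non) * q_p   # contribution from participants
non_share_of_passing = non_flow / (non_flow + part_flow)
print(round(non_share_of_passing, 3))
```

On this crude weighting, non-participants account for a bit over a quarter of the passing, which is in the same ballpark as the drop seen in the counterfactual.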
Okay, so we've fit this model. We get significant information passing, insignificant peer effects once we've corrected for the diffusion process, the information passing depends on whether or not you're a participant, and non-participants still play an important role. Okay, so, conclusions. Why do we go through this? What we begin to see is that by actually including that model of diffusion, and explicitly taking into account the network pattern of diffusion, we got a much better handle on what was going on in the peer effects. If we had just run that regression at the start and got that 2.5, we'd have no idea why it was that I'm being influenced by friends. And here it begins to tell us that it looks more like it's just information passing than actual influence. And that can be very important for policies. So if you want to enhance microfinance diffusion in these villages, it says that it's basically information spreading that's the more important part, and it's not peer influences, right? So it's not that you need to help overcome peer influences; it's more that you need to enhance information spreading. You could also begin to relate this back to network structure. Would changing the homophily structure help information spread better? You could do counterfactuals on this. So once you fit one of these models, then you can actually do a whole series of things by changing things, and begin to see how things would operate. So basically that's just a look at one diffusion model that allows us to say something and get a handle on different peer effects. And one thing to emphasize here: the conclusion should not be that it's always information passing. That was true in this one instance. Instead, what we really want to take away from this is the methodology.
Of explicitly, carefully taking into account information passing, separating it out from decision processes. So our diffusion model is a little richer than what we were looking at before, where it was just simple, flu-like [INAUDIBLE]. Now we allow for these different things. We can map that out, work with data, and try to see what's going on.