Okay, so we're back, and we're going to talk a little bit more about computing the size of a giant component in a random graph. In particular, we'll focus on Poisson, or G(n,p), random graphs, and do the giant component calculation. We're going to do a heuristic calculation, but one that turns out to be fairly accurate. The idea is the following. Let q be the fraction of nodes in the largest component. Okay? Now let's look at a given node. If we just pick a node at random, the chance that it's in the giant component is q. But there's another way to figure out that chance: a node is outside the giant component exactly when all of its neighbors are outside the giant component, because if it had any neighbor inside the giant component, it would be inside the giant component too. And that gives us a way to do a calculation. The probability that a given node is outside the giant component, so it's not in the giant component, is 1 minus q, since q is the probability that it's in there. But that's also equal to the probability that all of its neighbors are outside. So if the node has degree d, if it has d neighbors, then the chance that each one of those is outside the giant component is 1 minus q, and we raise that to the dth power: (1-q)^d is the chance that the node is outside the giant component. What we can do then is take the expectation over the node's degree, based on the probability of it having different degrees, and find an equation from that, okay? So we have this q, and we're going to try to solve for q. The probability that you're outside the giant component, that's 1 minus q; it's also the probability that all of your neighbors are outside the giant component. You have d neighbors. What's the probability that you have d neighbors? Well, it's given by the degree distribution.
So, P(d) here is our degree distribution: the chance that you have d neighbors. Once we've got a degree distribution, we can solve for q, okay? Now, this is a heuristic calculation because I'm not worrying about correlations: some of these variables I'm treating as if they're independent when they're not quite independent. You can do a more exact calculation using generating functions; there's some of that in the book. But this will give us a fairly good approximation and work fairly well, especially on large networks. So what we want to do is solve for q. The equation so far, 1 - q = Σ_d P(d)(1-q)^d, doesn't require us to work with a specific degree distribution: you could plug in any degree distribution you wanted and then solve for what q is. It's a nonlinear equation, so it's going to be a little difficult to find a closed form for q, but you can plug in whatever degree distribution you want and then solve it, or get an approximate solution, for q as a function of your degree distribution. So this is a nicely well-defined equation, and when we look at it, there's always going to be a solution at q = 0. Right? If we put in q = 0 on both sides, then we're just summing the probabilities P(d) across all degrees, and we get one on both sides. That's always going to be a solution. The interesting solution is going to be the one where we have a nonzero q, so we want to solve for a nonzero q. The q = 0 solution says: assume everybody is outside the giant component, and then everybody ends up outside the giant component because all their neighbors are.
That's a nonsensical solution in a lot of situations, so the one we want to solve for is the positive solution to this equation: the point where 1 - q is equal to the sum over d of (1-q)^d weighted by the probabilities P(d). Solving that simultaneously for a positive 1 - q gives us the solution for q, and that's what we need. And you could solve this for different degree distributions. Let's do it for the case of G(n,p), or approximately a Poisson random graph, so let's plug in a Poisson distribution for P(d). The Poisson distribution, P(d) = e^{-(n-1)p} ((n-1)p)^d / d!, gives the probability that a node has degree d, and when we plug that in, we end up with 1 - q = Σ_d e^{-(n-1)p} ((n-1)p)^d (1-q)^d / d!. Now we can solve this for q, and it looks like a daunting equation: it's got a sum over different d's with factorials in it, so it looks fairly nasty. But luckily there are some nice parts to this equation. When you look at something like a sum of an expression raised to the d over d factorial, that actually has a nice exponential form, so that sum is nicely behaved. One useful fact to have in the back of your mind is the Taylor series for e^x: e^x = Σ_d x^d / d!, so e raised to some power x is the same thing as summing x^d over d factorial. And when we go back, we see we're summing something raised to the d over d factorials, so that's going to collapse to an exponential. When we plug in that fact, we can rewrite our formula as an exponential of what's inside the sum.
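To see that Taylor-series collapse numerically, here is a quick sketch; the values of n, p, and the trial q are purely illustrative, and the sum is truncated where the terms become negligible:

```python
import math

# Illustrative parameters: n nodes, link probability p, trial value of q
n, p, q = 1000, 3 / 999, 0.7
mean = (n - 1) * p                 # expected degree (n-1)p = 3.0

# Left side: sum over d of P(d) * (1-q)^d with Poisson P(d)
lhs = sum(math.exp(-mean) * mean**d / math.factorial(d) * (1 - q)**d
          for d in range(60))      # terms beyond d = 60 are negligible here

# Right side: the closed form after applying e^x = sum_d x^d / d!
rhs = math.exp(-q * mean)

print(lhs, rhs)                    # the two sides agree
```

The agreement holds for any q between 0 and 1, which is what lets us swap the nasty-looking sum for a clean exponential in the next step.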
So, we rewrite it in the nice form 1 - q = e^{-(n-1)p} e^{(1-q)(n-1)p}. Okay? And collecting terms, some of our p's simplify, and we end up with 1 - q = e^{-q(n-1)p}. Well, remember that (n-1)p is the expected degree, right? p is the probability of any given link, multiplied across the n-1 possible neighbors; that's the expected degree. So what we end up with is: 1 - q equals e to the minus q times the expected degree. And if we take logs of both sides, we get -log(1-q)/q = E[d]. So it's a very nice, simple equation now: we've got a relationship between q on one side and the expected degree on the other side. Okay, so now we've got an expression that tells us how large the giant component is as a function of the expected degree. It's not quite a closed-form solution for q, because it's -log(1-q)/q, but what we can do is at least plot it out and see what it looks like. If you plot q as a function of E[d], basically what happens is: when the expected degree is near one or below, there's essentially no giant component, so q is very small; then the giant component grows; and by the time the expected degree gets up to three, q is actually very large, already close to 95%. So by a degree of three, we've already hit close to 95% in terms of the size of the giant component, while at one we're close to zero. This is interesting, and it's somewhat characteristic of these Erdős–Rényi random graphs: we get these tight phase transitions. If people have fewer than one expected neighbor, fewer than one expected interaction that would transmit a diffusion, we don't get the diffusion. Once you hit one, we get a diffusion, and then it grows fairly rapidly, and by the time you have three interactions, we're already at the point where you're very likely to hit most of the population.
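The relationship 1 - q = e^{-q E[d]} can be solved numerically to see the phase transition just described. A minimal fixed-point sketch (the iteration count is just a safe default, not anything from the lecture):

```python
import math

def giant_component_fraction(expected_degree, iterations=2000):
    """Iterate q <- 1 - exp(-q * E[d]) starting from q = 1; the
    iteration settles on the positive solution when one exists,
    and decays toward 0 below the phase transition at E[d] = 1."""
    q = 1.0
    for _ in range(iterations):
        q = 1.0 - math.exp(-q * expected_degree)
    return q

for ed in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(ed, round(giant_component_fraction(ed), 3))
# E[d] = 3 gives q around 0.94, matching the "close to 95%" in the text
```

Below an expected degree of one the only solution is q = 0; just above one a positive solution appears and then climbs steeply, which is the tight phase transition.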
So, if you remember, in our promo video I said tell at least two friends: if you tell just one friend about this, we wouldn't get much diffusion; two friends is at least enough to start pushing us fairly far along this curve; and by three friends you're most of the way there. Just in terms of giant component size, we see there's a fairly tight transition from having no giant component to having almost everybody connected, so most of the action is in this one region. And indeed, when people study epidemiology, there's this idea of having at least one contact that's going to transmit something: if people are making more than one transmitting contact, you're going to get diffusion or contagion, and if people are making fewer than one transmission, things are going to die out. That makes perfect sense: more than one and it's growing, fewer than one and it's shrinking, so one is the critical point. And here we find that it actually transitions very rapidly, so that by the time you've got three interactions that are likely to lead to contagion, the diffusion is quite extensive. Okay, given that, we can also calculate the probability of being in the giant component. (1-q)^d was the probability that a degree-d node is not in the giant component; the probability that it is in is 1 minus that, 1 - (1-q)^d, which is an increasing function of d. Basically, that tells you the more connected you are, the more likely you are to be infected. And it approaches one exponentially fast in d, so increasing your interaction rate makes it much more likely that you end up in the giant component: more likely to be infected at any point in time if we make this dynamic, and much more likely to end up infected eventually. This gives us some insight behind that Coleman, Katz and Menzel finding that the doctors with more interactions were prescribing the new antibiotic at an earlier time.
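That degree-dependent probability of ending up in the giant component is a one-liner; a tiny sketch (the value of q here is just illustrative) shows how quickly it rises with d:

```python
def p_in_giant(d, q):
    """A node is outside the giant component only if all d of its
    neighbors are outside, so P(inside) = 1 - (1-q)^d."""
    return 1.0 - (1.0 - q) ** d

q = 0.3                      # illustrative giant-component fraction
for d in range(6):
    print(d, round(p_in_giant(d, q), 3))
# rises from 0 at d = 0 toward 1, approaching it exponentially fast in d
```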
So here we see that directly in terms of this model. Okay, extensions: let's talk about some extensions quickly. Immunity — well, that's easy. If some nodes are immune, then they can't catch something and they can't transmit it, so we can just drop those out of our model to begin with and then study the giant component on the remaining nodes. What that's going to do is actually decrease the degree, right? If some fraction of the nodes disappears, that's going to decrease the degree, and that decreases the overall interaction. In terms of probabilistic infections, we just have links fail, and we can incorporate that directly into the link probability. So we can think about some nodes initially exposed to infection; we can have some fraction π of the nodes be immune; and we can also think that maybe only some of the links are actually going to transmit — let's let a fraction f of the links transmit. Then we ask: what's the extent of the infection? Basically, we start with our network on n nodes, delete the immune fraction of the nodes, delete the non-transmitting fraction of the links, and look at what's left. If the infection starts in a giant component of what's left, then it will reach whatever the giant component is on what's left, and otherwise we don't get an infection. So it's a very simple extension to start to incorporate these things. And if we let q be the fraction of the remaining nodes that are in a giant component of the remaining network, then q times (1 - π) is the probability of having a non-trivial contagion, and that's also roughly the extent of the infection. So basically what we can do is just re-solve the equation. Now individuals are going to have a lower expected degree, moving that down, given the fact that they're not going to have as many neighbors, because some of those neighbors are immune.
And then we also scale down by the f. Once we've done that, we end up with basically the same exact equation, just taking these two modulations into account. And what we end up with is a similar curve to what we had before, but now the relevant quantity on the x-axis, the expected degree, is modulo the fact that some of the nodes are going to be missing and some fraction of the links aren't going to transmit. It still has to be at least one, counted out of what's remaining, and then goes up to three, just as before. So you can do these calculations directly; it just scales things back. Implications: the infection can fail now if π is high enough, or if f or p are low enough, because that's what puts you below the one. So: high π, high immunization; low f, low contagiousness of the process; low p, low contact among the population. You can think of these different modulations to what we had before, and that's going to affect the eventual process. Okay, so that takes us through a quick look at component size and doing these kinds of calculations. This has been for a process that just spreads through a network once. The next thing we're going to look at is an ongoing process where people can change back and forth between state 1 and state 0. So, maybe you support somebody, maybe you don't; you like a technology, then you don't like it. You can change your mind over time, or there are certain kinds of diseases that people can catch and re-catch, so these are processes that can last in a population. I can clean my computer of a virus, get the virus back, it can infect other people, and so forth, so I can go back and forth between states zero and one over time, and we can look at that kind of model as well. That will be the next thing we have a look at.
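As a concrete footnote to the extension above, here is a rough simulation sketch. All names and parameter values are illustrative, not from the lecture: build G(n,p), drop an immune fraction pi of the nodes, keep each surviving link only with transmission probability f, and measure the largest surviving component with a small union-find.

```python
import random

def largest_component_fraction(n, p, pi, f, seed=0):
    """Sketch of the extended model: G(n,p) with an immune fraction pi
    of nodes removed and each link transmitting only with probability f.
    Returns the largest surviving component as a fraction of all n nodes."""
    rng = random.Random(seed)
    alive = [rng.random() >= pi for _ in range(n)]    # immune nodes dropped
    parent = list(range(n))                           # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]             # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # a usable link needs both endpoints non-immune, the link
            # present (prob p), and the link transmitting (prob f)
            if alive[i] and alive[j] and rng.random() < p * f:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(n):
        if alive[i]:
            r = find(i)
            sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / n if sizes else 0.0

# Effective expected degree is roughly (n-1) * p * f * (1 - pi):
# about 3.8 here, well above 1, so a giant component survives
print(largest_component_fraction(500, 6 / 499, pi=0.2, f=0.8, seed=1))
# shrink f so the effective degree drops below 1 and the infection dies out
print(largest_component_fraction(500, 6 / 499, pi=0.2, f=0.1, seed=1))
```

This mirrors the implications above: raising pi or lowering f or p pulls the effective expected degree below one and the giant component, and hence the infection, disappears.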