Okay, hi folks, we're back again, and we'll be talking a little bit more about Random Networks. And in particular, we're going to start looking at Growing Random Networks. So, situations where there are new nodes entering over time. And, so this fits into our study of network formation. So in terms of the course, we are now in the second part of the course. And we're looking at Network formation models. And again, we've looked at Static Random Network models. And now we're going to be looking at Dynamic, Random Network models. Where there's growing numbers of nodes over time. And there's lots of examples where this happens in the world. So, the prime example is citation networks. New articles are born over time. New articles can form links, by citing old articles. Old articles can't cite new articles. so articles are going to accumulate links over time. Newer articles are going to have fewer links then older ones, so there's a natural form of heteragenania that forms this way. Same thing with the web, in terms of web pages. Collaboration networks and co-authorships, you have older researchers, younger researchers older researchers will have had more collaborations than younger ones. Societies you known generally human situations you're going to have some older. And some younger and they're going to have different characteristics based, just simply, solely on age. So there's a natural form of heterogeneity. That comes in in form of age. So let's think a little bit about what they add. why do we care about having growing random networks? Well you know, one answer people say okay what's more realistic. And one thing I want to emphasize just from the start is when we build models, we know that we're not going to try and capture everything in the world. And so, realism is not usually a good reason for building a model to, to try and match reality. the, the reason that we build models, is to try and use as few moving parts as possible in order to pro, reproduce the world. So the only reason we want to add this feature of growing random networks is not because that's the way the world is. But because this richer model might capture something in the real world that was not captured in the static models. So again, you, you shouldn't judge models based on realism. They're always wrong. the question is are they capturing reasonable elements and so here. The natural form of heterogeneity versus age is going to be useful in getting out more connected nodes and less connected nodes. And, and starting to see, power laws and, and fat tails. So we're going to get a natural form of dynamics. And in particular, we're going to end up with a natural way of getting different degree distribution without just building them into the statistical distribution. So, we could just assume that there's a distribution that has the, the, the features of reality. But instead we want to do is see if build a model that will end up generating. Features that look like the real world, and this sort of dynamics is going to help quite a bit. Okay,[COUGH] , so in, in order to start this, what we're going to do is start by taking Erdos-Renyi random network situation. but instead, just enriching it to have nodes being born over time. And so that'll be a simple benchmark, that'll give us an idea of some of the techniques. And then we can enrich it to have different kinds of formation processes besides the uniform at random. Okay, so, this idea of growing and uniformly at random, what we're going to have is each day is going to be a new node's birthday. So nodes come in at time one, two, three, four, etcetera. And when they are born, they will form a number of links to existing nodes. And, to start, in terms of this Erdos-Renyi kind of setting. What we're doing, we're going to assume that each mode is going to form it's links, uniformly at random to the ones that are already existing out there in society. Okay so you're born, you decide who to link to, you form some numbers of links and you form them to over the existing node. To keep things simple, we're going to have the, the link formation process only occur when the node is born. So you don't keep forming them over your lifetime. You'll get new ones as new nodes are born and they form their links. But a, a given note only forms it'[s links it, it, it's outward links in some sense when it's born. But it can accumulate more links later on as new nodes are, are forming their links. Okay. So in order to have some nodes that you can form links to when you're born what we're going to do is each one is going to form m links at random. So, we'll have m nodes that are already existing which are fully connected so that when the first node is born it has somebody to connect to. So the, the process is going to start out, we'll start it with a very simple seed of already having some m nodes fully connected. You could start with a whole series of different situations. That's not going to effect the asymptotic limit of it, it might effect what it looks like for the short periods of, of time, until you get to a very large time. Okay, new node is born, it forms m links to existing nodes. Say two, three, four, how, what, whatever, how, however many we want. And what that means, is if I'm already born and, there's some time t then I'm going to have a probability That's, that's roughly m of the, the number of links being formed, compared to t of the existing number of, of nodes that are, are already there, of, of getting one of the new links, right? So, m over t will give, give us the probability. So, the more nodes that are out there over time the less chance that any particular node is going to get one the new links, right? Okay, so very simple process, but, it, it's going to have some dynamics to it. Okay. So, now let's think about sum node i, that was born after the original m nodes that we started with that we fully connect and, and before sum time t. Okay. And now let's ask how many links has it, does it expect to have collected by time t? Well, the first thing is, that it forms some m links when it was born. So it's going to have m links for sure. Then in terms of expectations, the next day after it was born, if it was born at time i, then the date now is, i plus 1. And there's some new node which is born and it's forming it's m links. There's already existing i plus 1 links nodes out there. And so it has a chance m over i plus 1 of getting these new links, okay? And then at time, the next state there's i plus 2. It's going to, a number of new links are formed, and so forth. Right? So, so this is the overall sum of what it's going to get over time. Right? So, we had the first m that it got when, at birth, you expect it from the next node born, and so on. So it has this sum. [BLANK_AUDIO]. Now, if you want to look at a sum like this. So, you look at this kind of sum, what is an approximation for this sum, well these are harmonic numbers. So if you can if you remember your, your number theory, these are what are known as harmonic numbers. So, they're growing proportionately to i plus some number so it's growing proportionately to the inverse of t. And if you sum a series like this an approximation for this is going to be m times 1 plus log of t over i. So, depending on, on how far you go out and when you started this series. You're going to end up with sum that looks like this. So, we have an approximation for what the expected degree is of any node, after some time period t. Okay? So, very simple calculation. And now what can do is we can ask what does this generate in terms of the distribution of degrees in society. Or the distribution of expected degrees in society at any point in time? Okay, so let's do a simple calculation. How many nodes have an expected degree of less than d at sometime t? Well, it's those for whom their expected degrees at time t are less than some d. Right? So if you say okay how many are going to have degree less than 100? Then we can ask that. How many are going to have degrees less than 35? We'll get some numbers. So, at time t, it's going to be the i's for which this inequality holds. Okay, so let's have a look at that inequality in more detail. So, let's suppose that we did this calculation at time 100. So, we've got here our t is 100. And let's suppose that each node at each time is, is forming 20 new links. So, what does the degrees of different nodes look like at time100? So here we have the birth date of the node. So, here is birth date of the node and this is then the equation that we had. So, so, so this is our equation that we had in terms of m times 1 plus log t over i, right. So, this equation right here is m times 1 plus log t over i. So, that's plotted out here. And what that means is, as you can see the older nodes, have gotten more links than the younger nodes. So, the youngest node is only formed at 20 and not gotten many more. Some of the slightly older nodes have gotten a little bit more than 20 because they've happened to get some of the new ones. These ones have been around for a longer time, they have gained more links. And the, the, the reason you have curvature here, is always easier to get links early on. And as more and more nodes are born, it's harder and harder to get the new links. So, ones that are born later and later, are going to have a harder time getting new links than the ones that we're born earlier and could get one. So, you get a natural curvature here, just due to that fact. Okay. So, what do we get? well we can do the same thing for degrees at time 200. And at degree time 200 we'll have a slightly different curve. And you know, the ones that were born at 100, time 100, now I've had a chance to gain more links. Everybody's gained more links. how much they've gained more, depends on what their birth date was. And now we've got the, the 200th, node born has just formed it's initial 20 links, and so forth. So, we're going to get different degree distributions at different times. We'll have a different, a, this is expected degrees over time. we'll have a different distribution as time evolves. Okay, so what does this distribution look like? Well, we can ask then what are the nodes for instance, that have degree less than 35? What's the fraction of nodes? So, let's go back to our time 100, and figure out what is a distribution function. So, what is, you know, how many nodes? What does F of d look like. So, how many nodes have degree less than some d. So, what does F of 35 look like, alright? We can do that kind of calculation. Okay, these are the nodes with degree less than 35, are the ones are expected degree less than 35. there's going to be some node, which has expected degree right at 35. And basically the nodes with degree, expected degree less than 35, are going to be the ones that were born afterwards, right? So, these ones. So, if we want to figure out the fraction of nodes that were born after, that have degree less than 35. It's going to be the ones that were born after this, compared to the overall 100 that exist in the society so far. So, let's go through that calculation. So, what we want is the actual amount that they have, the expected degree they have. Remember that this is m1 plus log t over i. So, here we have t is 100, m is 20. So, these are the nodes for which this equation is less than 35. Okay we've got that, and so if we solve that equation. Then what do we end up with? We end up with, these are the i's for which they're greater than, you know, you just take exponentials of both sides of this, solve out for i. And what you'll get is these are the i's for which, they were born after time 47.2. So, basically the ones that were born 48 or later are going to have expected degrees less than 35. The ones that were born 47 and before are going to have expected degrees, bigger than 35. So, this is 47.2, this point right here, and now if you want to ask, what's the fraction of nodes that have degree less than 35? [SOUND] Well, that's going to be the 47.2. So, we're going to have 100 minus 47.2 over 100. Right? So, this is going to work out to be, 0.628. Right. So, the, the fraction of nodes that are you know, rough, roughly 63%, are going to be the ones that have degree less than 35. So, some where between 62 and 63% in this case have expected degrees less than 25. Okay? So, given this process we can figure out how degrees grow. We can figure out a degree distribution. And, more generally for any degree, the ones that have expected degree less than d at some time. Are going to be the ones for which i is bigger than same solution as we just saw for the 35 t times e at the exponent of minus d minus m over m. Okay. So, if you go ahead and solve that out, what we'll end up with is a degree distribution at time t. That says the fraction of nodes that have an expected degree of less than d, is given by this formula 1 minus e minus to the minus d minus m over m. Okay, so very simple set of calculations, little bit of algebra invovled, some exponentiation, some summation of series. But basically what we're able to do is now calculate what the degree distribution looks like for this growing for this growing random network. this is a exponential, negative exponential, distribution. So what we have is the distribution of expected degrees is such that d minus m is exponentially distributed with a mean of m. what about the actual degrees? Well, a good approximation for large t, we're going to have to you know, so, so here we did the calculations of expected degrees. Right? So we, we, we can say how many each node expected, and we got a nice curve. Well some, some nodes are going to happen to get more, some are going to happen to get few. So, the actual degrees of the true nodes are not going to follow that really nice, smooth curve. They're going to bounce around a bit. And what that means is that the distribution can be slightly different than this distribution of expected degrees. it turns out that this distribution of expected degrees, is a good approximation of what the actual distribution of degrees will look like. Some will end up pushed above, some will end up pushed below. But on average, you'll still expect around 60%, less than 35 and time 100 and so forth. So, if you go through and actually calculate the true distribution function. It's going to look as t becomes large, it's going to be well approximated by this negative exponential function. Now actually doing that, you need some careful law of large numbers argument I'm not going to do that here. I, in the book there's some discussion of how you go about doing that. It's fairly straightforward. Again, it's just making sure that the variation in the actual distribution isn't too large. As you get to a large t, the noise in the system is going to be well approximated by the mean. So, so here we've got a distribution of a growing random network things have worked out well. Okay, so in terms of where we're going next we'll look at slightly richer growing models. We'll also talk about mean field approximations, other kinds of ways of calculating these things. So, we'll at at another set of growing random network models that will allow us to get richer and richer, the degree distributions out.