In this small example we have just five nodes, labeled 1 up to 5, and there are altogether ten possible sessions: a session between 1 and 2, 1 and 3, 1 and 4, 1 and 5, then a session between 2 and 3 (going through a path), 2 and 4, 2 and 5, and so on. Ten possible sessions among these five nodes, and five links. We can look at the degree distribution here, and for this graph it is very simple: 3, 2, 2, 2, 1. Of course this is an extremely small example to illustrate our point, but it lets us write out the matrices and the optimization on a single slide.

So let's call this graph G1 and look at s(G1) and P(G1). The small s(G1) is easy to compute: it is the sum of the products d_i * d_j over all pairs (i, j) that are connected. For example, nodes 1 and 2 are connected, so the first term is d1 * d2, and there are altogether five terms: d1*d2 + d1*d3 + d1*d5 + d2*d4 + d3*d4. That is for the left graph; we'll talk about the right graph in a minute. You can add this up — for example, the first term is 3 times 2 — and with all the terms you get the number 23.

Now, how do we find the big S(G1)? We need to know which graph with the same degree distribution 3, 2, 2, 2, 1 would give the largest small s. If you look at all these terms, you see that the best way is a simple swap: instead of connecting 3 to 4 and 1 to 5, node 1 should connect to node 4, because then we get a term of 3 times 2 instead of 3 times 1. So switch the link (1, 5) to (1, 4), and then switch the link (3, 4) to (3, 5). You can verify that the resulting right-hand graph, call it G2, has the exact same degree distribution 3, 2, 2, 2, 1.
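The degree products above can be checked with a short script — a minimal sketch, with the edge lists and degrees taken from the example and the variable names my own:

```python
# Degrees from the example: node 1 has degree 3; nodes 2, 3, 4 degree 2; node 5 degree 1.
deg = {1: 3, 2: 2, 3: 2, 4: 2, 5: 1}

# G1 is the left graph; G2 results from swapping (1,5)->(1,4) and (3,4)->(3,5).
G1 = [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4)]
G2 = [(1, 2), (1, 3), (1, 4), (2, 4), (3, 5)]

def small_s(edges, deg):
    """s(G) = sum of d_i * d_j over all connected pairs (i, j)."""
    return sum(deg[i] * deg[j] for i, j in edges)

print(small_s(G1, deg))  # 23
print(small_s(G2, deg))  # 24
```

Both graphs have the same degree sequence, yet the swap raises s by 1, exactly as argued on the slide.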
Again, it's not exactly Pareto, but it suffices to illustrate our point about the tradeoff between S(G) and P(G). You can carry out the computation: small s(G2) is 24. Big S(G1) is small s divided by the maximum s you can get over all graphs with this degree distribution, which is 24, so S(G1) = 23/24. And clearly S(G2) = 24/24 = 1. One is less than one, the other is one. Therefore graph G1 is less likely to be generated at random than G2, just as the Internet's topology is less likely to be drawn at random than a preferential-attachment type of topology.

So now the question is: what about performance, P(G1) and P(G2)? Let's run this exercise for P(G1); P(G2) is very similar. We just calculated the likelihood; now here's the performance. First of all, we have to write out the routing matrix R, which is 5-by-10, because there are ten possible sessions and five nodes; the variables are x1, x2, ..., x10. Let's say the first session is the one that goes from node 1 to node 2. Its column is simply (1, 1, 0, 0, 0), because the first session goes through nodes 1 and 2 and no other nodes. The second session, say from 1 to 3, is again very simple: (1, 0, 1, 0, 0). You can fill out the rest of the routing matrix: the next session, 1 to 4, is (1, 1, 0, 1, 0); then 1 to 5 is (1, 0, 0, 0, 1), since node 1 is directly connected to node 5. Then comes the session from 2 to 3. To go from 2 to 3 you have to pass through node 1, so the fifth column — session number 5, from node 2 to node 3 — traverses the first three nodes 1, 2, 3.
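Because the example is so tiny, the claim that 24 is the maximum s for this degree distribution can be verified by brute force over every five-link graph on five nodes — a sanity-check sketch, not something that scales to real topologies:

```python
from itertools import combinations

nodes = [1, 2, 3, 4, 5]
all_pairs = list(combinations(nodes, 2))  # the 10 possible undirected links

best = 0
for edges in combinations(all_pairs, 5):        # every graph with 5 links
    deg = {v: 0 for v in nodes}
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    if sorted(deg.values()) == [1, 2, 2, 2, 3]:  # same degree distribution
        best = max(best, sum(deg[i] * deg[j] for i, j in edges))

print(best)  # 24, so S(G1) = 23/24 and S(G2) = 24/24 = 1
```

This confirms the normalization used in the lecture: no rewiring of the degree sequence 3, 2, 2, 2, 1 can beat s = 24.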
So the entries in that column are (1, 1, 1, 0, 0), and you can fill out the rest. Now, with this routing matrix constructed, we require that the matrix times our variable vector be no more than the total bandwidth available per router at each of the five nodes. Let's say the B_k are exactly the same as the degrees — they don't have to be, this just makes the numbers simpler — so they are 3, 2, 2, 2, 1. The total number of packets a node can process is directly proportional to its degree, assuming each unit of degree has a unit of processing capacity. So that is the first constraint: R x <= B.

The second constraint is the definition that each x is ρ times the demand. We can make up some data for the y values: say the y vector is (5, 2, 4, 4, 2). Then for each of the ten sessions, you look at the source and the destination, multiply the two numbers, and you get the demand. Subject to these two constraints, maximize the sum of the x_i — that is the objective function for performance.

If you solve this very simple linear optimization problem, you get the following answer. The maximizing x* is some vector that doesn't actually matter; ρ* turns out to be 0.033, but that's not key either. The key is the maximized sum of the end-to-end session throughputs: the performance with graph G1 is 3.73 in some unit, let's say gigabits per second. Now carry out the exact same procedure for graph G2, and you get a different number. It turns out the performance for G2 is 3.03 gigabits per second, which is clearly and significantly smaller than 3.73.
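With x_k = ρ * y_src * y_dst for every session, the linear program collapses to finding the largest ρ that keeps every node within its capacity. The sketch below hardcodes one consistent choice of shortest-path routes for each graph — the first five columns follow the lecture, the remaining routes are my assumption where the lecture leaves ties unspecified:

```python
# Node weights y; session demand is y[src] * y[dst]; capacities B equal the degrees.
y = {1: 5, 2: 2, 3: 4, 4: 4, 5: 2}
B = {1: 3, 2: 2, 3: 2, 4: 2, 5: 1}

# Each session mapped to the set of nodes its path traverses (assumed tie-breaks).
routes_G1 = {
    (1, 2): {1, 2}, (1, 3): {1, 3}, (1, 4): {1, 2, 4}, (1, 5): {1, 5},
    (2, 3): {1, 2, 3}, (2, 4): {2, 4}, (2, 5): {1, 2, 5},
    (3, 4): {3, 4}, (3, 5): {1, 3, 5}, (4, 5): {1, 2, 4, 5},
}
routes_G2 = {
    (1, 2): {1, 2}, (1, 3): {1, 3}, (1, 4): {1, 4}, (1, 5): {1, 3, 5},
    (2, 3): {1, 2, 3}, (2, 4): {2, 4}, (2, 5): {1, 2, 3, 5},
    (3, 4): {1, 3, 4}, (3, 5): {3, 5}, (4, 5): {1, 3, 4, 5},
}

def throughput(routes):
    # R x <= B with x = rho * y pins down rho* = min over nodes v of
    # B[v] / (traffic through v per unit rho); throughput is rho* * sum of demands.
    load = {v: 0.0 for v in B}
    for (s, d), path in routes.items():
        for v in path:
            load[v] += y[s] * y[d]
    rho = min(B[v] / load[v] for v in B)
    return rho * sum(y[s] * y[d] for (s, d) in routes)

print(round(throughput(routes_G1), 2))  # 3.73
print(round(throughput(routes_G2), 2))  # 3.03
```

For G1 the binding node is node 5 (ρ* = 1/30 ≈ 0.033, matching the lecture); for G2 the hub structure makes node 3 the bottleneck, which is exactly the performance loss discussed next.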
And the reason is quite clear: in order to put the most highly connected node at the center of this very small graph, you are actually creating a performance bottleneck for supporting throughput. Now, if I plot the locations of G1 and G2 on the plane of performance P versus normalized likelihood S, I see that they follow the same trend as the Internet versus preferential attachment — except this is a much, much smaller example, so we get a very small dynamic range. Nonetheless, you see a qualitative tradeoff between performance and likelihood. So that is our small example, and it concludes this part of the lecture.

In the advanced material we'll say a lot more about both preferential attachment and constrained optimization as generative models of scale-free networks. There is a very rich and interesting background here. The history goes all the way back to Zipf, and then to Pareto and Yule, who discovered that for many phenomena — whether it's the sizes of cities or the frequencies of words in a language — the k-th most popular item, by histogram tally, often shows up with a frequency proportional to 1/k. Now, clearly, that is a scale-free phenomenon: it doesn't matter what k is, the chance of it showing up is 1/k, so there is no specific characteristic scale — just like the Pareto distribution, and you can see Pareto in the history of this evolution. Then, back in the 1950s, the question was: these are all observations, but how do you explain them? There was a debate between Mandelbrot and Simon, and that debate is very similar to the debate of about ten years ago, in the early 2000s, between preferential attachment and constrained optimization — except the context in the 1950s, of course, was not Internet router topology.
Very briefly, before going to the advanced material: preferential attachment says that when new nodes are added to a graph, they are more likely to connect to nodes that already have many connections. The rich get richer; the more connected get even more connected. Conceptually, it says that this self-organizing growth mechanism of the graph leads to a power-law distribution. Mathematically, it turns out that sampling in proportion to degree leads to a difference equation — we'll see this in the advanced material — whose solution gives an equilibrium that satisfies a power law.

In contrast, constrained optimization generates power-law distributions in a very different way: the graph topology is actually designed, with some objective function and some constraints in mind, and yet the resulting topology can also show a power law. Conceptually, it says that a power law is the natural outcome of constrained optimization, with the objective driven by economics, for example, and the constraints driven by technology. Mathematically, it says that either entropy maximization or what's called iso-elastic utility maximization, under linear constraints — we'll see these in the advanced material part of the reading for this lecture — either of them, among many others, gives rise to a power law.

So there lie the main differences between preferential attachment and constrained optimization. The question is: which one should you use? Well, it depends on which one gives you the predictive power you need on other properties — not the power-law distribution anymore, because they both generate that — other properties of the graph, such as robustness to a targeted attack on the highly connected nodes. Would that break the network, or just an edge of it? Whichever generative model can also predict the other properties you care about should be the one you use. And therein lies an interesting story about the pitfalls of generative models.
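The "rich get richer" growth mechanism can be sketched in a few lines — a toy simulation of attachment proportional to degree, not the exact difference-equation model from the advanced material; the seed, sizes, and stub-list trick are my own choices:

```python
import random

random.seed(1)

# Toy preferential attachment: start from a triangle, then each new node
# attaches one link to an existing node chosen with probability proportional
# to its current degree (sampling from a list of edge endpoints does that).
stubs = [0, 1, 1, 2, 2, 0]    # edges (0,1), (1,2), (2,0): each endpoint listed once
degree = {0: 2, 1: 2, 2: 2}

for new in range(3, 1000):
    target = random.choice(stubs)   # picked proportionally to degree
    stubs += [new, target]
    degree[new] = 1
    degree[target] += 1

# The early, well-connected nodes accumulate far more links than a typical node.
print(max(degree.values()), sorted(degree.values())[len(degree) // 2])
```

Running this shows a heavily skewed degree distribution: the maximum degree far exceeds the median, the qualitative signature of the power laws discussed above.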
Around 2000 there was much talk about the Internet having an Achilles' heel. It turns out that if you look at empirical data — go talk to AT&T, or Cisco and Juniper — you'll realize that this is just not supported by the facts. This highlights the importance of validating any self-consistent theory against empirical data from the field you are talking about. It also highlights the risk of over-generalizing something called "network science," which is a fuzzy and mostly meaningless term unless you provide some concrete meaning to it. By over-generalizing toward some universally true property, it actually loses domain-specific functional models, and thereby often loses predictive power — even leading to confusing, misleading conclusions, such as this non-existent Achilles' heel.

So we have seen generative models: reverse-engineering of network topology, small worlds, and scale-free networks; later we'll also look at reverse-engineering of network protocols. We have also seen the interplay between topology — a graph and its corresponding matrices — and functionality, whether that's navigation or search, routing or robustness to a specific kind of attack. This interplay between the two dimensions of what we call a network, whether socio-economic or technological, will continue to show up in the rest of the course, as we conclude this point: the midpoint of our twenty questions about our networked life.