Hi, this is Mat again. And now we're ready to start talking about how we can represent networks and talk about their properties. And in terms of simplifying the complexity that we're facing in describing networks we're going to think about different kinds of patterns that might be. So one is going to be that there's global patterns of networks. So overall big picture items. So, how are the different constructiveness of individuals distributed through the society? Are there some people who are really well connected and, and serve as hubs and other people who are not so well connected, or is everything very evenly distributed, so that's something. Path links. How far does it take to get on an average from one node to another. There's going to be segregation patterns, so if we begin to think of nodes as having characteristics, say if people's ages, their genders and so forth, do we see separations between nodes of different types. Or are things fairly well integrated. local patterns. do we see tight clusters of nodes that are all tightly connected to each other, cliques. Do we see that if I'm connected to somebody else and they're connected to a third person that I also tend to be connected to that third person. Transitivity. do relationships have friends in common? Are they supported? so we'll look at local patterns, little, you know, zooming in on particular parts. Well also talk about nodes positions in networks so how central is somebody, how influential are they, what does their neighborhood structure look like? So there will be different ways that we'll be talking about networks overall and we'll also be looking at nodes in positions in networks and other kinds of things so there's going to be a series of definitions that we have to go through to keep track of these things. So the basics in terms of representing networks in, there's going to be some set of nodes, vertices, players, depending on what literature you're looking at these will have different names. I might go back and forth between the names and the course. Sometimes I'll call them nodes, sometimes I'll call them agents, sometimes I'll call them players, vertices, etcetera. But there's, tend to be some finite number, little and will be our typical characterization of how many there are. And one basic way of representing a network is just by what's called the adjacency matrix. So it's going to be a matrix of zeroes and ones so an n by n matrix, and what it indicates is 2 nodes connected to each others. So gij equals 1 indicates a link or a tie or an edge between 2 nodes i and j. Now generally in less otherwise stated in the course we'll be dealing with ones which are not directed, so if i is connected to j, than j is connected to i. So if we're friends with each other, we're both friends it's a mutual relationship. Or if we're allies with each other, we're both allied to each other. It can't be that one country is ally to another, and not vice versa. So we'll tend to look at it that, that way and we'll tend to, to work generically with zeroes and ones as, as the structure of it. So there's either a relationship or not. But we could allow for directed networks. We could allow for weighted networks. so when we looked at the financial relationships, so the amount of debt that was held inside one country of the sovereign debt of another country. That's going to tend to be a directed network and it's going to be a weighted network. So it's going to have different intensities on things. when we think about an alternative notation it's going to be very useful for representing networks. Sometimes we'll just use instead of thinking of an adjacency matrix we'll think of, of representing a graph or network. By just listing all the relationships that are present. So a notation will, will have, is it will say that i and j is in g, if that means that there's a link between nodes i and j. So very simple, succinct notation will just keep track of the sets of links that are present and depending on whether these are ordered or not ordered they'll be directed or not directed. so a network is going to be a pair of a set of nodes. And either an adjacency matrix, or a list of all the links that are present depending on the particular context. Which was it's easiest to represent the network. Okay. so, basic definitions. 1 thing we're going to want to keep track of is, sort of. ways of navigating through a network, other known as walks or paths. a walk in a network from, say, node i1 to node, ik, is going to be a sequence of nodes. i2, i2, blah, blah, blah, through ik, and the sequence of links, i1 is connected to i2, i2 to i3, and so forth, ending up at k, such that each one of those links is in the network g, okay? So a walk in a network from one node to another is a sequence of links that take you from that first node to the last node. so [INAUDIBLE] often in, in this kind of setting it's just going to be convenient to represent it just as the corresponding sequence of nodes such that we know that each subsequence pair are connected in the network g. So in terms of a picture well before, before I go to a picture, let's also talk about paths, cycles and, and geodesics. So a walk is going to be some set of, of links, each connecting to another node but they can possibly cycle back on each other. A path is going to be a walk where each node is distinct. So we start at i1. We go to a new node, i2. Then we go to some node, i3, which is not in the previous sequence and so forth. Until, eventually, we reach ik. A cycle is going to be a walk where we end up back at the node that we started at. So the n node is the same as the starting node. one other term that's going to be very useful, geodesic. A geodesic is a shortest path between two nodes. And shortest means we just count how many links there are in that path and we want to find what's the fewest number of links we need to go from. Node say 1 to node 7. How many links do we need to get from one to another? So in terms of pictures here's a set of networks on 7 nodes and the first one represents a path and a walk. From node 1 to node 7, right? So we go from 1 to 2, 2 to 3, right? So we start 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 to 6, 6 to 7. Okay, so that's that's going to be a path which is also in this case, a walk. if we look at a walk that's not a path we could have from 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 back to 3, so now we've hit 3 twice and then gone to 7. So that's a walk that's not a path. And if we wanted the geodesic in this case from 1 to 7. Then the shortest path would actually have gone just 1, 3 to 7, right, so this is a path which is longer than a geodesic. When we look at cycles, 1 to 2, 2 to 3, 3 to 1, that's a cycle, it's also known as a simple cycle because it just, the only node that repeats itself is the first and last node. And a cycle and a walk, which is a more complicated one, is we could go from 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 back to 3, back to 1. We cycled, but we actually cycled through 3 and 1 in that case. So it's a more complicated cycle. So these are different terms, we'll talk about cycles as we go through. They're going to be important. Paths are going to be important. Walks are going to be important,. So these are distinct items in that works and it's going to be important to keep these definitions and this terminology in our heads as we go through. So if we want to count how many walks there are in a network. Let's look at a simple example here on 4 nodes. So this is the network we're looking at. So 1 is connected to to 2 and 4, 3 is connected just to 4, 4 is connected to everybody. So that is represented here, in this particular adjacency matrix. And generally, we're not going to have nodes connected to each other. So when we look at the diagonal, we'll have diagonal being 0's. there'll be some applications where it's going to be useful to have nodes connected to each other, or to themselves. For instance, if. If we're doing learning, I might be able to learn from myself that will be one application where we'll have non zero entries on the diagonal. But generally we're going to have situations where this will be the kind of, matrix we're looking at. Now if we square this matrix, so we raise it to the second power, so what we're looking at here is, so there's a relationship between 1 and 2 captured by this entry right here of a 1. And so that indicates that 1 is connected to 2. And so if we ask how many ways there are, how many walks of length 2 there are from i to j. Then, the, when we square the matrix, that actually gives us the answer of exactly how many walks there are of length 2 from node, whatever to whatever. So this is the number of walks of length 1. This is the number of walk, walks of length 2. If you take it to the 3rd power, you'll get the number of, of walks of length 3. And here it says that there's two different walks of going from node 1 to node 1. I could go up to 2 and then back. I could go to 4 and back, and you get that by looking, so why is the 2 coming out in this part of the matrix? When you multiply the matrix it says I could go from 1 to 2 and then 2 back to 1. So that gives us one entry. If I could go from 1 to 4, and then 4 back to 1, and when you multiply the matrix, it'll pick up the 1 times the 1 plus the 1 times the 1. It gives us a 2 in this particular entry. So when you multiply the matrix, it gives you how many walks of different path times. So now we've got you know path of length 2 from each node to each other node. so it's impossible to get from node 3 to node 4 in a walk of length 2. All the other ones you can get by actually except, and also from 4 to 3. But all the other ones are possible. And if we keep raising these to powers. Then we end up with you know, how many walks of length 3 from node i to node j? And so forth. And so this is a very useful thing to keep track of because it's going to help is in defining centrality, it's going to help us in keeping track of diffusion processes, and other kinds of things where we look at information moving subsequently from node to node in a network. Okay. So another thing that's going to be very important to keep track of in a, in a network is its component structure. What are its components, these are the connected sub graphs that make up a network. So we'll say that a network N, g is connected if there's a path between every two nodes. And a component of a network's going to be a maximally connected subgraph. What do we mean by a, a maximally connected subgraph? We're going to mean that the we look at a subgraph that's a subset of nodes and a subset of the, of the links in the, the network, so that we have those being, of course, bonding subsets of the original nodes and original set of links. We want this to be path connected. So from every node i in N prime, and every node g in g in N prime there exists a path in G prime connecting those 2. So this is a connected subgraph. And if we look at somebody, some node who's in N prime and any link in the original network then any link that we find in the original network if there's some node j at the end of it, then ij has to be in g prime as well. Okay. And so what does that, what does that mean? in terms of. A picture when we look at components then this part of the network is a component. But this part of the network is not a component. Why is it not a component? It doesn't satisfy that last part of a definition. Because we've got 5 in there, 5 is connected to 1 and yet we didn't include 1 in our, our little picture here. So we have to include that and, and so it's a maximally connected subgraph that would you have include 1, 4, 3 and 5. And it would have to include all the links there. So component is going to be a maximally connected subgraph. Another component here it would be 2, another component 6 and 10 together with air link And then, 7, 8 and 9 together with air links. So this is a network with four different components and we can keep track of those. So a simple concept that's going to be useful in understand how, again, diffusion works, learning, etcetera, in understanding how separate the network is in terms of different component structures. [BLANK_AUDIO] Okay, so that, that does it for this lecture. our next lecture is going to start looking at path lengths and diameters.