Okay, hello. So, now we're looking at Centrality

Measures. We're going to talk about positioning

nodes inside a network, and understanding how they're positioned.

In terms of the way we've been going through our definitions and trying to

understand the structure of networks, what we've done so far is talk about

basic patterns of networks: degree distributions, path lengths, things like

that. We talked a little bit about homophily and the fact that there can be

segregation among nodes. We talked a little bit about local patterns, things

like clustering and related concepts of transitivity. We'll talk a little bit

about how many links are actually supported, meaning the two nodes have friends

in common, and so forth, as we go forward in the course.

So these are things which characterize the network itself, and we'll also be

very interested in understanding how different nodes are positioned in a

network, and so how we can talk about whether a node is important or not, or

influential, or central, powerful, et cetera. When we describe a position in a

network, there are different aspects of individual characteristics, some of

which we have already talked about: how connected a node is, how clustered its

friends are, its distance to other nodes. But more generally, trying to capture

centrality, influence, and power is going to build on specific definitions

which keep track of a node's position. So, in terms of looking at node

centrality, the most basic measure in trying to figure out how important a node

is, or how influential it is, is just how connected it is, and that's captured

directly by degree. So degree captures some notion of connectedness of a node.

In order to make it a measure between zero and one, we can just divide through

by n minus 1, the most possible links I could have. That tells us what fraction

of people I am connected to compared to how many I could be connected to.

So if we look, for instance, at the Medici, the Florentine data we had before,

what do we see for the different families? We see that the Medici have a degree

of six, the Strozzi a degree of four, the Guadagni a degree of four, and the

Albizzi a degree of three. So some of the most important families, such as the

Medici, were better connected in terms of having higher degree.

It's not an enormous difference, but there's some difference there.

So degree is capturing some of what goes on.
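
As a rough sketch of that normalization, here is the degree-over-(n minus 1) calculation in Python, using the four degrees quoted above. The value n = 15 is an assumption (the families in the connected part of the network, leaving out the isolated Pucci); the lecture does not state n at this point. The names `degrees` and `normalized_degree` are only illustrative.

```python
# Degrees quoted in the lecture for four Florentine families.
degrees = {"Medici": 6, "Strozzi": 4, "Guadagni": 4, "Albizzi": 3}

# Assumption: n = 15 families in the connected part of the network
# (the isolated Pucci left out), so the most links a family could have is 14.
n = 15


def normalized_degree(degree, n):
    """Fraction of the n - 1 other nodes that a node is linked to."""
    return degree / (n - 1)


degree_centrality = {
    family: normalized_degree(d, n) for family, d in degrees.items()
}

for family, value in degree_centrality.items():
    print(f"{family}: {value:.3f}")
```

Whatever n is, the ordering of the families does not change, since every degree is divided by the same number.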

But degree is actually going to miss a lot of what's going on in a network.

For instance, here node three has the same degree as node one or node two, and

in some sense we might, just looking at the network, think of three as being

less central than some of the other nodes. How do we capture the fact that

degree isn't really capturing all of position?

It's just saying how big your local neighborhood is.

It's not saying where you are positioned in the network, or how central you are

in a deeper sense. So in order to get at things like centrality, we'll have

different types of things that we can think about capturing.

So what I've done here is break things down into four different categories.

Degree is really just capturing basic connectedness. Another thing we might

care about is how close you are to other nodes. So closeness centrality

measures, and what we'll look at in terms of decay, capture the ease of

reaching other nodes: how far am I, on average, from other nodes?

Betweenness, which we talked about very briefly and will look at in a little

more detail, captures a node's role as an intermediary or connector. So do

other pairs of nodes have to go through me in order to reach each other?

That's a very different concept than thinking about how close I am to somebody

else. This is asking, am I important as a connector of other individuals?

Then there will be a whole series of influence, or prestige, or eigenvector

kinds of notions, which try to capture the idea that you are important if your

friends are important.

So being well connected is something which depends on the connectedness of

one's friends. This is the old saying: it's not what you know, but who you

know. It's not necessarily important to have more friends, but to have well

positioned friends, and we'll take a look at the definitions which capture

that. So we have four different concepts of centrality or power, and we'll try

to incorporate these into different definitions and see the differences between

them. One thing to emphasize here: there are lots of different measures, and

there's not one which is best, in the sense that it dominates. These things are

capturing different ideas, different aspects of a position, and some of them

are going to be more important in making predictions in one setting than

another. So which one of these we actually use is going to depend very much on

the context as to which one is most important.

Okay. So let's have a look at closeness centrality. One basic definition of it

starts from the length of the shortest path between two nodes i and j.

We can sum that across all j: how far am I away from all the other nodes?

Then we look at n minus 1 over this sum, and that keeps track of relative

distances to other nodes. The idea here is that if these distances are very

large numbers, then my closeness centrality is going to be a very small number:

I'm dividing by larger numbers, which makes it small. If I'm at distance 1 from

everybody, then this normalizes to 1, and otherwise it gets smaller as I get

further and further away. This scales directly with distance, so being twice as

far from everybody makes me half as central. If I double all these distances,

I'm going to get half. If I quadruple them, then I'll get a quarter.

And so forth.

So it's scaling with the distance. When we look at closeness centrality back in

our Medici data again, we ignore the Pucci, because if we included them and

thought of everybody as being infinitely distant from them, then everybody

would have closeness centrality of zero. So if we ignore them and just look at

the remaining network, then the Medici are 14 over 25, the Strozzi 14 over 32,

the Guadagni 14 over 26, and so forth. So here closeness centrality gives us

some differentiation between different families. It doesn't sort things

enormously, but it gives us some feeling for who's further and who's closer.
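
To make the closeness calculation concrete, here is a minimal sketch in Python. It finds shortest-path lengths by breadth-first search and then takes n minus 1 over the sum of distances. The star network is a hypothetical example, not the Florentine data, and the function names are just illustrative. Returning zero when some node cannot be reached follows the Pucci remark above.

```python
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search: number of links on a shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(graph, node):
    """(n - 1) divided by the sum of distances from node to every other node."""
    dist = shortest_path_lengths(graph, node)
    n = len(graph)
    if n < 2 or len(dist) < n:
        # Some node is unreachable (infinitely distant), so closeness is 0.
        return 0.0
    return (n - 1) / sum(dist.values())


# Hypothetical star network: a hub linked to four leaves.
star = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}

hub_closeness = closeness(star, "hub")  # distance 1 from everybody
leaf_closeness = closeness(star, "a")   # distances 1, 2, 2, 2
medici_closeness = 14 / 25              # the fraction quoted in the lecture

print(hub_closeness, leaf_closeness, medici_closeness)
```

The hub is at distance 1 from everybody, so its closeness normalizes to 1, while a leaf has distances summing to 7 and gets 4 over 7.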

Another measure that we could use instead is what's known as decay centrality.

This is designed to capture the idea that what I might get is value from being

connected or indirectly connected to other nodes. So I might have some value

from a friend, a different value from a friend of a friend, and so forth.

The idea is that there's going to be some factor delta, which is generally less

than 1 and bigger than 0. The centrality of a node i under this decay notion is

then going to be: look at the distances to other nodes, raise delta to those

powers, and sum. So for a direct friend I get delta.

For an indirect friend, at distance 2 from me, I get delta squared; at distance

3, delta cubed. So if delta were 0.5, then we're going to get 0.25 for distance

2, 0.125 for distance 3, and so forth. If delta were 0.9, then these numbers

would be much closer to each other. If delta were 0.05, then delta squared

would be 0.0025, and so forth, so it would scale down more dramatically.

So as delta gets near 1, this just counts all the people that I can reach,

directly or indirectly. As delta goes close to 0, this is essentially going to

become degree centrality: all it's going to do is really emphasize the direct

connections, and all the other ones are going to be much smaller.

Somewhere in between, it's going to weight indirect connections relative to

direct connections. So you can think of varying delta as choosing how important

it is to be close to many people, or how much I get from indirect connections

of varying lengths.
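
A small sketch of how delta changes things, in Python. The list of distances describes a hypothetical node (two direct friends, three friends of friends, one node at distance 3); it is not taken from the lecture's data. The function `decay_centrality` simply sums delta raised to each distance.

```python
def decay_centrality(distances, delta):
    """Sum of delta ** distance over the other nodes a node can reach."""
    return sum(delta ** d for d in distances)


# Hypothetical node: 2 direct friends, 3 friends of friends,
# and 1 node at distance 3.
distances = [1, 1, 2, 2, 2, 3]

# Weights for distances 1, 2, 3 under the values of delta mentioned above.
weights = {delta: [delta ** d for d in (1, 2, 3)] for delta in (0.5, 0.9, 0.05)}

middle = decay_centrality(distances, 0.5)  # 2 * 0.5 + 3 * 0.25 + 0.125

# Delta near 1: the sum approaches the number of reachable nodes (6 here).
near_one = decay_centrality(distances, 0.999)

# Delta near 0: only direct links matter, so the sum divided by delta
# approaches the degree (2 here).
near_zero_ratio = decay_centrality(distances, 0.001) / 0.001

print(weights)
print(middle, near_one, near_zero_ratio)
```

Dividing by delta in the last step is only there to make the comparison with degree visible; the raw sum itself goes to zero as delta does.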

Okay. So here's a network, sort of a bow tie kind of network.

We've got node 4 in the middle, node 3 over here, node 1 over here.

Basically there are three different types of nodes: nodes two, six and seven

all look like node 1, and node five looks like node 3.

So we're going to just keep track of these three nodes and their centralities.

If we look at degree centrality, then node 3 is the most important in some

sense. It's got three connections as opposed to two for the others.

With closeness centrality, node 4 is actually the closest. So here it wins out

in terms of being able to reach all the other ones in shorter paths.

Decay centrality depends on delta. If we use 0.5, then these two, nodes 3 and

4, are basically equal to each other. If we raise it to 0.57, then node four

ends up doing a little better. If we drop it to 0.25, so that more immediate

connections matter, then node three starts to do better. So you can begin to

see that these different definitions are going to give different positions in

terms of importance to different nodes, depending on which kind of centrality

definition you're looking at. You can normalize decay centrality by dividing

through by the largest value a node could have, which comes from being at

distance 1 from every other node. So it's n minus 1 times delta, the least

possible decay. And that gives you the numbers we looked at before.

So normalizing, you get different numbers here, but that's just a readjustment

by a normalization.
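
Here is a sketch in Python that puts the three measures side by side on a bow tie network. The edge list is an assumption: two triangles, {1, 2, 3} and {5, 6, 7}, joined through node 4. That matches the description (node 4 in the middle, node 3 with three links, nodes 2, 6 and 7 looking like node 1), but the lecture does not list the links. The normalization divides by n minus 1 times delta, as described above.

```python
from collections import deque

# Assumed edge list for the bow tie network: two triangles {1, 2, 3} and
# {5, 6, 7} joined through node 4. The lecture does not list the links.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (6, 7)]

graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)


def distances_from(source):
    """Breadth-first search distances from source to every other node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    del dist[source]  # keep only the other nodes
    return dist


def degree(node):
    return len(graph[node])


def closeness(node):
    return (len(graph) - 1) / sum(distances_from(node).values())


def decay(node, delta):
    return sum(delta ** d for d in distances_from(node).values())


def normalized_decay(node, delta):
    # Divide by (n - 1) * delta, the value for a node that is at
    # distance 1 from every other node.
    return decay(node, delta) / ((len(graph) - 1) * delta)


for node in (1, 3, 4):
    print(
        node,
        degree(node),
        round(closeness(node), 2),
        round(decay(node, 0.5), 3),
        round(decay(node, 0.25), 3),
        round(normalized_decay(node, 0.5), 3),
    )
```

Under this assumed edge list, node 3 has the highest degree, node 4 has the highest closeness, nodes 3 and 4 tie on decay at delta of 0.5, and node 3 pulls ahead at 0.25.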

Okay, so looking back at the betweenness centrality that we looked at before,

here is now the formal definition of betweenness centrality, due to Freeman.

The idea is that when we look at two nodes, i and j, we can keep track of the

total number of shortest paths, the number of geodesics, between i and j.

Then for any k that's not equal to i or j, we can ask: what's the number of

those shortest paths that k lies on between i and j?

So if we're looking for the betweenness centrality of a node k, we look at all

pairs i and j that aren't equal to k, and add up, for each pair, the number of

shortest paths between i and j that k lies on compared to the number of

shortest paths that exist between i and j. Then we can normalize that by the

number of pairs of other nodes that we could be considering; the most you could

do is to be on the shortest paths of all of those.

So we normalize by n minus 1 times n minus 2 over 2, which is n minus 1 choose

2, the number of pairs of other nodes that are out there.
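
The definition can be sketched directly in Python. This is a straightforward pair-by-pair calculation, not an efficient algorithm and not necessarily how the numbers in the lecture were produced. It relies on the fact that the number of shortest i-to-j paths through k is the number of shortest paths from i to k times the number from k to j, whenever k sits on a geodesic. The two small networks are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_counts(graph, source):
    """Return (dist, sigma): shortest-path lengths from source and the
    number of distinct shortest paths to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                sigma[neighbor] = 0
                queue.append(neighbor)
            if dist[neighbor] == dist[node] + 1:
                # node is a predecessor of neighbor on some shortest path
                sigma[neighbor] += sigma[node]
    return dist, sigma


def betweenness(graph, k):
    """Average, over pairs i, j different from k, of the fraction of
    shortest i-j paths that pass through k. Assumes at least 3 nodes."""
    info = {node: bfs_counts(graph, node) for node in graph}
    dist_k, sigma_k = info[k]
    others = [node for node in graph if node != k]
    total = 0.0
    for i, j in combinations(others, 2):
        dist_i, sigma_i = info[i]
        if j not in dist_i:
            continue  # i and j are not connected, so there are no paths
        if k in dist_i and dist_i[k] + dist_k[j] == dist_i[j]:
            # paths through k = (paths from i to k) * (paths from k to j)
            total += sigma_i[k] * sigma_k[j] / sigma_i[j]
    n = len(graph)
    return total / ((n - 1) * (n - 2) / 2)


# Hypothetical 4-cycle a-b-c-d-a: two geodesics between a and c.
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}

# Hypothetical star: every pair of leaves must go through the hub.
star = {"hub": ["p", "q", "r", "s"], "p": ["hub"], "q": ["hub"],
        "r": ["hub"], "s": ["hub"]}

print(betweenness(cycle, "b"), betweenness(star, "hub"), betweenness(star, "p"))
```

In the cycle, node b lies on one of the two shortest paths between a and c and on none for the other two pairs, so it gets one half divided by three pairs. In the star, the hub lies on every shortest path between leaves, which is the most a node can do.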

Okay. So when we look at that calculation, we're basically asking what fraction

of shortest paths between other nodes k lies on. When we looked at that, what

we saw was that the Medici have a much higher number than the other families in

terms of their centrality. Betweenness centrality captures this idea that, at

this point in time, if other families wanted to deal with each other, they

might have to go through somebody that they were connected with.

So if it's difficult to enforce contracts, then maybe you have to go through

somebody you know in order to deal with somebody you don't know.

And then the Medici could be powerful intermediaries connecting other families

on paths between them.