0:00

Hi folks, so now we are going to talk about

Â models of networks that allow us to capture interdependencies.

Â And these are often known as exponential random graph models.

Â So, we went through these different forms of

Â models, last we talked about stochastic block models.

Â And now we are going to talk about the popular set

Â of models which are known as exponential random graph models.

Â And we are also going to talk about some variations

Â on these kinds of models, some new types of models.

Â Which are easier

Â to estimate than exponential random graph models.

Â And, so I'll talk to that, after we get to, to the basics.

Â Okay.

Â So first of all, a quote from Jacob Jevy Moreno and Helen Hall Jennings from 1938.

Â And this sort of, you know, gives us the idea that when we're looking at

Â situations with, we're looking at social interactions and people

Â are interrelated.

Â We're not going to be able to look at just binary relationships diads.

Â We're going to have to be looking at bigger configurations.

Â And, so that is something that is driven home by their original, quote here.

Â That a pertinent form of statistical treatment is one which deals with social

Â configurations as wholes, and not single series

Â of facts, more or less artificially separated

Â from the total picture.

Â And a very interesting thing here, I pulled a, a picture from Moreno in 1932.

Â So, actually the Moreno is also known as the father of sociometry.

Â He's a social psychologist, and here, he was actually mapping

Â out ties between individuals in different houses, in New York.

Â He was at, at Columbia University in 1930s.

Â And this was one of

Â the first sociograms or situations where we've actually

Â mapped out a network in a graphic form

Â of a social interactions between individuals and one

Â of the first ones that I know of.

Â Okay, so, so lets talk about this class of models.

Â So, the previous models we have talked about are not great

Â at fitting data with lots of clustering and other kinds of dependencies.

Â And in particular, in testing many social and economic kinds of theories,

Â which will give us reasons for which people might

Â be, interacting with each other in certain particular, forms.

Â And, so the idea here is that you know, the link between two

Â individuals i and j could depend on the presence of a friend in common.

Â Does, j know k and i know k as well?

Â That's something we want to capture.

Â The difficulty with this is, once we allow one link

Â to depend on another link, then we open a Pandora's

Â box, where now all the links could be inter-dependent.

Â So if, if i and j dependent whether they

Â have a friend in common and the, that friend

Â depends on whether they have other friends in common

Â and also depending on whether i and j are there.

Â Everything gets put back together and so

Â we have to specify all of the interdependencies.

Â And Frank and Strauss in 1986 and they started

Â working with a class of models which became known

Â as p* models and, and were most of them imported into the social networks,

Â literature and, and force by Wasserman and, and Pattison in the 1990s.

Â 3:11

And have, have since become known as exponential random graph models.

Â And let's start by just going through an example of this.

Â And, and this is sort of the simplest possible complication that we could have

Â to given, model where instead we had links before.

Â Now what we're going to do is allow the probability of the network to depend not

Â only on the number of links present, but

Â also depend on the number of triangles present.

Â Okay, so we could say that links with

Â triangles are more likely to appear than links

Â 3:41

networks without triangles, okay?

Â So we're just going to complicate things on one very simple dimension.

Â But one that actually turns out be quite powerful in,

Â in terms of enriching the picture of, that we have.

Â And in particular, this idea that we could have that,

Â a link depends on whether you have friends in common.

Â That's going to, to give possible precedence to more triangles

Â than would appear in something like a Erdos-Renyi random graph.

Â And so, what's the idea behind an exponential random graph model.

Â Now we

Â have the, the probability depends on the number of

Â links we have, the number of triangles times some parameters.

Â So, if we set this parameter to zero, then then that would

Â zero this out and, and all,a ll that would matter was links.

Â But if that's not zero, then we have a situation, where now we have

Â both links and triangles mattering. In order to make

Â things into probabilities.

Â What we're going to do is then, make sure that this thing turns out to be something

Â between what has to, have to be non-negative and lie between zero and one.

Â And so the first thing we can do is instead

Â of just having a probability be proportional directly to this.

Â We'll exponentiate that thing, so it's always going to be non-negative.

Â This is a standard trick

Â in, in statistics, work with the exponential family.

Â So now we have something which is non non-negative

Â or generally going to be positive, so the probability of a

Â graph is proportional to some exponential function of something

Â about the number of links and the number of triangles.

Â 5:11

And one very powerful theorem by Hammersly and Clifford show that

Â pretty much any network model could be expressed in the exponential family,

Â with some counts of graph statistics.

Â So it might not be links and triangles,

Â it might be links, triangles and numbers of three

Â prong stars you have, it could be that it

Â depends on the number of cliques of size six.

Â It could be of quite complicated list of statistics.

Â But the theorem that they had said that pretty much any network

Â model that you can think of could be expressed in this form.

Â And just as a

Â check on this, let's go back to our

Â Erdos-Renyi random network model and see how this works.

Â So, let's let p be the probability of a link.

Â L of g is the number of links in a given graph.

Â What's the probability of g in that world? Well, under the Erdos-Renyi the chance

Â that you had exactly this network full of links, is the probability p that all of

Â the links that are present formed and 1 minus p

Â that all of the links that did not form didn't form.

Â Okay.

Â So, that would be the probability of

Â getting exactly a particular network with L links.

Â 6:28

Then let's rewrite this, so we'll pull out the L.

Â So, I'm just repackaging this.

Â So that we can write this as p over 1 minus p raised to the

Â L, of g times 1 minus p to the n times n minus 1 over 2.

Â And, let's write this now in an exponential form.

Â So another way to write this is, this is the exponent of the log of p over

Â 1 minus p times L of g minus some extra

Â term which doesn't involve the number of links at all.

Â So, what does this look like?

Â This looks like an exponential function of a statistic of the graph,

Â where in particular, this statistic is the number of links in the graph.

Â So you can write the probability of an Erdos-Renyi graph

Â as in this form, where now this looks like log

Â of p over 1 minus p. Okay.

Â So we can see that, I mean, this is

Â not a proof of the Hammersly-Clifford theorem by any means.

Â But what it does give us an idea that you can convert

Â a lot of other kinds of models into the exponential random graph family.

Â And that's going to be quite useful.

Â 7:37

Now one other thing to, to make this a

Â probability of course the, these have to sum to one.

Â So, in particular that means that we have to normalize

Â by what the probability of all the graphs are, to make

Â sure that the probability of one particular graph when we sum

Â across all the graphs, this is going to sum up to one.

Â So the denominator of this probability is going to have to be the

Â sum across all possible graphs of what the probability of those, relative probability

Â of those graphs would be.

Â So now, if we sum this thing across all graphs, the

Â sum of now the P of g's across all g primes,

Â this thing will now equal to 1 because we get the

Â sum to be the same on the top and the bottom.

Â Okay.

Â And also you can write this we know we

Â can bring this denominator which doesn't depend on g anymore.

Â Into the numerator as minus

Â some constant it's just going to be

Â the corresponding constant which captures its denominator.

Â So, what we've got is a, an exponential random graph

Â family, which allows us to capture different types of statistics.

Â And, that's going to be quite powerful in allowing us to

Â fit a lot of things and incorporate a lot of things.

Â The main challenge is going to be in estimating these

Â kinds of models.

Â So now next, in the next video, I'll talk a little bit about how

Â to work with these models to estimate things and to begin to fit these.

Â