And if the GPS distance between the homes

is less than the median distance between homes, okay?

So going to a village, we look at two

households, and we say, okay, are they linked or not?

And are they of the same caste?

Is the GPS distance greater or less than the median distance?

So when

we're looking at two households, we'll say they're similar if they have the same

caste and are less than the median distance

between homes apart, and otherwise we'll say they're different.

Okay, so if they're either of different castes or

greater distance than the median, then we'll put them as different.
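
The similar/different rule just described can be sketched in a few lines. This is only an illustration: the function name `similar_pairs`, the input dictionaries, and the use of flat (x, y) coordinates in place of real GPS distances are all assumptions, not something from the study itself.

```python
from itertools import combinations
from math import hypot
from statistics import median

def similar_pairs(caste, location):
    """Label every pair of households as similar (True) or different (False).

    caste:    dict household -> caste label (hypothetical input format)
    location: dict household -> (x, y), treated as planar coordinates
    """
    pairs = list(combinations(sorted(caste), 2))
    # Straight-line distance between the two homes in each pair.
    dist = {
        (i, j): hypot(location[i][0] - location[j][0],
                      location[i][1] - location[j][1])
        for i, j in pairs
    }
    # Median distance over all pairs of homes in the village.
    cutoff = median(dist.values())
    # Similar = same caste AND closer than the median; otherwise different.
    return {
        (i, j): caste[i] == caste[j] and dist[(i, j)] < cutoff
        for i, j in pairs
    }

# Toy village with four households (made-up data).
caste = {"a": 1, "b": 1, "c": 2, "d": 1}
location = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (10, 10)}
similar = similar_pairs(caste, location)
```

In this toy village only households "a" and "b" come out as similar: "c" is of a different caste, and "d" is of the same caste but farther away than the median distance.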

So we're just going to make it a really simple model where we keep

track of whether nodes are similar or different, and we'll allow for

two probabilities: a probability for nodes being similar and

one for nodes being different.

And similar here means they're very similar on

both the dimensions of caste and GPS location.

Okay?

So now what we can do is, we can fit a block model.

So we can say, we'll allow a

block model where we have two different probabilities:

the probability of a link if both are of the same category,

or similar to each other, and the probability if they are different.

And then we also fit a subgraph generation model,

where now what we're going to add in is also

triangles, and we'll allow triangles to have two different probabilities:

a probability of triangles for people that are all similar, and a probability of

triangles if some of the people involved are different from each other.

Okay?

So we'll fit the block model, and fit this subgraph generation model.

Both of these are very easy to fit here, right?

So the block model's a special case of a SUGM where we just look at links.

So we can just count up links, count

up triangles, count up whether they're same or different.

So we're going to have four different counts and that

will give us estimates on all these things, okay?

And the block model just looks at links,

ignoring whether they're in triangles or not.

This subgraph generation model keeps track of

triangles separately from links and is estimated that way.
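
The counting described here can be sketched as simple frequency estimates. This is a minimal version under stated assumptions: the function names, the "same"/"diff" labels, and the data layout are hypothetical, a triple counts as "same" only if all three pairs in it are similar, and the SUGM link probability is taken from links that sit in no triangle, which ignores triangles that form incidentally out of separate links.

```python
from itertools import combinations

def ratio(count, total):
    """Observed frequency, with 0.0 when there is nothing to count."""
    return count / total if total else 0.0

def estimate(nodes, links, similar):
    """nodes: iterable of ids; links: set of frozenset({i, j});
    similar: dict (i, j) -> bool for every pair with i < j."""
    nodes = sorted(nodes)

    def has_link(i, j):
        return frozenset((i, j)) in links

    def kind(*pairs):
        # "same" only if every pair involved is similar.
        return "same" if all(similar[p] for p in pairs) else "diff"

    triples = {"same": 0, "diff": 0}
    triangles = {"same": 0, "diff": 0}
    in_triangle = set()
    for i, j, k in combinations(nodes, 3):
        t = kind((i, j), (i, k), (j, k))
        triples[t] += 1
        if has_link(i, j) and has_link(i, k) and has_link(j, k):
            triangles[t] += 1
            in_triangle.update(frozenset(p) for p in ((i, j), (i, k), (j, k)))

    pairs = {"same": 0, "diff": 0}
    all_links = {"same": 0, "diff": 0}
    lone_links = {"same": 0, "diff": 0}
    for i, j in combinations(nodes, 2):
        t = kind((i, j))
        pairs[t] += 1
        if has_link(i, j):
            all_links[t] += 1
            if frozenset((i, j)) not in in_triangle:
                lone_links[t] += 1

    # Block model: links only, ignoring whether they are in triangles.
    block = {t: ratio(all_links[t], pairs[t]) for t in pairs}
    # SUGM: triangles tracked separately from links that are in no triangle.
    sugm = {
        "triangle": {t: ratio(triangles[t], triples[t]) for t in triples},
        "link": {t: ratio(lone_links[t], pairs[t]) for t in pairs},
    }
    return block, sugm

# Toy data: households 0, 1, 2 are mutually similar; 3 differs from all.
nodes = [0, 1, 2, 3]
similar = {(0, 1): True, (0, 2): True, (1, 2): True,
           (0, 3): False, (1, 3): False, (2, 3): False}
links = {frozenset(p) for p in [(0, 1), (1, 2), (0, 2), (2, 3)]}
block, sugm = estimate(nodes, links, similar)
```

On the toy data the block model sees three "same" links and one "diff" link, while the SUGM attributes the three "same" links to a single triangle and leaves only the 2-3 link as a link on its own.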

Okay, so that's the basic estimation technique.

So we estimate this block model. Step one.

We're going to estimate this probability of link,

probability of link if you're same or different.

For the subgraph generation model, we'll do the same

thing but we're going to add in triangle counts.

And then once we have these, the nice thing about these

kinds of models is then we can generate back networks very easily.

So how do we generate a network?

Well, once we have these probabilities, we can just take

this set of nodes, pick pairs, flip coins, put in links with probability same or

different depending on whether they're the same

or different, and then generate a network.
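
The coin-flipping step for the block model might look like the following sketch. The function name, argument names, and data layout are hypothetical; `p_same` and `p_diff` stand for the two estimated link probabilities.

```python
import random
from itertools import combinations

def generate_block_network(nodes, similar, p_same, p_diff, rng=random):
    """Draw one network: each pair gets a link independently, with
    probability p_same if the pair is similar and p_diff otherwise.

    similar: dict (i, j) -> bool for every pair with i < j.
    """
    links = set()
    for i, j in combinations(sorted(nodes), 2):
        p = p_same if similar[(i, j)] else p_diff
        if rng.random() < p:  # the coin flip for this pair
            links.add(frozenset((i, j)))
    return links

# Degenerate probabilities make the draw deterministic for checking.
nodes = [0, 1, 2]
similar = {(0, 1): True, (0, 2): False, (1, 2): False}
sure_links = generate_block_network(nodes, similar, p_same=1.0, p_diff=0.0)

# A seeded generator gives a reproducible random draw.
draw = generate_block_network(nodes, similar, 0.6, 0.1, rng=random.Random(0))
```

With `p_same=1.0` and `p_diff=0.0` the generated network contains exactly the similar pairs, which is a quick sanity check before using the estimated probabilities.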

For the SUGM, what we can do is randomly pick triangles, randomly pick

links and put them in with

these probabilities and then we generate networks.

Okay?
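
A sketch of the SUGM draw, under the same assumptions as before (hypothetical names, a triple is "same" only if all three of its pairs are similar): every triple of nodes gets a chance to form a triangle, every pair gets a chance to form a link, and the generated network is the union of what was placed.

```python
import random
from itertools import combinations

def generate_sugm_network(nodes, similar, p_link, p_triangle, rng=random):
    """Draw one network from link and triangle probabilities.

    similar:    dict (i, j) -> bool for every pair with i < j
    p_link:     dict "same"/"diff" -> probability a pair forms a link
    p_triangle: dict "same"/"diff" -> probability a triple forms a triangle
    """
    nodes = sorted(nodes)
    links = set()
    # Triangles first: a chosen triple contributes all three of its links.
    for i, j, k in combinations(nodes, 3):
        sides = ((i, j), (i, k), (j, k))
        t = "same" if all(similar[p] for p in sides) else "diff"
        if rng.random() < p_triangle[t]:
            links.update(frozenset(p) for p in sides)
    # Then individual links, layered on top of the triangles.
    for i, j in combinations(nodes, 2):
        t = "same" if similar[(i, j)] else "diff"
        if rng.random() < p_link[t]:
            links.add(frozenset((i, j)))
    return links

# Households 0, 1, 2 are mutually similar; 3 differs from all of them.
nodes = [0, 1, 2, 3]
similar = {(0, 1): True, (0, 2): True, (1, 2): True,
           (0, 3): False, (1, 3): False, (2, 3): False}
only_triangle = generate_sugm_network(
    nodes, similar,
    p_link={"same": 0.0, "diff": 0.0},
    p_triangle={"same": 1.0, "diff": 0.0},
)
```

Because links and triangles are placed separately, a generated network can show much more clustering than a block model with the same overall link density.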

So we randomly generate these networks and then we try

to see whether or not these networks recreate the actual, original

observations. Okay, so here is what we get.

So here's the data.