0:00

Welcome to Lecture eight of Networks, Friends, Money, and Bytes.

Â Today's lecture we'll try to formulate and answer the question of, how do I influence

Â people on Facebook and Twitter? Now, network consists of both the topology

Â represented by the graph and functionalities which are the tasks

Â carried out on top of the graph. In this chapter, we talk about topology

Â dependent influence model in contrast to the population dependent and topology

Â independent influence models in the last lecture.

Â As most of us know very well that Facebook and Twitter have already become among the

Â dominate communication modes, especially among younger people.

Â Facebook started in late 2003, and by the time of spring 2012 of its IPO, it's got

Â over 900 million users worldwide. And Twitter started much later and by

Â 2012, got over 500 million users worldwide.

Â And one estimate says that within a day, it processes over 250 million tweets.

Â 1:22

Now, Twitter, in particular, provides what we call a direction node link. Because you

Â may subscribe to my tweets but I don't subscribe to yours.

Â And indeed, Twitter's success builds on top of micro-blogging and group texting

Â and one-sided or one direction social networking.

Â 1:52

Now, given the popularity of these services, there is no shortage of

Â interesting questions people have raised. One question is, how do I quantify the

Â statistical properties of opinions on these platforms?

Â For example, a recent study were carried about Tweets of Oscar nominated movies

Â 2:50

And, statistical properties of the tweets alone did not provide insightful

Â prediction of either the box office success, or the critical claim However, if

Â it is combined with some of these other online opinion venues,

Â Then it does enhance the predicted the prediction accuracy.

Â 3:49

And other kind of blocks into a numerical score.

Â Now, how do they measure that? There are variety of possibilities. For example, you

Â can measure on Twitter the number of re-tweets of your tweets, okay? As we

Â identify through a party of via. We can also look at the number of

Â re-posting of URL's that you posted. If you have access to the topology of who

Â follows whom, For example, I follow your tweets, and you

Â follow Anna's tweets, and so on. Then, this typology can also give us some

Â information about influential power of different nodes which are different

Â Twitter users. Now, what kind of numerical scores is

Â meaningful enough to be used? And that author remains a question mark.

Â And that relates to the third question, How do I leverage the above knowledge of

Â the either aggregate statistical properties of Facebook and Twitter users

Â properties, or individual influential power source and turn these into a viral

Â marketing campaigns. For example, maybe I'll be able to seed

Â some users and ask them to post or tweet certain information about my product or

Â service. And the question is, which nodes to seed?

Â 5:38

If I know the topology, would it help me to answer this question more precisely?

Â So, these three questions are indeed important.

Â And, as we will see, there is a significant gap between the theory and

Â practice. Just like last lecture, this lecture

Â offers the biggest gap between theory and practice in networking.

Â 6:01

And these questions are not entirely new either, the question of influential power

Â of individual users that many interesting historical anecdotes.

Â For example, the famous Revere-Dawes Ride that happened over the night of April

Â eighteenth to nineteenth in 1775. This was the ride that helped alert the

Â American forces to fight against the British military forces.

Â Two individuals, Paul Revere and William Dawes, they both liked to alert the local

Â American forces about the imminent British action.

Â 7:13

And it turns out that the path took by Revere went through a lot more influential

Â people in the neighborhood and was much more effective in alerting the local

Â militia leaders to prepare for the British military action which led to the success

Â of the first battle of the American Revolutionary War.

Â So, how can we quantify that a certain path pass through more influential nodes

Â than other paths? Another very famous anecdote is this

Â graph. This graph is so called the Padgett's

Â Florentine family graph which shows fifteen prominent families in Renaissance

Â period of Florence, Italy. A few of these fifteen families aren't

Â shown here with their names. The Medici family was widely viewed as the

Â most influential among the fifteen families.

Â If you look at this graph where each node is a family, and a link is a bidirectional

Â link, representing some kind of kinship or marriage relationship.

Â Then, how can we make intuitive understanding of this belief that the

Â Medici family was by far the most influential?

Â Well, if you look at a naive view, for example, how many links does each family

Â have? Then, the Medici family had six links.

Â 8:56

That's indeed the largest number. It's got the largest degree in this graph

Â of family relations. But there are also other families such as

Â the Strozzi family and the Guadagni families.

Â They also got each four neighbors, Okay? So, the degree count is not that

Â much of a difference. And this is 50% higher than four, but was

Â widely believed that the Medici family was a much more influential than a mere 50%

Â differential. So, how can we quantify this intuitive

Â understanding of Medici's family, being more important than just 50%?.

Â There's many other stories, For example, Granovetter famous social

Â network theorist and empiricist. He did a 1973 survey in Amherst,

Â Massachusetts. And he discovered what he called the

Â strength of weak ties. Turns out that, if you look at the graph

Â of people where each node is a person and a link is some kind of family or

Â friendship, relationship. Then, he showed that certain links which

Â might be viewed as weak, For example, it goes all the way to

Â another node. But there are very few of the past to

Â connect this node to that other node as opposed to these nodes which are very

Â closely knit together. Later, we will quantify the notion of a

Â group in the advanced material part of the lecture,

Â How do we quantify closely knit together idea.

Â But, intuitively and visually, you can see that these nodes are very close together,

Â okay? And these links reach out to some far out

Â nodes that are not tightly knit. And these are sometimes called the weak

Â links. And yet, they present a high strength in

Â disseminating information such as, employment opportunities.

Â 11:27

In order to answer all these questions, historical ones, the current ones, we have

Â to, represent our important quantities with some symbols.

Â So, we start with, of course, graphs. This is already Lecture eight.

Â So, in the past seven lectures, we have seen quite a bit of graphs.

Â You may still remember the graph of interference channel.

Â You may remember the graph that represents the auction, the bipartite graph of

Â matching. And now, we're going to talk about graphs

Â representing the topology of a network in general terms.

Â And then, we will use matrices to algebraically represent these graphs, so

Â that we can start running some mathematical operations.

Â Well, in general, we say a graph, G, is a data structure that consists of two

Â topples. First, is a set of nodes or vertices.

Â It's the set V. And then, it's the set of links or edges,

Â set E. Each element in this set say, indexed by

Â I, is simply one node. And then, each element in this set is

Â basically a two toe, i, j, represents a link from node i to node j.

Â 13:01

If we say that, i, j is not equal to j, I, then we call this graph a directed

Â graph, okay? Because each link has a direction.

Â Node i pointing to j doesn't mean that j points to i, okay?

Â Whereas, if every link is actually both i to j and j to i, then we just skip writing

Â these arrows on the line. And with implicit understanding that this

Â is a by directional link and therefore is a graph without directional links.

Â 13:46

In all the graphs that we'll be looking at, we look at the so called simple and

Â connected graphs. Simple, meaning that each link just

Â connects to two nodes. And connected means that every node is

Â connected to every other node somehow, someway.

Â Maybe through a very long path, But nonetheless, they're all connected.

Â We don't have a disconnected node. Like this.

Â Although, in homework problem, we'll be, be looking at generalization into what's

Â called the hyper graphs. Now, we're going to use some matrices to

Â describe these graph's connectivity patterns.

Â We'll first look at what's called the adjacency matrix represented as a bold

Â face A. This is a matrix of dimension N by N,

Â where N is the number of nodes. And this matrix is very easily defined.

Â We say the ij-th entry of this matrix is one if there is a link from node i to node

Â j, and zero otherwise. For example, if I have the following

Â 15:17

graph, okay? So, very simple graph, Nodes one, two, three, four.

Â Then, the four by four matrix, A, is the following entries,

Â Okay? One is connected to three. So the first row, we'll say zero, cuz

Â there's no self loop. And then one, one, and then zero cuz one

Â is not connected to the four. And then, the second row, we'll say, is

Â connected to node one and three, not four. Three is connect to one, two and four,

Â And four is only connected to three, Okay? So, this is the matrix A for this

Â graph. As you can see, the diagonal is all zero

Â because there's no self loops, A node does not point to itself.

Â And it's a symmetric matrix meaning, Aij = Aji which is really an algebraic way to

Â say that this is a graph with directional, bidirectional links.

Â 16:27

Now, if it is a direct to graph, the links have directions, then how do we define Aij

Â then? It's defined the same way, however, the

Â ij's may be one but ji's might be zero. For example, if we say one goes to two,

Â two goes to three, three goes to one and four goes to three,

Â Okay? Then, the new matrix would be the following,

Â Okay? One goes to two, Okay? And two goes to three,

Â 17:05

Three goes to one, And four goes to three.

Â Now, as you can see that Aij is not equal to Aji, for example,

Â This is one but this entry is zero. A21 is not equal to A12.

Â 17:28

The second matrix that we actually use much less in this course is called

Â incidence matrix. This is a matrix we represent by A hat of

Â dimension N by L. N, as before, is the number of nodes

Â whereas L is now the number of links here. So, how do we define the entries of this

Â matrix We say A hat ij-th entry equals one if node i is on link j, and zero

Â otherwise. So, it's a binary matrix where it's one if

Â there's an incidence of node i and if there's incidence of link j on node i,

Â Okay? For example, in the last slide we look at this example,

Â One, two, three four. Now the matrix A hat will look like the

Â following. We first have to write down also the

Â labels for these links, okay? Let's write down these links as A, B, C

Â and D, Okay? So, these are the one, two, three,

Â four, these are A, B, C, D. So happens that still four by four biggest

Â number of links equals number of nodes in this particular example.

Â So, link A is incident on both nodes one and two.

Â Write down one, one here and zero, zero. And B's incident on both two and three.

Â C is incident on one and three, And D is incident on three and four.

Â 19:24

Now, what if it is a directed graph, okay? Saying the links are like this. Then, we

Â say, the entry is minus one, if nodes i, node i stops, terminates j link.

Â And it's one if node i starts link j, and zero if node i is not related, is not on

Â link j, Okay? In this particular example, we see

Â that node on link A goes from one to two. So, start at one ends at node two.

Â And link B starts from two ends at three. C from three to one it's actually from

Â 20:15

here, from three to one should be positive.

Â And, D is from four to three, okay? The starting point is positive one, the ending

Â point is negative one, and zero everywhere else.

Â As you can see, every column of incidence matrix of the rectagraph must have exactly

Â one, one, one minus one, and the rest are zero,

Â Precisely because all the graphs we're looking at are simple graphs.

Â But, for each row, we could have multiple plus ones and multiple minus ones because

Â each node might start or terminate multiple links.

Â