0:00

But we're not done yet. There's still another problem.

For example, in this simple graph

with four web pages represented as four nodes, and some links,

we can write down the H matrix.

It is, again, four by four,

and you should be getting better and better at this already.

The H matrix simply looks like this.

And you can easily see that depending on what my initial π vector is, I will end up with different importance score vectors.

In other words, there is more than one consistent importance score vector.

Indeed, if I start with, say, the initialization (1, 0, 0, 0),

then I end up with the importance scores (1/2, 1/2, 0, 0).

However, if I start with the initialization (0, 0.3, 0.7, 0),

then I end up with the importance score vector (0.15, 0.15, 0.35, 0.35) for these four web pages.
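As a quick check of this non-uniqueness (not from the lecture; the four-page graph here is a hypothetical one with two disconnected pairs of pages, since the transcript does not reproduce the actual graph), you can verify in Python with NumPy that two different vectors both satisfy π^T = π^T H:

```python
import numpy as np

# Hypothetical 4-page graph: pages 0<->1 link to each other, and 2<->3 do too.
# Any graph with disconnected pieces behaves this way.
H = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])

# Two different vectors both satisfy pi^T = pi^T H, so the
# importance score vector is not unique for this H.
pi_a = np.array([0.5, 0.5, 0.0, 0.0])
pi_b = np.array([0.0, 0.0, 0.5, 0.5])

print(np.allclose(pi_a @ H, pi_a))  # True
print(np.allclose(pi_b @ H, pi_b))  # True
```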

So what we really want is to say that if you give me a matrix and I keep doing

the iteration π^T[k+1] = π^T[k] H, whatever the matrix might be, H or H hat,

then starting from any initialization vector π[0],

the iteration should converge, meaning

that after a sufficient number of rounds I

can get arbitrarily close to a particular fixed vector π*.

But this example shows that that may not always be the case.

So here is a possible solution, and

this solution actually illustrates a very important theme,

randomization, that we'll be encountering quite a few times in

very different contexts and different networks in this course.

This is our very first encounter with that randomization idea.

It says that if you're on a particular web page,

a node in the graph here,

you may hop to

the other nodes hyperlinked from this node, each with equal chance.

For example, if there are two of them, one half and one half.

But you follow this behavior only θ% of the time,

where θ is just some number between 0 and 100 in percentage points.

The other (1-θ)% of the time,

you actually jump to a randomly picked web page.

So following the embedded outgoing links

is only part of the model of users' navigation behavior on the web.

It represents, say, θ% of the time.

And the other (1-θ)% of the time,

you will be doing a random pick.

It turns out that which θ you

pick determines quite a few things, including the rate of convergence.

A smaller θ gives you faster convergence, in general,

but a smaller θ also means that you are pretty much

ignoring the connectivity pattern of the hyperlinked graph.

So that's not too surprising.

It turns out that we tend to pick 85%, or 0.85, as

the right trade-off between relevance of

the connectivity pattern and the speed of convergence.
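To make this trade-off concrete, here is a small sketch (my own, not from the lecture, using a made-up four-page chain graph with no dangling nodes) that counts power-iteration steps until convergence for two values of θ; the smaller θ converges in noticeably fewer steps:

```python
import numpy as np

def iterations_to_converge(theta, H, tol=1e-8, max_iter=10_000):
    """Power-iterate pi^T <- pi^T G and count steps until pi stops changing."""
    n = H.shape[0]
    # H here has no dangling nodes, so G is just theta*H plus the uniform jump.
    G = theta * H + (1 - theta) * np.ones((n, n)) / n
    pi = np.ones(n) / n
    for k in range(1, max_iter + 1):
        new_pi = pi @ G
        if np.abs(new_pi - pi).sum() < tol:
            return k
        pi = new_pi
    return max_iter

# Hypothetical 4-page chain graph (the lecture's graph isn't shown here).
H = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])

fast = iterations_to_converge(0.3, H)   # small theta: fast convergence
slow = iterations_to_converge(0.85, H)  # theta = 0.85: slower, respects links more
print(fast, slow)
```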

So how do we model this random picking of web pages?

What we're saying is that here is a big matrix whose every single row is

(1/N, 1/N, ..., 1/N).

Which we can write as basically a matrix of all ones, scaled so that every entry is 1/N.

Which we can write as simply

1/N times the all-ones column vector, outer product with the all-ones row vector.

Again, outer product may sound a little weird to you,

but you can write down a simple example,

say, a two-dimensional example.

Okay? (1, 1) as a column vector, outer product with

(1, 1) as a row vector, so two by one times one by two,

gives you a two by two matrix of all ones.

So the ones column vector, outer product with the ones row

vector, normalized by 1/N, is exactly what we want.

So the third matrix is simply (1-θ) times the normalization 1/N

times this matrix of all ones:

the ones vector outer product with the ones vector.
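The two-dimensional outer-product example just described can be checked in a few lines of NumPy (a quick illustration, not part of the lecture):

```python
import numpy as np

ones = np.ones(2)                # the all-ones vector, here N = 2
outer = np.outer(ones, ones)     # 2x1 times 1x2 -> 2x2 matrix of all ones
uniform = outer / 2              # normalize by 1/N: every entry becomes 1/2

print(outer)    # [[1. 1.]
                #  [1. 1.]]
print(uniform)  # [[0.5 0.5]
                #  [0.5 0.5]]
```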

And then don't forget that the original θ% of the time you will follow H,

or better still, the modified H, H hat.

And this plus this is what we call the Google matrix, denoted G.

Now, don't get confused with the G_ij back in Lecture 1 on cellular networks.

That can also give you a matrix G. There are interesting, striking parallels

between the two, but they do not stem from the coincidence of the symbols used.

So this matrix G,

the Google matrix, is the sum of two matrices.

We know the first matrix is nice because it is the sum of H, which is

sparse, and a rank-one matrix for the dangling nodes.

And the second is obviously a rank-one matrix as well.

It's a very simple matrix.

So G is the sum of

a large, sparse matrix plus two large, rank-one, simply structured

matrices. So now we're ready to describe

the very famous PageRank algorithm behind

Google, and one of the reasons for Google's success on the technical side.

It says that given this matrix G, which is H

plus 1/N times the w vector outer product with the ones vector,

the whole thing weighted by θ% of the time, plus (1-θ),

the other percentage of the time, multiplying

this randomization term: 1/N times the outer

product of the ones vector with itself, the matrix of all ones.
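Putting the pieces together, a minimal sketch of this construction in NumPy (the three-page H below is my own example, not the lecture's; w is the dangling-node indicator described earlier) looks like this:

```python
import numpy as np

def google_matrix(H, theta=0.85):
    """G = theta * (H + (1/N) w 1^T) + (1 - theta) * (1/N) 1 1^T,
    where w[i] = 1 if row i of H is all zeros (a dangling node)."""
    n = H.shape[0]
    ones = np.ones(n)
    w = (H.sum(axis=1) == 0).astype(float)   # dangling-node indicator
    H_hat = H + np.outer(w, ones) / n        # spread dangling mass evenly
    return theta * H_hat + (1 - theta) * np.outer(ones, ones) / n

# Hypothetical 3-page example; page 2 has no outgoing links (dangling).
H = np.array([
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0],
])
G = google_matrix(H)
print(G.sum(axis=1))   # every row sums to 1: G is a valid transition matrix
```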

Okay? You define G as such, and then you're going to iterate.

π transpose at time k+1 is simply the last iteration's π transpose,

the vector at iteration k, multiplied on the right, not by the H matrix anymore,

since we have made two modifications,

but by the matrix G,

and let k keep on going.

And you are now actually guaranteed to converge, for any initialization π vector, to

the equilibrium π* transpose, which satisfies π* transpose equals π* transpose times the matrix G.

In other words, the importance score is actually

the dominant left eigenvector of the matrix G, not just H.

We're skipping the proofs in linear algebra,

but the end result says that with these two modifications we can

guarantee convergence to a uniquely defined left eigenvector, with eigenvalue one,

of this matrix G. And this can be computed either

directly, if G is small enough, or iteratively through this procedure,

iterating over the iterations indexed by k.
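The iterative procedure just described can be sketched as a short power iteration in NumPy (again using a small made-up three-page graph with no dangling nodes, since the lecture's graph isn't reproduced here):

```python
import numpy as np

def pagerank(G, tol=1e-10, max_iter=1000):
    """Iterate pi^T[k+1] = pi^T[k] G until the vector stops changing."""
    n = G.shape[0]
    pi = np.ones(n) / n               # any initialization works now
    for _ in range(max_iter):
        new_pi = pi @ G
        if np.abs(new_pi - pi).sum() < tol:
            break
        pi = new_pi
    return new_pi

# Hypothetical 3-page example; H_hat is already stochastic (no dangling rows).
theta, n = 0.85, 3
H_hat = np.array([
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
])
G = theta * H_hat + (1 - theta) * np.ones((n, n)) / n

pi_star = pagerank(G)
print(np.allclose(pi_star @ G, pi_star))  # True: left eigenvector, eigenvalue 1
```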

Now, if you still want to take the two remaining steps, then you'll be done.

One is that you may want to normalize.

Remember, so far we haven't normalized this π vector yet.

Okay? If you really want to know the relative scale, you want to normalize.

So the new π_i is really the original π_i divided by the sum of all the entries of π.

But the most important thing is actually ranking them.

Remember, we started with ranking, not computing scores.

Computing scores is a means to the end of ranking, and ranking says

that you can look at either the normalized or the unnormalized version of π.

And then whichever node has the highest π is ranked the highest,

then the second, then the third, and keep on going.

And the top 10 will go onto the first page of the search results.
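These two final steps, normalization and ranking, are a one-liner each in NumPy (the score values below are made up for illustration):

```python
import numpy as np

# Hypothetical unnormalized importance scores for four pages.
pi = np.array([0.9, 0.31, 0.30, 0.1])

pi_normalized = pi / pi.sum()   # optional: make the scores sum to 1
ranking = np.argsort(-pi)       # page indices, most important first

print(ranking)  # [0 1 2 3]
```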

And you can see that if ranking is what you care about,

you don't actually show the relative scale.

For example, this node might be way more important than node 2, which is only

slightly more important than node 3 by this computation of the importance scores

π. But you wouldn't know, okay?

Google doesn't visualize that scale, and perhaps

that's a good idea, because that scaling may not be that robust, and,

hopefully, the ranking itself is more robust.

So in summary, we have constructed three matrices, from H to

H hat to the whole Google matrix G. And

this iteration, plus maybe a normalization, and finally

a ranking is, roughly speaking, how Google displays the ranks of the web pages.
