So in the last video I left you with a cliffhanger. I introduced you to the

minimum cut problem. I introduced you to a very simple randomized

algorithm, the contraction algorithm. We observed that sometimes it

finds the minimum cut and sometimes it doesn't. And so the $64,000 question is

just how frequently does it succeed and just how frequently does it fail. So now

that I hope you've brushed up on conditional probability and independence,

we are gonna give a very precise answer to that question in this lecture. >>So

recall that in this problem we are given as input an undirected graph, possibly with

parallel edges, and that the goal is to compute, among the exponentially many

possible different cuts, the one with the fewest number of crossing edges. So, for

example in this graph here, which you've seen in a previous video, the goal is to

compute the cut (A, B) here, cuz there are only two crossing edges, and that's as

small as it gets. That's the minimum cut problem and Karger proposed the following

random contraction algorithm based on random sampling, so we have N-2

iterations, and the number of vertices gets decremented by 1 in each iteration.

So we start with N vertices, we get down to 2. And how do we decrease the number

of vertices? We do it by contracting or fusing two vertices together. How do we

pick which pair of vertices to fuse? Well, we pick one of the

remaining edges uniformly at random. So, however many edges there are

remaining, we pick each one equally likely. Then, if the endpoints of that

edge are (u, v), we collapse u and v together into a single super node. So

that's what we mean by contracting two nodes into a single vertex and then if

that causes any self-loops, and as we saw in the examples, we will in general have

self-loops, then we delete them before proceeding. After the N-2

iterations, only two vertices remain. You'll recall that two vertices

naturally correspond to a cut. The first group of the cut A corresponds to the

vertices that were fused into one of the super vertices remaining at the end. The

other super vertex corresponds to the set B, the other original vertices of the

graph. >>So the goal of this video is to give an answer to the

following question: What is the probability of success? Where by success,

we mean that it outputs a particular min cut (A, B). So let's set up the basic notation. We're

gonna fix an input graph, an undirected graph. As usual we use N to denote the

number of vertices and M to denote the number of edges. We're also going to fix a

minimum cut (A, B). If a graph has only one minimum cut, then it's clear what

I'm talking about here. If a graph has multiple minimum cuts, I'm actually

selecting just one of them. Because I'm gonna focus on a distinguished minimum cut (A, B),

and we're only gonna define the algorithm as successful if it outputs this

particular minimum cut (A, B). If it outputs some other minimum cut, we don't count it.

We don't count it as successful. Okay. So, we really want this distinguished minimum

cut (A, B). In addition to N and M, we're gonna have a parameter K, which is

the size of the minimum cut. That is, it's the number of crossing edges of a minimum

cut, for example of (A, B). The K edges that cross the minimum cut (A, B) we're

going to call capital F. So the picture you wanna have in mind is, there is, out

there in the world, this minimum cut (A, B). There's lots of edges with both end points

in A, lots of edges possibly with both endpoints in B. But, there's not a whole

lot of edges with one endpoint in A and one endpoint in B. So the edges F

would be precisely, these three crossing edges here. >>So our next step is to get a

very clear understanding of exactly when the execution of the contraction algorithm

can go wrong, and exactly when it's gonna work, exactly when we're going to succeed.

So let me redraw the same picture from the previous slide. So given that we're hoping

that the contraction algorithm outputs this cut (A, B) at the end of the day, what

could possibly go wrong? Well, to see what could go wrong, suppose, at some iteration,

one of the edges in capital F, remember F are the edges crossing the min cut (A, B), so

it's these three magenta edges in the picture. Suppose at some iteration one of

the edges of F gets chosen for contraction. Well because this edge of F

has one endpoint in A and one endpoint in B, when it gets chosen for contraction, it

causes this node from A and this node from B to be fused together. What does that

mean? That means, in the cut that the contraction algorithm finally outputs, this

node from A and this node from B will be on the same side of the output cut. Okay,

so the cut output by the contraction algorithm will have on one side both the

node from A and the node from B. Therefore, the output of the contraction

algorithm if this happens will be a different cut than (A, B), okay? It will not

output (A, B) if some edge of F is contracted. >>And if you think about it, the converse is

also true. So let's assume now, that in each of the N-2 iterations, the contraction

algorithm never contracts an edge from capital F. Remember capital F are exactly

the edges with one endpoint in A and one endpoint in B. So if it never contracts

any edge of F, then it only contracts edges where both endpoints lie in capital

A or both endpoints lie in capital B. Well, if this is the case then, vertices

from A always stick together in the fused nodes, and vertices from B always stick

together in the fused nodes. There is never an iteration where a node from A and

a node from B are fused together. What does that mean? That means that when the

algorithm outputs its cut, all of the nodes in A have been grouped together, all

of the nodes in B have been grouped together, in each of the two super nodes,

which means that the output of the algorithm is indeed the desired

cut (A, B). Summarizing, the contraction algorithm will do what we

want. It will succeed and output the cut (A, B), if and only if it never chooses an

edge from capital F for contraction. Therefore, the probability of success,

that is, the probability that the output is the distinguished min cut (A, B), is

exactly the probability that it never contracts an edge of capital F. >>So, this

is what we're gonna be interested in here. This really is the object of our mathematical

analysis, the probability that in all of the N-2 iterations we never

contract an edge of capital F. So, to think about that, let's think about each

iteration in isolation, and actually define some events describing that. So for an

iteration I, let Si denote the event that we screw up in iteration I. With this

notation, we can succinctly say what our goal is, so, to compute the probability of

success. What we wanna do is we wanna compute the probability that <i>none</i> of the

events S1, S2, up to S(N-2) occurs. So, I'm gonna use this NOT(¬)

symbol to say that S1 does not happen. So we don't screw up in iteration 1, we

don't screw up in iteration 2, we don't screw up in iteration 3, and so on. All

the way up to, we don't screw up, we don't contract anything from capital F, in the

final iteration, either. So summarizing, analyzing the success probability of the

contraction algorithm boils down to analyzing the probability of this event,

the intersection of the NOT Sis with I ranging from iteration 1 to

iteration N-2. >>So we're gonna take this in baby steps, and the next quiz will

lead you through the first one, which is, let's have a more modest goal. Let's just

think about iteration 1. Let's try and understand, what's the chance we

screw up, what's the chance we don't screw up, just in the first iteration? So the

answer to this quiz is the second option. The probability is K/M, where K is the

number of edges crossing the cut (A, B), and M is the total number of edges. And

that's just because the probability of S1, the probability we screw up, is just the

number of crossing edges. That's the number of outcomes which are bad, which

trigger S1, divided by the number of edges. That's the total number

of things that could happen. And since all edges are equally likely, it just boils

down to this. And by the definition of our notation, this is exactly K/M. So this

gives us an exact calculation of the failure probability in the first

iteration, as a function of the number of crossing edges, and the number of overall

edges. Now, it turns out it's gonna be more useful for us to have a bound not

quite as exact, an inequality. That's in terms of the number of vertices N, rather

than the number of edges, M. The reason for that is, it's a little hard to

understand how the number of edges is changing in each iteration. It's certainly

going down by at least 1 in each iteration, because we contract an edge in each iteration, but it

might go down by more than 1 when we delete self-loops. By contrast the number

of vertices is this very steady obvious process. One less vertex with each

successive iteration. >>So, let's rewrite this bound in terms of the number of

vertices N. To do that in a useful way, we make the following key observation. I

claim that, in the original graph G we are given as input, every vertex has at

least K edges incident on it; that is, in graph theoretic terminology, every vertex

has degree at least K. Where, recall, K is the number of edges crossing our favorite

min cut (A, B). So why is that true? Why must every vertex have a decent number of

neighbors, a decent number of edges incident to it? Well, it's because, if you

think about it, each vertex defines a cut by itself. Remember, a cut is just any

grouping of the vertices into two groups, that are not empty, that don't

overlap. So one cut is to take a single vertex, and make that the first group, A,

and take the other N-1 vertices, and make that the second group, capital B.

So how many edges cross this cut? Well, it's exactly the edges that are incident

on the first node, the node on the left side. So if every single cut, over all

exponentially many cuts, has at least K crossing edges, then certainly the

N cuts defined by single vertices have at least K crossing edges, so therefore, the

degree of each vertex is at least K. So the fact that every single cut in

the graph has at least K crossing edges gives us a lower bound on the number of

edges incident on each possible vertex. >>So, why is that useful? Well let's recall

the following general fact about any graph, which is that if you sum up the degrees of

the nodes, so if you go node by node, look at how many edges are incident on that

node, that's the degree of V, and then sum them up over all vertices. What will you

get? You'll get exactly twice the number of edges, okay? So this is true for any

undirected graph, that the sum of the degrees of the vertices is exactly double

the number of edges. To see this, you might think about taking a graph, starting

with the empty set of edges, and then adding the edges of the graph one at a

time. Each time you add a new edge to a graph, obviously the number of edges goes

up by 1, and the degree of each of the endpoints of that edge also goes up by 1,

and there are, of course, two endpoints. So every time you add an edge, the number

of edges goes up by 1, the sum of those degrees goes up by 2. Therefore, when

you've added all the edges, the sum of the degrees is double the number of edges that

you've added. That's why this is true. Now, in this graph that we have at hand here,

every degree is at least K, and there's N nodes. So this left hand side, of course,

is at least KN for us. So therefore if we just divide through by 2, and flip the

inequality around, we have that the number of edges has to be at least KN/2, the

lower bound K on the degree of every vertex, times the number of vertices,

divided by 2. So this is just primitive rearranging of the inequality. Putting

this together with your answer on the quiz, since the probability of S1 is exactly K/M,

and M is at least KN/2, if we substitute, we get that the probability

of S1 is at worst 2/N, 2 over the number of vertices, and the K cancels out.
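As a quick sanity check, this whole chain of reasoning (the handshake lemma, the degree lower bound, and the resulting 2/N bound) can be verified in a few lines of Python; the small edge list below is a made-up example graph, not one from the lecture:

```python
# Sanity check of: sum of degrees = 2M, hence M >= K*N/2, hence K/M <= 2/N.
from collections import Counter

edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]  # a made-up example graph
n, m = 4, len(edges)

deg = Counter()                     # degree of each vertex
for u, v in edges:
    deg[u] += 1
    deg[v] += 1

assert sum(deg.values()) == 2 * m   # handshake lemma: each edge counted twice

# A single vertex by itself defines a cut, so the min cut size K is at most
# the minimum degree; and every degree is at least K, so 2M >= K*N.
k = min(deg.values())               # an upper bound on K; the bounds below hold
assert m >= k * n / 2
assert k / m <= 2 / n               # Pr[S1] = K/M <= 2/N
```

Since K is at most the minimum degree, the last assertion is in fact slightly stronger than the bound in the lecture: even (min degree)/M is at most 2/N.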

So that's, sort of, our first milestone. We've figured out the chance

that we screw up in the first iteration, that we pick some edge that

crosses the cut (A, B). And things look good. This is a small number, right?

So, in general, the number of vertices might be quite big. And this says that

the probability we screw up is only 2 over the number of vertices. So, so far,

so good. Of course, this was only the first iteration. Who knows what happens

later? >>So now that we understand the chances of screwing up in the first

iteration, let's take our next baby step, and understand the probability that we

don't screw up in either of the first two iterations. That is, we're gonna be

interested in the following probability: the probability that we don't screw up in

the first iteration nor in the second iteration. Now, if you go back to the

definition of conditional probability, you'll realize that we can rewrite an

intersection like this in terms of conditional probabilities. Namely, as the

probability that we don't screw up in the second iteration, given that we didn't do

it already, times the probability that we didn't screw up in the first iteration.
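To make this decomposition concrete, here is a small exact check of the identity Pr[¬S1 ∩ ¬S2] = Pr[¬S2 | ¬S1] · Pr[¬S1], using exact fractions on a made-up 4-cycle as the input graph (so the particular numbers are illustrative, not from the lecture):

```python
# Exact check of Pr[not S1 and not S2] = Pr[not S2 | not S1] * Pr[not S1],
# on a made-up 4-cycle whose min cut is ({0,1}, {2,3}) with K = 2.
from fractions import Fraction

edges = list(enumerate([(0, 1), (1, 2), (2, 3), (3, 0)]))
F = {1, 3}                                # indices of the two crossing edges

def contract(edge_list, pick):
    """Fuse the endpoints of the picked edge, deleting any self-loops."""
    u, v = dict(edge_list)[pick]
    out = []
    for i, (a, b) in edge_list:
        if i == pick:
            continue
        a, b = (u if a == v else a), (u if b == v else b)
        if a != b:                        # self-loop: delete it
            out.append((i, (a, b)))
    return out

m = len(edges)
pr_not_s1 = Fraction(m - len(F), m)       # first uniform pick misses F: 2/4

# Average, over the equally likely safe first picks, the chance that the
# second uniform pick also misses the surviving edges of F.
safe = [i for i, _ in edges if i not in F]
pr_not_s2_given_not_s1 = sum(
    Fraction(sum(1 for j, _ in contract(edges, i) if j not in F),
             len(contract(edges, i)))
    for i in safe) / len(safe)

assert pr_not_s1 == Fraction(1, 2)
assert pr_not_s2_given_not_s1 == Fraction(1, 3)
assert pr_not_s1 * pr_not_s2_given_not_s1 == Fraction(1, 6)
```

On this graph, contracting a safe edge leaves 3 edges of which the 2 crossing edges survive, which is exactly the "K edges are still hanging out in the graph" point made above.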

Okay? So this is the probability that we miss all of these K vulnerable edges in the second

iteration given that we didn't contract any of them in the first iteration. Now

notice, this we already have a good understanding of, from the previous slide. We

derived a nice lower bound on this: there's a good chance that we don't screw

up, probability at least 1-2/N. And in some sense we also have a

very good understanding of this probability. We know this is 1 minus the

chance that we do screw up. And what's the chance that we do screw up? Well, these K

edges are still hanging out in the graph. Remember, we didn't contract any in the

first iteration; that's what's given. So there are K ways to screw up, and we choose

an edge to contract uniformly at random, so the total number of choices is the number

of remaining edges. >>Now, what's nice is we have an exact

understanding of this probability. This is an equality. The problem is we don't have

a good understanding of this denominator. How many remaining edges are there? We

have an upper bound on this. We know this is at most M-1, since we got

rid of at least one edge in the previous iteration. But actually, if you think about it,

what we need of this quantity is a lower bound and that's a little unclear because

in addition to contracting the one edge and getting that out of the graph, we

might have created a bunch of self-loops and deleted those as well. So it's hard to

understand exactly what this quantity is. So instead we're gonna rewrite this bound

in terms of the number of remaining vertices, and of course we know there are

exactly N-1 vertices remaining: we took two of the vertices in the first iteration and

contracted them down to one. So how do we relate the number of edges to the number of

vertices? Well we do it just in exactly the same way as in the first iteration. We'll

make a more general observation. In the first iteration, we observed that every

node in the original graph induces a cut. Okay, with that node on one side, the

other N-1 vertices on the other side. But in fact there's a more general

statement, even after we've done a bunch of contractions, any single node in the

contracted graph, even if it represents a union of a bunch of nodes in the original

graph, we can still think of that as a cut in the original graph. Right? So if

there's some super node in the contracted graph, which is the result of fusing

twelve different things together, that corresponds to a cut where those twelve

nodes in the original graph are on the one side A, and the other N-12

vertices are on the other side of the cut, B. So, even after contractions, as long as

we have at least two nodes in our contracted graph, you can take any node

and think of it as half of a cut, one side of a cut in the original graph.
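Putting the whole video together, here is a minimal Python sketch of the contraction algorithm itself; the 4-vertex multigraph and the repeated-trial estimate of the success probability are made-up illustrations, not part of the lecture:

```python
import random

def karger_cut(n, edges, rng):
    """One run of the contraction algorithm; returns the two sides of the cut."""
    label = {v: v for v in range(n)}   # super-node containing each vertex
    live = list(edges)                 # remaining (non-self-loop) edges
    remaining = n
    while remaining > 2:               # N-2 contractions in total
        u, v = rng.choice(live)        # uniform over the remaining edges
        lu, lv = label[u], label[v]
        for w in label:                # fuse super-node lv into lu
            if label[w] == lv:
                label[w] = lu
        # delete the self-loops created by the contraction
        live = [(a, b) for a, b in live if label[a] != label[b]]
        remaining -= 1
    some_side = next(iter(label.values()))
    A = {v for v in label if label[v] == some_side}
    return A, set(label) - A

# A made-up 4-vertex multigraph whose unique min cut is ({0,1}, {2,3}), K = 2.
edges = [(0, 1), (0, 1), (1, 2), (2, 3), (2, 3), (3, 0)]
rng = random.Random(0)
hits = sum(karger_cut(4, edges, rng)[0] in ({0, 1}, {2, 3})
           for _ in range(2000))
print(hits / 2000)   # should land near 1/3, the exact success probability here
```

Each trial succeeds exactly when no edge of F is ever contracted, which is precisely the event whose probability this lecture is analyzing.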