So in the last video I left you with a cliffhanger. I introduced you to the minimum cut problem, and I introduced you to a very simple randomized algorithm, the contraction algorithm. We observed that sometimes it finds the min cut and sometimes it doesn't. And so the $64,000 question is just how frequently it succeeds and just how frequently it fails. So now that I hope you've brushed up on conditional probability and independence, we're gonna give a very precise answer to that question in this lecture.

>>So
recalling this problem: we are given as input an undirected graph, possibly with parallel edges, and the goal is to compute, among the exponential number of possible different cuts, the one with the fewest number of crossing edges. So, for example, in this graph here, which you've seen in a previous video, the goal is to compute the cut (A, B), cuz there are only two crossing edges, and that's as small as it gets. That's the minimum cut problem, and Karger proposed the following
random contraction algorithm, based on random sampling. So we have N-2 iterations, and the number of vertices gets decremented by 1 in each iteration. So we start with N vertices and we get down to 2. And how do we decrease the number of vertices? We do it by contracting, or fusing, two vertices together. How do we pick which pair of vertices to fuse? Well, we pick one of the remaining edges uniformly at random. So however many edges are remaining, we pick each one equally likely. If the endpoints of that edge are (u, v), then we collapse u and v together into a single super node. So that's what we mean by contracting two nodes into a single vertex. And then, if that causes any self-loops, and as we saw in the examples, we will in general have self-loops, then we delete them before proceeding. After the N-2 iterations, only two vertices remain. You'll recall that two vertices naturally correspond to a cut. The first group of the cut, A, corresponds to the original vertices that were fused into one of the two super vertices remaining at the end. The other super vertex corresponds to the set B, the other original vertices of the graph.
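To make this procedure concrete, here's a minimal sketch of one run of the contraction algorithm in Python. The representation and the function name are my own choices for illustration, not something from the lecture: the multigraph is a plain list of (u, v) pairs over vertices 0..n-1, the graph is assumed connected, and fused super nodes are tracked with a union-find structure, so deleting self-loops amounts to skipping any sampled edge whose endpoints already share a leader (which leaves each genuinely remaining edge equally likely).

    import random

    def contract(n, edges):
        # One run of the contraction algorithm on a connected multigraph.
        # Returns the number of edges crossing the cut this run outputs.
        leader = list(range(n))          # union-find leaders; fusing = merging groups

        def find(x):                     # walk up to the super node's root leader
            while leader[x] != x:
                leader[x] = leader[leader[x]]
                x = leader[x]
            return x

        remaining = n                    # how many super nodes are left
        while remaining > 2:
            u, v = random.choice(edges)  # uniform over the original edge list
            ru, rv = find(u), find(v)
            if ru == rv:
                continue                 # endpoints already fused: a self-loop, skip it
            leader[ru] = rv              # contract: fuse the two super nodes
            remaining -= 1

        # The two surviving super nodes define the output cut (A, B).
        return sum(1 for u, v in edges if find(u) != find(v))

Since any single run succeeds only with some probability, which is exactly what this lecture computes, in practice you'd call contract many times and keep the smallest crossing count seen.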
>>So the goal of this video is to give an answer to the following question: what is the probability of success? Where by success, we mean that the algorithm outputs a particular min cut (A, B). So let's set up the basic notation. We're gonna fix an arbitrary input graph, an undirected graph. As usual, we use N to denote the number of vertices and M to denote the number of edges. We're also going to fix a minimum cut (A, B). If a graph has only one minimum cut, then it's clear what
I'm talking about here. If a graph has multiple minimum cuts, I'm actually
selecting just one of them, because I'm gonna focus on this distinguished minimum cut (A, B),
and we're only gonna define the algorithm as successful if it outputs this
particular minimum cut (A, B). If it outputs some other minimum cut, we don't count it.
We don't count it as successful. Okay. So, we really want this distinguished minimum
cut (A, B). In addition to N and M, we're gonna have a parameter K, which is the size of the minimum cut. That is, it's the number of crossing edges of a minimum cut, for example, the number of edges that cross (A, B). The K edges that cross the minimum cut (A, B), we're going to call capital F. So the picture you wanna have in mind is, there is, out there in the world, this minimum cut (A, B). There are lots of edges with both endpoints in A, and lots of edges, possibly, with both endpoints in B. But there's not a whole lot of edges with one endpoint in A and one endpoint in B. So the edges in F would be precisely these three crossing edges here.

>>So our next step is to get a
very clear understanding of exactly when the execution of the contraction algorithm
can go wrong, and exactly when it's gonna work, exactly when we're going to succeed.
So let me redraw the same picture from the previous slide. So, given that we're hoping the contraction algorithm outputs this cut (A, B) at the end of the day, what could possibly go wrong? Well, to see what could go wrong, suppose at some iteration one of the edges in capital F, remember, F are the edges crossing the min cut (A, B), so it's these three magenta edges in the picture, gets chosen for contraction. Well, because this edge of F
has one endpoint in A and one endpoint in B, when it gets chosen for contraction, it
causes this node from A and this node from B to be fused together. What does that
mean? That means, in the cut that the contraction algorithm finally outputs, this
node from A and this node from B will be on the same side of the output cut. Okay,
so the cut output by the contraction algorithm will have on one side both the
node from A and the node from B. Therefore, the output of the contraction
algorithm if this happens will be a different cut than (A, B), okay? It will not
output (A, B) if some edge of F is contracted.

>>And if you think about it, the converse is
also true. So let's assume now, that in each of the N-2 iterations, the contraction
algorithm never contracts an edge from capital F. Remember capital F are exactly
the edges with one endpoint in A and one endpoint in B. So if it never contracts
any edge of F, then it only contracts edges where both endpoints lie in capital
A or both endpoints lie in capital B. Well, if this is the case then, vertices
from A always stick together in the fused nodes, and vertices from B always stick
together in the fused nodes. There is never an iteration where a node from A and a node from B are fused together. What does that mean? That means that when the algorithm outputs its cut, all of the nodes in A have been grouped together, and all of the nodes in B have been grouped together, in each of the two super nodes, which means that the output of the algorithm is indeed the desired cut (A, B). Summarizing, the contraction algorithm will do what we
want. It will succeed and output the cut (A, B), if and only if it never chooses an
edge from capital F for contraction. Therefore, the probability of success, that is, the probability that the output is the distinguished min cut (A, B), is exactly the probability that the algorithm never contracts an edge of capital F.

>>So, this
is what we're gonna be interested in here. This really is the object of our mathematical
analysis, the probability that in all of the N-2 iterations we never
contract an edge of capital F. So, to think about that, let's think about each iteration in isolation, and actually define some events describing that. So for an iteration i, let Si denote the event that we screw up in iteration i. With this notation, we can succinctly say what our goal is: to compute the probability of success, what we wanna do is compute the probability that none of the events S1, S2, up to S(N-2) occurs. So, I'm gonna use this NOT (¬) symbol to say that S1 does not happen. So we don't screw up in iteration 1, we don't screw up in iteration 2, we don't screw up in iteration 3, and so on, all the way up to, we don't screw up, we don't contract anything from capital F, in the final iteration either. So, summarizing, analyzing the success probability of the contraction algorithm boils down to analyzing the probability of this event, the intersection of the events ¬Si, with i ranging from iteration 1 to iteration N-2.
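In symbols, the quantity we're after is:

    \Pr[\text{success}] \;=\; \Pr\big[\lnot S_1 \cap \lnot S_2 \cap \cdots \cap \lnot S_{N-2}\big]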
>>So we're gonna take this in baby steps, and the next quiz will
lead you through the first one, which is, let's have a more modest goal. Let's just
think about iteration 1. Let's try and understand, what's the chance we
screw up, what's the chance we don't screw up, just in the first iteration? So the
answer to this quiz is the second option. The probability is K/M, where K is the number of edges crossing the cut (A, B), and M is the total number of edges. And that's just because the probability of S1, the probability we screw up, is just the number of crossing edges, that is, the number of outcomes which trigger S1, divided by the total number of edges, that is, the total number of things that could happen. And since all edges are equally likely, it just boils down to this. And by the definition of our notation, this is exactly K/M.
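Written as a formula, with each of the M edges equally likely to be contracted:

    \Pr[S_1] \;=\; \frac{\text{number of crossing edges}}{\text{total number of edges}} \;=\; \frac{K}{M}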
So this
gives us an exact calculation of the failure probability in the first
iteration, as a function of the number of crossing edges, and the number of overall
edges. Now, it turns out it's gonna be more useful for us to have a bound, not quite as exact, an inequality, that's in terms of the number of vertices N rather than the number of edges M. The reason for that is, it's a little hard to understand how the number of edges is changing in each iteration. It's certainly going down by at least 1 in each iteration, because we contract one edge in each iteration, but it might go down by more than 1 when we delete self-loops. By contrast, the number of vertices is this very steady, obvious process: one less vertex with each successive iteration.

>>So, let's rewrite this bound in terms of the number of
vertices N. To do that in a useful way, we make the following key observation. I claim that, in the original graph G that we are given as input, every vertex has at least K edges incident on it; that is, in graph theoretic terminology, every vertex has degree at least K. Where, recall, K is the number of edges crossing our favorite min cut (A, B). So why is that true? Why must every vertex have a decent number of neighbors, a decent number of edges incident to it? Well, it's because, if you think about it, each vertex defines a cut by itself. Remember, a cut is just any grouping of the vertices into two groups that are not empty and that don't overlap. So one cut is to take a single vertex and make that the first group, A, and take the other N-1 vertices and make that the second group, capital B. So how many edges cross this cut? Well, it's exactly the edges that are incident on the first node, on the node on the left side. So if every single cut, all exponentially many cuts, has at least K crossing edges, then certainly the N cuts defined by single vertices have at least K crossing edges, and therefore the degree of each vertex is at least K. So the fact that every single cut in the graph has at least K crossing edges gives us a lower bound on the number of edges incident on each possible vertex.
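In symbols: for any vertex v, the cut ({v}, everything else) has exactly deg(v) crossing edges, and every cut has at least K crossing edges, so

    \deg(v) \;\ge\; K \quad \text{for every vertex } v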
>>So, why is that useful? Well, let's recall the following general fact about any graph, which is that if you sum up the degrees of the nodes, so if you go node by node, look at how many edges are incident on that node, that's the degree of v, and then sum them up over all vertices, what will you get? You'll get exactly twice the number of edges, okay? So this is true for any undirected graph: the sum of the degrees of the vertices is exactly double the number of edges. To see this, you might think about taking a graph, starting with the empty set of edges, and then adding the edges of the graph one at a time. Each time you add a new edge to the graph, obviously the number of edges goes up by 1, and the degree of each of the endpoints of that edge also goes up by 1, and there are, of course, two endpoints. So every time you add an edge, the number of edges goes up by 1 and the sum of the degrees goes up by 2. Therefore, when you've added all the edges, the sum of the degrees is double the number of edges that you've added. That's why this is true. Now, in the graph that we have at hand here,
every degree is at least K, and there are N nodes. So this left hand side, of course, is at least KN for us. So therefore, if we just divide through by 2 and flip the inequality around, we have that the number of edges M has to be at least KN/2, a lower bound on the degree of every vertex times the number of vertices, divided by 2. So this is just primitive rearranging of the inequality.
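Chaining the two facts together:

    2M \;=\; \sum_{v} \deg(v) \;\ge\; KN \quad\Longrightarrow\quad M \;\ge\; \frac{KN}{2}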
Putting this together with your answer on the quiz: since the probability of S1 is exactly K/M, and M is at least KN/2, if we substitute, we get that the probability of S1 is at most 2/N, 2 over the number of vertices, and the K cancels out.
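That substitution, written out:

    \Pr[S_1] \;=\; \frac{K}{M} \;\le\; \frac{K}{KN/2} \;=\; \frac{2}{N}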
So that's, sort of, our first milestone. We've figured out the chance that we screw up in the first iteration, that we pick some edge that crosses the cut (A, B). And things look good. This is a small number, right? In general, the number of vertices might be quite big, and this says that the probability we screw up is only 2 over the number of vertices. So, so far, so good. Of course, this was only the first iteration. Who knows what happens later?

>>So now that we understand the chances of screwing up in the first
iteration, let's take our next baby step, and understand the probability that we
don't screw up in either of the first two iterations. That is, we're gonna be interested in the following probability: the probability that we don't screw up in the first iteration nor in the second iteration. Now, if you go back to the definition of conditional probability, you'll realize that we can rewrite an intersection like this in terms of conditional probabilities. Namely, as the probability that we don't screw up in the second iteration, given that we didn't screw up in the first iteration, times the probability that we didn't screw up in the first iteration.
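In symbols:

    \Pr[\lnot S_1 \cap \lnot S_2] \;=\; \Pr[\lnot S_2 \mid \lnot S_1] \cdot \Pr[\lnot S_1]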
Okay? So this is the probability that we miss all of these K vulnerable edges in the second iteration, given that we didn't contract any of them in the first iteration. Now notice, we already have a good understanding of this from the previous slide; we derived a nice lower bound for it. We said there's a good chance that we don't screw up, probability at least 1-2/N. And in some sense we also have a
very good understanding of this probability. We know this is 1 minus the
chance that we do screw up. And what's the chance that we do screw up? Well, these K edges are still hanging out in the graph; remember, we didn't contract any in the first iteration, that's what's given. So there are K ways to screw up, and we choose an edge to contract uniformly at random, so the total number of choices is the number of remaining edges.
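So, exactly:

    \Pr[S_2 \mid \lnot S_1] \;=\; \frac{K}{\text{number of remaining edges}}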
>>Now, the problem is, well, what's nice is that we have an exact understanding of this probability; this is an equality. The problem is we don't have a good understanding of this denominator. How many remaining edges are there? We have an upper bound on this: we know it's at most M-1, since we got rid of one edge in the previous iteration. But actually, if you think about it, what we need for this quantity is a lower bound, and that's a little unclear, because in addition to contracting the one edge and getting that out of the graph, we might have created a bunch of self-loops and deleted those as well. So it's hard to understand exactly what this quantity is. So instead, we're gonna rewrite this bound in terms of the number of remaining vertices, and of course we know there are exactly N-1 vertices remaining: in the first iteration we took two of the vertices and contracted them down to 1. So how do we relate the number of edges to the number of vertices? Well, we do it in exactly the same way as in the first iteration. We'll
make a more general observation. In the first iteration, we observed that every node in the original graph induces a cut, with that node on one side and the other N-1 vertices on the other side. But in fact, that's a more general statement: even after we've done a bunch of contractions, any single node in the contracted graph, even if it represents a union of a bunch of nodes in the original graph, can still be thought of as defining a cut in the original graph. Right? So if
there's some super node in the contracted graph, which is the result of fusing
twelve different things together, that corresponds to a cut where those twelve
nodes in the original graph are on the one side A, and the other N-12
vertices are on the other side of the cut, B. So, even after contractions, as long as
we have at least two nodes in our contracted graph, you can take any node
and think of it as half of a cut, one side of a cut in the original graph.
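So every super node in the contracted graph defines a cut of the original graph and hence, since every cut has at least K crossing edges, has degree at least K in the contracted graph. Running the same degree-sum argument on the contracted graph with its N-1 super nodes gives the lower bound this observation is setting up:

    \#\text{remaining edges} \;\ge\; \frac{K(N-1)}{2} \quad\Longrightarrow\quad \Pr[S_2 \mid \lnot S_1] \;\le\; \frac{2}{N-1}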