0:00

So, in this set of lectures we'll be discussing the minimum cut problem and

Â graphs and we'll be discussing the randomized contraction algorithm.

Â Randomize algorithm which is so simple and elegant and it's almost impossible to

Â believe that it can possibly work but that's exactly at what we'll be approving.

Â So one way you can think about these set of lectures, as a segue of sorts,

Â between our discussion of randomization and our discussion of graphs.

Â So we just finished talking about randomization in the context of sorting

Â and searching.

Â We'll pick it up again toward the end of the class when we discuss hashing.

Â But while we're in the middle of randomization probability review,

Â I'm going to give you another application of randomization

Â in a totally different domain.

Â In particular to the domain of graphs, rather than to sorting and searching.

Â So that's one high level goal of these lectures.

Â The second one, is we'll get our feet wet talking about graphs, and

Â a lot of the next couple weeks,

Â that's what we're going to be talking about, fundamental graph primitives.

Â So this will give us an excuse to start warming up with the vocabulary,

Â some of the basic concepts of the graphs and what a graph algorithm looks like.

Â 0:55

Another perk, although it's not one of the main goals, but I want to do,

Â I do want to point this fact, compared to most of this stuff that we're discussing

Â in this class, this is a relatively recent algorithm, the contraction algorithm.

Â By relatively recent I mean, it's 20 years old.

Â 1:14

But at least that means most of us, I know not all of us, but

Â most of us at least were born at the time that this algorithm was invented.

Â And so just one quick digression.

Â In an intro course like this, most of what we're going to cover are oldies but

Â goodies, stuff from as much as 50 years ago.

Â And while it's kind of amazing, given how much the world and

Â how much technology has changed over the past 50 years,

Â that ideas in computer science from that long ago are still useful, they are.

Â It's just sort of an amazing thing about the stuff that

Â the first generation of computer scientists figured out.

Â It's still relevant to this day.

Â That said, algorithms is still a vibrant field with lots of open questions.

Â And when I have an opportunity, I'll try and give you glimpses of that fact.

Â So I do want to point out here that this is a somewhat more recent algorithm

Â than most of the other ones we'll see, which dates back to the 90s.

Â 2:49

And there's two flavors of graphs and both are really important.

Â Both come up all the time in applications, so you should be aware of both kinds.

Â So there's undirected graphs and directed graphs and

Â that just depends on whether the edges themselves are undirected or directed.

Â So edges can be undirected by which I mean this pair is unordered.

Â There are just two vertices of an edge the two endpoints, say U and V,

Â and you don't distinguish one as the first and one as the second.

Â 3:39

Now I think all of this is much clearer if I just draw some pictures.

Â Indeed one use to call graphs, dots and lines.

Â The dots would refer to the vertices, so here's four dots, or four vertices.

Â And the edges would be lines,

Â so the way you denote one of these edges is you just draw a line between the two

Â end points of that edge, the two vertices that it corresponds to.

Â So this is undirected graph with four vertices and five edges.

Â We can equally we'll have a directed version of this graph.

Â So let's still have four vertices and five edges, but

Â to indicate that this is directed graph and then each edge was first vertex and

Â the second vertex, were going to add arrows to the line.

Â So the arrow points to the second vertex, or to the head of the edge.

Â So, the first vertex is often called the tail of the edge.

Â So, graphs are completely fundamental,

Â they show up not just in computer science but in all kinds of different disciplines,

Â social sciences and biology being two prominent ones.

Â So, let me just mention a couple of reasons you might use them just off

Â the top of my head but literally there's hundreds or thousands of others, so

Â a very literal example would be road networks.

Â So imagine you type in asking for your driving directions from point A to point B

Â in some web application or software, or whatever, it computes a route for you.

Â What it's doing, is it's manipulating some representation of a road network,

Â which inevitably is going to be stored as a graph, where the vertices corresponds to

Â intersections and the edges correspond to individual roads.

Â The Web is often fruitfully thought of as a directed graph, so

Â here the vertices are the individual web pages, and edges correspond to hyperlinks.

Â So the first vertex in an edge detail is going to be the page that contains

Â the hyperlink.

Â The second vertex, or the head of the edge,

Â is going to be what the hyperlink points to.

Â So that's the Web as a directed graph.

Â Social networks are quite naturally represented as graphs.

Â So here the vertices correspond to the individuals in the social network.

Â And the edges correspond to relationships.

Â They have friendship links.

Â I encourage you to think about among the popular social networks these days,

Â which ones are undirected graphs and which ones are directed graphs,

Â we have some interesting examples of each of those..

Â 6:06

So to say what I mean, you might think about,

Â let's say you're a freshman in college and you're looking at your majors,

Â you're a science major and you want to know what courses to take in what order.

Â And you can think about the following graph where each of the courses in your

Â major corresponds to a vertex And you draw a directed edge from course A to course B,

Â if course A is a prerequisite for course B.

Â That is, it has to be completed before you begin course B.

Â Okay, so that's a way to represent dependencies, sort of a temporal ordering,

Â between pairs of objects using a directed graph.

Â 6:51

So, the definition of a cut of a graph is very simple, it's just a grouping,

Â a partition of the vertices of the graph into two groups, A and B, and

Â both of those two groups should be non-empty.

Â So, to describe this in pictures,

Â let me give you a cartoon of the cut in both the undirected and directed cases.

Â So for an undirected graph, you can imagine drawing your two sets, A and B.

Â And once you've defined the two sets A and

Â B, the edges then fall into one of three categories.

Â You've got edges with both of the endpoints in A.

Â 8:12

Usually when we talk about cuts,

Â we're going to be concerned with how many edges cross the given cuts.

Â And by that I mean the following, the crossing edges of a cut (A,B)

Â are those that satisfy the following property.

Â So in the undirected case,

Â it's exactly what you think it would be, one of the endpoints is an A,

Â the other endpoint is in B, that's what it means to cross the cut.

Â 8:34

Now in the directed case, there's a number of reasonable definitions you could

Â propose, about which edges crossed the cut.

Â Typically and in this course, we're going to focus on the case

Â where we only think about edges that cross the cut from the left to the right, and

Â we ignore edges which cross from the right to the left.

Â So that is the edges that cross the cut are those with tail in A and head in B.

Â So referring to our two pictures, our two corrections of cuts for the underrated

Â one all three of these blue edges would be the edges crossing the cut AB.

Â because they're the ones that have one end point on the left side and

Â one end point on the right side.

Â Now for the directed one, we only have two crossing edges.

Â 9:36

Recall what is the definition of a cut, it's just a way to group

Â the vertices into two sets A and B, both should also be not empty.

Â So we have N vertices and essentially we have one binary degree of freedom for

Â each, for each vertex, we can decide whether or not it goes in set A or

Â it goes in set B, so two choices for each of the N vertices, that gives us a two

Â to the N possible choices, two to the N possible cuts overall.

Â Now that's slightly incorrect because we call that a cut.

Â You can't have a non empty set A or a non empty set B, so

Â those are two of the two to the N options which are disallowed.

Â So strictly speaking the number is two to the N minus two, but

Â two to the N is certainly the closest answer of the four provided.

Â 10:17

Now, the minimum cut problem is exactly what you'd think it would be.

Â I give you as input a graph and among these exponentially, many cuts,

Â I want you to identify one for me with the fewest number of crossing edges.

Â So a few quick comments, so first of all the name for this cut is a min cut.

Â A min cut is one with the fewest number of crossing edges.

Â Secondly, to clarify, I am going to allow in the input what's called parallel edges.

Â There will be lots of applications where parallel edges are sort of pointless, but

Â for minimum cut actually it's natural to allow parallel edges.

Â And that means you have two edges that correspond to exactly the same

Â pair of vertices.

Â 11:00

Finally, the more seasoned programmers among you are probably wondering what I

Â mean by, you're given the graph as input.

Â You might be wondering about how exactly that's represented, so

Â the next video's going to discuss exactly that,

Â the popular ways of representing graphs and how you're usually going to do it in

Â this course, specifically via what's called an adjacency list.

Â 12:07

And I'll leave it for

Â you to check that there's no other edge that has as few as two edges.

Â Now in this case, we got a very balanced split when we took the minimum cut.

Â In general, that need not be true.

Â Sometimes even a single vertex can define the minimum cut of a graph, and

Â I encourage you to think about a concrete example that proves that.

Â So why should you care about computing the minimum cut?

Â Well, this is one problem among a genre called graph partitioning,

Â where you're given a graph and you want to break it into two or more pieces.

Â And these kinds of graph partitioning problem comes up all the time,

Â in a surprisingly diverse array of applications.

Â So let me just mention a couple at a high level.

Â So one very obvious one when your graph is representing its physical network,

Â when identifying something like a min cut allows you to do,

Â is identify weaknesses in your network.

Â Perhaps it's your own network, and

Â you want to understand where you soup of the infrastructure because it's,

Â in some sense, a hot spot of your network or a weak point.

Â Or, maybe there's someone else's network and

Â you want to know where the weak spot in their network.

Â In fact, there are some declassified documents about 15 years ago or so.

Â Which showed that the United States and Soviet Union militaries,

Â back during the Cold War, were actually quite interested in computing

Â minimum cuts, because they were looking for things like, for example, what's

Â the most efficient way to disrupt the other country's transportation network?

Â 13:37

So the question is among the huge graph,

Â say the graph of everybody who is on Facebook or something like that.

Â How can you identify small pockets of people that seem tightly knit,

Â that seem closely related,

Â from which you like to infer that there are community of some sort?

Â Maybe they all go to the same school, maybe they all have the same interest,

Â maybe they're part of the same biological family whatever.

Â Now, it's to some degree still an open question how to best define communities

Â and social networks.

Â But as a quick and dirty sort of first order heuristic,

Â you can imagine looking for small regions, which on the one hand, are highly

Â interconnected among themselves, but quite weakly connected to the rest of the graph.

Â 14:43

And there's a graph, which is very natural to define, given a 2D array of pixels.

Â Namely, you have an edge between two pixels if they are neighboring.

Â So for two pixels that are immediately next to each other left and

Â right or top to bottom, you put an edge there.

Â So that gives you what's called a grid graph.

Â And now unlike the basic minimum cut problem that we're talking about here,

Â in image segmentation it's most natural to use edge weights.

Â Where the weight of an edge is basically how likely you expect those two

Â pixels to be coming from a common object.

Â 15:14

Why might you're expect to enabling pixels to come from the same object,

Â well perhaps their color maps were almost exactly the same and

Â you just expected that they're part of the same thing.

Â So once you've defined the screen graph which suitable edge ways now you run

Â a graph partitioning or maybe cut type separate team, and the hope is that

Â the cut that it identifies rips off one of the contiguous objects in the picture.

Â And then you do that a few times and

Â you get the major objects in the given picture.

Â