0:02

Now, we're going to look at depth-first search, which is a classical graph

Â processing algorithm. It's actually maybe one of the oldest

Â algorithms that we study, surprisingly. One way to think about depth first search,

Â is in terms of mazes. It's a pretty familiar way to look at,

Â look at it. And, so, if you have a maze like the one

Â drawn on the left, you can model it with a graph.

Â By creating a vertex for every intersection.

Â In the maze. And an edge for every passage connecting

Â two intersections. And.

Â So. If you're at the entrance to this maze and

Â you want to find a pot of gold somewhere. What you're gonna need to do is.

Â Explore every intersection. Or explore, explore every edge.

Â In the maze. So.

Â We're gonna talk about the. Explore every intersection.

Â Option. So that's our goal.

Â To have an algorithm for doing that. By the way, this is a famous graph that

Â some of you might recognize. That's the graph for the Pac-man game.

Â Okay, so one method classic method that predates computers for exploring a maze is

Â called the Tr maux maze exploration algorithm.

Â The idea is to think about having a ball of string.

Â And what you do is when you walk down a passage, you unroll the string behind you.

Â And you also, mark, every place that you've been.

Â So actually, I have a ball of string and some chalk, maybe.

Â So in this case, maybe we walk down this passage here.

Â And now we have some choices about where we might go.

Â So say we go down here. So we unroll our ball of string and mark

Â it. And so now, the next time, At this

Â intersection, we have no choice but to go up here.

Â We go up here and we say, oh, we've already been there.

Â So we're not gonna go there. And we come back.

Â And, we have our ball of string. So we can unroll it to figure out where we

Â were. And we go back until we have some other

Â choice. Which is this, this place, now.

Â And we mark that we've been in these other places, And so now, we take another option

Â and say, go down this way. And here, we take another option, go that

Â way. And then finally again we go up this a way

Â and we see that we've been there so we back up and take the last option and then

Â that gets us to the last vertex in the graph.

Â So mark each visited intersection and each visited package, passage, and retrace our

Â steps when there's no unvisited option. Again this is a classical algorithm that

Â was studied centuries ago and in fact some argued the first youth was when Theseus

Â entered the Labyrinth and was trying to find the Minotaur and, And Rodney didn't

Â want ''em to get lost in the maze. So she instructed Theseus to use a ball of

Â string to find his way back out. That's the basic algorithm that we're

Â gonna use. And has been studied by many, many

Â scientists in the time since Theses. And in fact, Claude Shannon, founder of

Â Information Theory, did experiments on mazes with mice to see if they might

Â understand maze exploration, this might help.

Â Okay, so here's what it look like in its typical maze.

Â 3:57

Now one of the things to remember is in a computer representation normally we're

Â just looking at the vertices and the set of associated edges.

Â We don't see anything other than that. Though, it's sometimes frustrating

Â watching me you know that it turned the wrong way and it's gonna get trapped here.

Â But, the computer doesn't really know that, so it has to back up along here now

Â and it continues to back up to find another option untill it gets free again.

Â And finds a some place to go. Sometimes its very frustrating.

Â Its seems to be quite close to the goal like appear and it turns a wrong way.

Â So we an see its gonna take a long way but no way the program could really know that.

Â Again, all the programs we're working with is vertex instead of edges associated with

Â that vertex and there it finally get to the goal.

Â Here's a bigger one going faster. The king thing is not so much, its getting

Â lost and the key thing is not going anywhere twice.

Â And that's the whole thing. We have to have the string to know to go

Â back where we came from. And we have to be able to mark where we

Â have been. And with those two things we are,

Â algorithm is, able to avoid going the same place twice.

Â If you weren't marking, if you tried to do this randomly or some other way it might

Â take you a while to get to the goal. So it doesn't seem like much of

Â accomplishment maybe for a maze but actually to be able to get there with

Â going, without going any place thrice, twice is sort of a, profound idea and

Â leads to an efficient algorithm. Okay.

Â So, our idea is, given in this, medicode, to do, depth research, that is, to, visit,

Â all the places you can get to from a vertex, V.

Â What we're gonna do is this simple re, recursive algorithm.

Â Mark the vertex as visited and then recursively visit all unmarked vertices, W

Â that are adjacent to V. That's a very simple description, and it

Â leads to very simple codes. It's so simple actually, it really belies

Â the profound idea underneath this algorithm.

Â So, again, There's lots of applications. And, for example, this is one way to find

Â whether there exists a path between two vertices.

Â Or to find all the vertices connected to a given source vertex.

Â And we'll consider some less abstract applications, once we've looked at the

Â code. So, so how to implement.

Â Well here's what we're gonna do for our design pattern for graph processing.

Â It's our first example. So what we did, when we defined an API for

Â graph was to decouple the graph data type from graph processing.

Â The idea is we're gonna create a graph object using that API which we know allows

Â us to represent a big graph within the computer.

Â And gives us the basic operations that we're gonna need for graph processing.

Â And then we use that API within a graph processing routine.

Â And the basic idea is that, that graph, graph processing routine will go through

Â the graph and collect some information. And then a client of that routine will

Â query the it's API to get information about the graph.

Â So in the case of, depth first search. Here's a potential possible API.

Â So the idea is that what this, what we're gonna implement is a program that can find

Â paths in a graph from a given source. So we give a graph and a vertex.

Â And that the constructor is gonna do what it needs in order to be able to answer,

Â these two queries. First one is, give a vertex,

Â Client will give a vertex. Is there a path in the graph, from the

Â source to that vertex? You wanna be able to, answer that

Â efficiently. And then,

Â The other thing is to just give the path. What's the path from, has to be giving all

Â the vertices on the path, in time proportional to its length.

Â So here's a client of, this, API. So, it's gonna take a source, a source

Â vertex S. And it's gonna build a pathfinder, or a

Â path object. And that object is gonna do the processing

Â it needs to be able to efficiently implement hasPathTo. And then what this

Â does is for every vertex in the graph. If there's a path from s to that vertex.

Â It'll print it out. So it prints out all the vertices

Â connected to x. And that's just one client, of this, data

Â type. You could, print out the pass or whatever

Â else you might. So that's our design pattern that we're

Â gonna use over and over again, for, A graph processing routine.

Â And it's important to understand why we use a design pattern like this.

Â We're decoupling the graph representation from the processing of it.

Â As I mentioned, there's hundreds of routines for, or algorithms that have been

Â developed for processing graphs. An alternative might be to put all of

Â those algorithms in one big data type. That's a so called fat interface.

Â And that would be a, a bad plan, cuz these things maybe are not so well related to

Â each other. And actually all of them really are just

Â iterating through the graph, and doing different types of processing.

Â With this way we're able to separate out. The, and I articulate what the graph

Â processing clients are doing. And then the real applications can be

Â clients, of these graph processing routines.

Â And everybody's taken advantage of an efficient representation, that we already

Â took care of. Okay.

Â So now let's look at a demo of how depth-first search is gonna work and then

Â we'll take a look at the implementation. Okay.

Â So here's a demo of depth-first search in operation on our sample graph.

Â Again, to visit a vertex we're gonna mark it, and then recursively visit all

Â unmarked verti-, vertices that are adjacent.

Â So this is a sample graph. And so the first thing we do is realize

Â that we're gonna need a vertex index array to keep track of which vertices are more.

Â So that will just be an array of bullions and we'll initialize that with all false.

Â We're also gonna keep another data structure.

Â A, a vertex indexed array of ints. That for every vertex gives us the vertex

Â that took us there. So, let's get started and you'll see how

Â it works. So this is depth-first search staring at

Â vertex zero. So now to visit vertex zero, we wanna mark

Â it so that's, our mark zero is true. That's the starting points we know

Â anything with Edge two. And now what we're gonna do is.

Â We're going to need to check all the vertices that are adjacent to zero.

Â So that's six, two, one, and five. The order in which they're checked depends

Â on the representations in the bag. We don'tr really, necessarily care about

Â that. Most of the algorithms are going to check

Â them all. And it doesn't matter that much about the

Â order. Although, in some cases it's wise to be

Â mindful. And maybe use a bag that takes them out in

Â random order. Okay this is zero.

Â Now we have to check, six, two, one, and five so, lets go ahead and do that.

Â So in this case six, six is the first thing to get checked.

Â And so now, we mark, six is visited so now we are going to recursively do a search

Â starting from six. The other difference when we visit six

Â from zero. We're going to put a zero in this edge to

Â entry to say that when we first got the six the way we got there, was from zero.

Â And that's going to be the data structure that'll help us, implement the client

Â query and give us the path back to zero from any path.

Â From any vertex. So okay what do we have to do to visit

Â six. Well six has two adjacent vertices zero

Â and four. So we're gonna have to check them.

Â So first we check zero and that's already marked.

Â So we don't really have to do anything. We're only suppose to recursively visit

Â unmarked vertices. And then we check four.

Â And four is unmarked, so we're going to have to recursively visit is.

Â The next thing we do is visit four. Mark four as having been visited.

Â Where the true and the marked array, Fourth entry is a marked array.

Â And we fill an edge two saying we got to four from six.

Â And so now to visit four, we have to recursively check five, six and three, and

Â again, that order is where they happen to be in our bag.

Â So first we check five. Five is not marked so we're going to visit

Â five. We're gonna mark it.

Â Say we got there from four, and then go ahead and visit three, four and zero, in

Â that order. From first we visit three.

Â That one also is not yet marked, so we're gonna recursively visit it.

Â So it's marked three. Say we got there from five and then go

Â ahead and to visit three recursively, we have to check five and four.

Â Check five. Well we just came here it's mark, so we

Â don't have to do anything. Check four, that's also, been marked so we

Â don't have to do anything. So now finally, this is the first time and

Â that requires a call that we're ready to return, we're done with that first search

Â from three. So now we're done with three.

Â And we can unwinding the recursion, we can now continue our search from five.

Â And the next thing we have to do from five, we had already checked three, so now

Â we're gonna check four. And we've already visited four, so we

Â don't have to do anything. That's already marked.

Â And we checked zero, and that one's already marked.

Â So now we're done with five, and we can back one more level up in the recursion.

Â So now, for four, we have to go through, and look at six and three.

Â Six is mar, marked, so we don't have to do anything.

Â Three is marked, so we don't have to do anything, and so we're gonna be done with

Â four. So that after finishing four we're done

Â with six. And so now we're in the recursion back at

Â zero. And we've already checked six.

Â So now we gotta check two next. We checked two, and so we rehearse and go

Â there. Mark two, and then say we got there from

Â zero, and now to visit two, all we check is zero and that's a marks, so we don't

Â have to do anything, and we're done with two.

Â Then check one, visit one, that's the last vertex we're visiting.

Â Check zero, it's already marked so we don't do anything.

Â We return. Now, we're at the last, step is to, from

Â zero, five is on it's list, we have to check if we've been there.

Â We can see that it's marked and we have been there so we're one with zero.

Â So that's a depth-first search from vertex zero, and we have visited all the vertices

Â that are reachable from zero. Number one and number two for each one of

Â those vertexes we kept track of how we got there from zero.

Â So if we now want to know for any one of those vertices how to get back to zero we

Â have the information that we need. For example say we want to find the path

Â from five back to zero. We know we got the five from four, we know

Â we got the four from six, we know we got the six from zero so we can go back

Â through using that edge to array to find. The path, so the depth for search

Â calculation built in data structures, and now clients, whose data structures built

Â in a constructor serve as the basis for, being able to.

Â Officially answer client queries. That's a depth-first search demo.

Â So, this is just a summary of the thing I talked about, during that demo.

Â Our goal is to find all the vertices connected to different vertex at.

Â And also a path, in order to be able to answer client query.

Â On the algorithm we're going to use is based on like maze exploration where we

Â use excursion, mark each vertex, keep track of the edge we took to visit it and

Â return when there's no unvisited options. We're using, two data structures, to

Â implement this. Both vertex indexed arrays.

Â One named Mark that will tell us which vertices we've been to.

Â And another one, edge two that maintains that tree of paths.

Â Where edge 2W = V means that VW was taken, the first time that we went to W.

Â So now, let's look at the code, that, given all of this background.

Â The code for implementing depth first search is remarkably compact.

Â So here's our private instance variables. The marked and edgedTo vertex and mix

Â arrays, and the source s. And the constructor just goes through and,

Â creates, the arrays and initializes them. And we won't repeat that code.

Â And so here's the, the last thing the constructor does after it creates the

Â arrays, is does a DFS on the graph, from the given source.

Â And it's a remarkably, compact implementation to do depth-first search,

Â from a vertex V. What we do is mark V, let's say mark it

Â true. Then for everybody adjacent to V.

Â We check if it's marked. If it's not marked, then we do a recursive

Â call. And we set edge to w equals v.

Â Again remarkably compact code that gets the job done.

Â 19:57

So now let's look at some of the properties of depth-first search.

Â So first thing is we wanna be sure that convince ourselves that it marks all the

Â vertices connected to S in time proportional to some of their degrees,

Â well, depth-first graph is going to be small.

Â So s-, so first thing is, convince yourself that if you mark the vertex, then

Â there has to be a way to get to that vertex from S.

Â So well that's easy to see, because the only way to mark vertex is get there

Â through a sequence of recursive calls, and every recursive calls corresponds to an

Â edge on a path from SVW. But you also have to be able to show that

Â you get to, every vertex that's connected to S.

Â And that's a little more intricate. And this diagram is, supposed to help you

Â out in understanding that. If you had, some unmarked vertex, Then,

Â maybe there's, a bunch of unmarked vertices.

Â And so... And it's connected to S and it's not

Â marked, so that means there has to be an edge on a path from S to W, that goes from

Â a marked vertex to an unmarked one. But the design of the algorithm says that

Â there's no such edge if you're on a marked vertex then you're gonna go through and

Â look at all the adjacent ones and if it's not marked, you're gonna mark it.

Â So that's an outline of the proof that DFS marks all the vertices.

Â And in the running time, it only visits each marked vertex once or each vertex

Â connected as once. And so, for each one of them, it goes

Â through all the adjacent vertices. So that's the basic properties of

Â depth-first search. So now, the other thing that is important

Â is that a client who has, uses this algorithm after the depth-first search,

Â after the constructor has done the depth-first search and built these data

Â structures, Client can find the vertices connected to

Â the source in constant time. And can find a path, 2S if one exists in

Â time proportional to its length. Well, the marked array provides the first

Â part. And the second part is Just the property

Â of the edge to array. It's a, what's called a parent link

Â representation of a tree rooted at S. So if a vertex is connected to S then its

Â edge two is parent in a tree. So this code here.

Â Is going to for a given, well has path too so that's just return mark, that's the

Â first part. And then to actually get the path to a

Â given vertex so, here's the code for doing that.

Â We actually use a stack to keep track of the path'cause we get it in reverse order.

Â If there's no path, we return null. Otherwise we keep a variable X and we just

Â follow up through the edge to array Pushing the vertex on to the stack and

Â then moving up the tree in the ray, then finally push, push as itself on to the

Â path and then we have a stack which is edible which will give us our path.

Â So that's in time, time proportional to the length of the path and forth while to

Â check your understanding of how stacks in real works, irreversible to take a look at

Â this code to see that it does the job. So that's depth-first search.

Â Now it's not. Of the optimal graph searching method for

Â all applications. And here's an, an amusing representation

Â of how depth first search can maybe create problems sometimes.

Â So, I'm getting ready for a date, what situations do I prepare for?

Â Well, medical emergency, dancing, food too expensive.

Â Okay, what kind of medical emergencies could happen?

Â Well, I could be snake bite or a lightning strike or a fall from a chair.

Â Well, what about snakes, I have to worry about corn snakes or garder.

Â Say for copperhead. And then well, I better make a straight.

Â I better study snakes. And then the date says, I'm here to pick

Â you up. You're not dressed.

Â And well, so I really need to stop using depth-first search.

Â So we're gonna look at other graph searching algorithms.

Â But if you always try to expand the next thing that you come to, that's depth-first

Â search. And there's a lot of natural, situations

Â where that naturally comes to mind. Here's another example.

Â I took this photo of the Taj Mahal a couple of years ago and I didn't like the

Â color of the sky. So I used Photoshop's magic wand to make

Â it more blue. And the implementation, now this is a huge

Â graph. This picture's got millions of pixels.

Â And the way that the flood filled the magic wand works, is to build, from a

Â photo, what's called a grid graph, where every vertex is a pixel and every edge

Â connects two pixels that are the same color, approximately the same color.

Â And it builds a blob of all the pixels that have the same color as the given

Â pixel. So when I click on one, it does the

Â depth-first search to find all. All the connected pixels and therefore

Â change them to the new color that's a fine example of depth per search on a huge

Â graph that people use everyday. So that's our first nontrivial graph

Â processing algorithm depth-first search.

Â