0:03
Next we'll look at another classic algorithm for computing the MST called
Prim's algorithm. It's also an extremely simple algorithm to
state. What we're going to do now is start with vertex zero and we're going to grow
the tree one edge at a time, always keeping it connected.
The way we're going to do that is add to the tree the minimum weight edge that has
exactly one endpoint in the tree computed so far, and we'll keep doing that until
we've grown the whole v-minus one edge tree.
Let's look at a demo to see how that works.
So we start with vertex zero. And we're supposed to add them in minimum
edge that's connected to zero. So that's zero, seven Out of all the edges
connected to zero, that's the one of minimum weight.
So now we have one edge, two vertices on the tree.
And so now we want to add the min weight edge that connects to the tree.
In this case, that's seven, one out of all the edges that connect to the tree, one,
seven is the shortest one, so that's the one that we add.
So now we have two edges on the tree. Continuing now, the min weight edge that
connects to the tree is zero, two so, we add that one.
So, now we have three edges, four vertices on the tree.
The closest edge, the closest vertex to the tree or the smallest edge coming out
of the tree is two, three so we add that one.
So now we have three more vertices to go, and you can see that the next one that's
going to come is five. That's closer, to the three than four or
six. So we do that, add five, now there's two
more and so, out of all those edges, the closest one to the tree is 45.
It's a little shorter than four, seven and zero, four.
So that's the one that gets added and then finally six gets added to the tree by the
surest edge that connects it to the tree, which is six, two.
So start with vertex zero, add an edge at a time to the tree, it's the shortest edge
that goes from, a tree vertex to a non-tree vertex, that's Prim's algorithm.
Now let's look at Prim's algorithm running on, on the same huge graph that we ran for
Kruskal's. This is also a fascinating dynamic
process. Usually, the new edge, is, close to, the
last edge added. But every once in a while, it gets stuck.
And so, jumps to a new place, to add edges, to the MST.
This algorithm's a little bit easier to follow.
But it's a very interesting dynamic process.
3:02
You can see that it, when it's easy, it just sticks where it was.
And when it runs into some long edges, it gets stuck and tries somewhere else.
Always adding to the tree, the shortest edge that connects a non-tree vertex to a
tree vertex. And you can see the last few things to be
added where the vertices in the upper left corner.
So that's a visualization of Prinz algorithm.
Completely different character, but comes out to the same tree as Kruskal's
algorithm as long as the edge weights are distinct.
So we need to prove Prim's algorithm correct and this one has been rediscovered
a, a few times depending on how you cast the data structure for implementing
finding the minimum. But the basic algorithm has been known
since at least 1930 and it's proof that it computes the MST again comes because it's
a special case of the greedy MST algorithm.
So the, let's suppose that E is the min-win weight edge connecting the vertex
on the tree to a vertex not on the tree. Well, you just, you take, as your cut, the
tree vertices. There's no black crossing edge from the
tree vertices to non tree vertex. That's a definition that's not on the
tree. And there's no crossing edge of low
weight, cuz that's the minimum one. That we picked by design.
So it's a special case of the Greedy algorithm where you take, as the cut, the
set of vertices currently on the tree. That's Prim's algorithm.
Now how are we going to implement Prim's algorithm?
How are we going to find the minimum weight edge with exactly one point and T?
Well, one thing that we could do is just try all the edges.
And, maybe some early implementations that would do that.
But what we're going to do is use a modern data structure of priority q.
So we're going to keep the, edges, on a priority queue.
But, have exactly one end point in T. And then we can just, pick out the minimum
weight one. That's a so called, lazy implementation of
Prim's algorithm. We'll look at another one, called the
eager implementation afterwards. So, what we need to do is find the minimum
weight edge with exactly one endpoint in the tree.
So, the solution is to make a priority cue of the edges that have at least one end
point in the tree. And then we, using as priority, the key is
the edge and the priority is the weight of the edge.
And so, we're going to. Use delete min to find the next edge to add to the tree.
And then we have to update the priority queue.
When we consider that edge, now there's going to be some edges on the priority
queue that are obsolete, and we've already found better ways to connect them, so
we'll just disregard an edge that has both endpoints in the tree, we've already found
a way to connect them and we don't need that edge for the minimum spanning frame.
That's why it's called a lazy implementation.
We allow stuff to be on the priority queue that we know is obsolete.
And then when we pull it off the queue we test whether it belongs in the tree or
not. But then the key step in the algorithm is
to assume, is to, what do you do when you get a new vertex for the minimum pan in
tree and a new edge. So that means that one of the vertices is
on the tree, let's say that's v and the other one is not on the tree.
That means W. And so what we want to do is add W to the
tree but then we also want to add to the priority Q, any edge that's incident to W.
So that's got the possibility, as long as this other end point is not in the tree.
So those edges have the possibility of being minimum spanning tree edges in the
future unless some better way to connect their incident vertex to the tree is found
before they come off the queue. And that's the algorithm.
A lazy solution of Primm's algorithm. So lets take a demo of that.
So what we're going to do is start with a vertex and really grow the tree.
Add to T the mid-weight edge with exactly one end-point entry in the tree.
That's Primm's algorithm. But now we're going to show the data
structure, the priority Q, that allows us to do this.
By keeping all the edges that we know about that connect, that possibly could be
that edge on priority Q. So let's look at what happens, for our
sample graph. So we start at vertex zero,
That's fine. Now, we're going to add that, to the tree,
vertex zero to the tree, and we're going to put in the priority queue, all the
edges that are, incident to zero. And just for, the demo, we'll just show
the, edges sorted by weight, with the understanding that we have a heaped data
structure or something under there, to give us the smallest one.
But, for the demo, it's easiest to see, them sorted by weight.
Okay so then to greenly grow the tree we have to pick 0,7 off the priority queue.
But and so we'll show that one on the MST. And then the vertex that's not, on the
tree at that point is seven, so we're going to add seven to the tree.
So first we add zero, seven to be an MST edge, and then we add to the priority
queue all the edges that are incident to seven, that. All the edges into, incident
to seven that point to places, the vertices that are not on the tree that are
connected to vertices that are not on the tree.
So we don't put zero, seven back on'cause zero is already on the tree.
So we put all those on a priority queue. And, again, keep them sorted by weight.
So, now let's continue. So smallest thing is one, seven.
That's the smallest edge, to a from a tree edge to a non-tree edge.
And so that's, delete 1,7 from the priority queue and add it to the MSTs, so
we do that. And now that takes us to one and so now we
have to add to the priority queue, all the edges that, connect, one to non-tree
edges. So that's what the asterisks are, are the
new, new edges on the priority queue. And again, we keep ''em sorted by weight.
So now what we want on the priority queue is a subset of the, you wanna be sure that
every edge that connects a tree edge to a non-tree edge is on the priority queue.
We might have a few others as well, and we'll see that in, in a second.
So now 0,2's the smallest so we take 0,2 and add it to the MST.
So notice now that once we add two to the MST, this edge between one and two becomes
obsolete. It's never going to be added to the MST.
At the time that we put one on, we thought maybe that was a good way to get to two.
But now we know there's a better way to get to two.
So that edge becomes obsolete. And the lazy implementation just leaves
that edge on the priority Q. So let's now continue, so, we added, two
to the, 0-2 to the MST. And we have to add, everybody infinite the
two, that, is not on the tree, to the priority queue.
So, in this case, it's 2-3 and 6-2. We don't have to add 1-2 and 2-7, 'cause
they go to three edges. We don't add'em back on.
Okay, so now the smallest is two, three. So take that, add it to the MST.
And add all the edges incident to three to non-tree vertices, in this case it's just
three, six. And that's a long one.
So now the next edge for the MST is five, seven.
Take that off the priority queue, put it on the MST.
And so and now all peak, all edges incident to five that go to non tree
vertices. It says just five, four and that one.
12:04
So now the next edge that would come off the priority que is one, three but, that's
an obsolete edge. We already have one, three connected in
the MST because we were lazy and left that in the priority que.
So now we just pull it off and, and discard it, because it connects through
three vertices. Same with 1,5.
We already have a better way to connect them.
2,7 connects through three vertices. And finally we get to four, five.
4,5 now gets deleted. And from the priority queue, and add it to
the MST. Everybody connected to four, that's just
six, and that's a long one, goes on. Now we have some obsolete edges we'll get
to that one too. And then four, seven is obsolete, and
zero, four is obsolete. And finally, the last edge to get added to
the MST is six, two. And after deleting 6-2 from the priority
queue and adding the MST we have, computed the MST.
We have V minus one edges on V vertices, and that's, implementation of the lazy
version of Prim's algorithm. And it's just a version of Prim's
algorithm we showed was the underlying data structure, the priority queue, that
ensures that we always get the shortest edge connecting a tree vertex to a
non-tree vertex. So let's look at the code for Prim's
algorithm. Again, the data structures that we build
up in part one of this course, give us a very compact implementation of this MST
algorithm. So we're going to need three instance
variables. One is a existence array.
A vertex indexed array of bullions, that for each vertex.
Will tell us whether or not it's on the MST.
Then we have the, list of edges on the MST, that, is going to be, returned to the
client after the MST is computed, to, for iterable.
And then we we'll have the priority queue of edges that, connect, tree vertices with
non-tree vertices. The super-set of the edges that connect
tree vertices and non-tree vertices. So.
Given a graph we'll build a priority queue.
We'll initialize all the data structures. And then we'll visit, and we'll show what
the routine does. That's the one that processes each vertex
when it gets added to, to the tree. We'll look at that in the next slide.
So the main loop is, while the priority q is not empty, we pull off the minimum edge
from the priority q. We get it's constituent vertices, if their
both marked then we just ignore it. They're already on the MST.
Otherwise, we put the edge on the MST. And which ever vertex was not on the tree,
we visit and put on the tree. And the visit routine is the one that puts
the vertex on the tree and puts all of its indecent edges on the priority Q.
So to visit a vertex we set its entry, corresponding entry in the marked array to
be true, so it's, added to the tree. And then for, every edge, that's adjacent
to that, we're going to, if it's, other edge is not marked, we're going to put it
on the priority Q. So if it's an edge that goes from a tree
vertex to a non-tree vertex, we put it on the priority Q.
And then, we have the, client query method to, get the MST, once the, MST is built.
Again, the data structures that we've used give a very compact and complete
implementation of Prim's, Prim's algorithm.
What's the running time of the algorithm? Well, it's correct cuz it implements
Prim's algorithm as we've showed us in the sense of a greedy algorithm.
And it's easy to see that the running time is always going to be proportional to E
log E in the worst case. And that's because you could put all the
edges on priority Q and. So, every edge would, might, would have to come off the
priority cue, so that's e times, and then the cost would be proportional to e for,
inserting and deleting, every edge off the priority cue.
So, E log e is sorry, is a fine running time.
The space, extra space proportional to e is you know, might be considered annoying,
or inelegant, or inefficient so there's a more efficient version of Prim's algorithm
where we clear off that dead weight on the priority queue.
And that's the eager implementation of Prim's algorithm that we'll look at next.
In practice, the lazy implementation works pretty well, but the eager implementation
is all. So a very elegant and efficient algorithm,
and it's worth learning both. So for the eager solution, what we're
going to do is. The priority Q is going to have vertices.
And it's going to have, at most, one entry per vertex.
And so those are the vertices that are not on the tree but are connected by an edge
to the tree. And the priority of a given vertex is
going to be the weight of the shortest edge connecting that vertex to the tree.
So if we look at this little example here Where we've build the tree for zero, seven
and one then the Black. Entries in this are the ones, the edges
that are on the MST. So that's zero, seven and one, seven and
the red ones are the ones that are on the priority Q because they're connected by an
edge to some vertex that's on the tree. And for each one of them, there's a
particular edge that connects, that's shortest, that connects that vertex to the
tree. So that's the key for the priority Q.
So that's what we're, that's what we're going to want for at any time during the
execution of the algorithm. We're going to want the vertices that are
connected to the tree by one vertex. And, and we're going to know the shortest
edge connecting that vertex to the tree. So then, the algorithm is again, delete
the minimum vertex. And it's got an associated edge that
connects it to the tree, and we put that one on the tree.
And then we have to update the priority queue.
So why do we have to update the priority queue.
So we have. This vertex that's not on the tree, we
consider all of the edges that are incident to that vertex.
If they point to a tree vertex then we're going to ignore it.
That's no problem. If it's not already on the priority Q,
we're going to put that new vertex on the priority Q.
And then other thing is, if the vertex is on the priority Q and we just found a
shorter way to get to it, then we're going to have to decrease the priority of that
vertex. So, let's look at how that works in a
demo. So, again, it's just an implementation of
Prim's algorithm. It's just how to we get the min weight
edge that connects to the tree? And this is a, a more efficient way to do
it. So, again, we start out with our graph,
started zero, and, let's get going. So, now, the priority QS vertices, and so
there's four vertices that are just one edge away from the tree, and, we keep'em
on the priority queue, in order of their distance to the tree.
So, and we also keep the edge. Two vertice, vertex index arrays.
One is the edge that takes us to the tree, and the other is the length of that edge.
And again we'll keep sorted on the priority queue just to make the demo
easier to follow. So the next vertex to go on the tree is
seven. The next edge to get added to the tree is
edge two of seven. And then we go from there.
So that's the smallest one, we take that for the tree and now we have to update the
priority q. So, how do we update the priority q?
Well we have to look at, everybody incident the seventh, and so, let's look
at, so, for example, 7-2. We don't need to put 7-2 in the priority
queue, since we already have a better way to connect two to the tree, thats 2-0.
So we don't have to change anything. Same with 7-4.
And about 7-5, and 7-1. One and five are not on the priority
queue. So, we have to put them on the priority
queue. And, then save the edges in length that,
get them, to seven which would get them to the tree.
So now, on our priority queue, we have our current tree.
And we have all vertices that were within one edge of the tree.
And the edge that gets'em to the tree, and their length.
So, we're ready for another step of the algorithm.
22:30
So now seventeen is the smallest thing in the priority queue.
And so we put that on the MST. And now we look at everybody connected to
one. And again, one seven we [inaudible],
because it's on the tree. One five we don't need cuz we have a
shorter way to get to the tree. One two we don't need cuz we have a
shorter way, to get, two to the tree, but we haven't seen three yet so we add vertex
three to the priority cue and say we get to the tree by one, one three, distance
point two nine. Every, all the vertices within one edge of
the tree and the edge and length of the shortest edge that gets.
To the tree from that vertex, that's what we're maintaining at every step.
Okay so next vertex to come to the tree is two and so we put that on the tree and now
we look at everybody that connected to two, so now we have our first example of
decrease key but let's, let's check them all.
So two, zero. Two, seven, and two, one, take us to
vertices that are already on the tree. So it's two, three and two, six.
So what we need to do for three, we had thought that the best way to get to three
from the tree from three was to go to one. And adding this new edge two means we now
know a better way to get from three to the tree.
So we have to update the priority. Update the edge two, in the priority we
have to decrease the key of the priority. So that's an operation that we're going to
need from our priority Q. And it's something that has to be factored
into our priority Q implementation. And the same thing for six.
We thought we had a good way to get to the tree from zero but two brings six closer
to the tree so we have to update that information.
And say now the best way to get from six to the three is 6,2 and that it's length.
We have to decrease the key. And this definitely.
Involves summary shuffling in the priority queue and our implementation is going to
take that into account. So, with those changes now, we have the
following situation. And, we've got, four edges on the tree.
Three edges on the tree. Now we're going to add the fourth, which
is 2-3. That's the smallest thing on the priority
queue. So we add, two, three to the MST.
And now we have to go to the, things connected to three, And, well, there's
nothing to add, since we already have a better way to six.
So next one that gets added is 5-7. And check, so, edges from five, seven.
So. We from five, We're going to decrease the
key of four from.38 to.5.'Cause the best way to get from four to the tree is no
longer 4,0, it's 4,5. So again, decrease the key and discard the
longer edge to the tree. And in fact, that's the next edge that we
pick. And we don't bother putting forth six on,
'cause we already have a better way to get from six to the tree.
And then the last edge that we had was 6,2.
Again, it's the easy version of Prim's algorithm is an implementation that always
connects to the tree, the vertex that's closest to the tree.
But we use a more efficient data structure to do it.
That can only have one, at most one entry per vertex, as opposed to one entry per
edge. So that's the eager version of the Prim's
algorithm. Okay rather than focus on the code for the
eager version. Which is quite similar to the code for the
lazy version. We're going to talk briefly about the Key
data structure that we need to implement this.
And it's the implementation of the priority Q that allows us to decrease
keys. And so, this is a advanced version of the
priority Q that we talked about in part one of the course.
But it's necessary for algorithms like this.
So what we're going to do, The problem is. We have keys that the priority queue
algorithm doesn't really, Needs to know when we change values of keys, and so we
have to do that through the API. And so what we're going to do is allow the
client to change the key by specifying the index and the, and the new key.
And then the implementation will take care of changing the value and updating its
data structures to reflect the changed values.
You can't have the client changing values without informing the implementation.
The priority queue implementation. That's the basic challenge for this data
structure. So since we are working with vertex
indexed arrays and graphs. And, the priority queue implementation
might do the same. We'll just associate an index.
Kind of pass that idea onto the priority, queue.
To make it allow it to implement these operations officially.
So the constructor gets to know how many indices or how many keys there are going
to be at most ever in the priority Q. And so it can make use of that to
implement efficient data structures for the operations.
And so insert just associates a key with a given index.
Decrease key allows to decrease the keys associated with a given index we can
check. Whether that's a bug should be Mth K.
Is an index on the priority Q. We can remove a minimal key and give it's
associated index. Check whether it's empty and give its
size. Now it's pretty much a topic for part one
of the course, but we'll give just one slide here to show how these indeces kind
of help things go around. We're basically going to use the same
code, the heap based implementation of min PQ. But we'll keep parallel arrays that
allows us to access quickly all the information that we need.
So things on the queue are accessed by index, and so we'll keep the values in
keys, so that's where the keys are. So PQ of I will give the index of the key
that's in heap position I and QPI is the heap position of the key with index I.
So that is the information that you need when the client changes a value, you.
Need to get to the, you have to actually first of all change the value but then you
may need to adjust the heap to, To reflect the changed value.
So instead of a swim with an index, we use the, we get the heat position of the given
index, and so forth. So if you look in the book, you'll see the
code for index priority Q, and it's definitely a confusing topic for a
lecture, but it's important to realize that it's possible to implement this
decreased key operation in logarithmic time without ever having to search
through. Everything for anything, using the idea of
indexing. So with that change, We get, for all the
operations, we get time proportional to, log v.
And, what do we have to do for, the eager version of Prim's Algorithm?
We, we have to, might have to insert every vertex once, and delete min, every vertex
one. And we might do as many as E decrease key
operations. So that gives us a total running time of E
log V. But the main thing is that the amount of
space for the priority q, is. Only V not E.
And that can make a difference for a huge graph.
So there are modifications that you can make to this.
That give a more efficient running time. For example, one easy thing to do is to
use since the operation we're performing most often is the decreased key.
If we use a multi-way heap, like say a four way heap, or in general a D way heap.
Then that reduces the cost of that operation.
And it's. Slightly increases the cost for insert and
delete min but there's not many of those, so we can get the running time to log base
e over v of v. And that, if you do the math for various
types of graphs, that's, that's going to be faster.
And in fact a data structure called the Fibonacci heap was invented in the'80s
that actually gets the running time down to e plus v log b, although that data
structure's too. Complimented to actually, too complicated
to actually use in practice. If you have a, a dense graph.
Then you wouldn't even use a heap. You'd just use an array and find the
minimum by going through, everything. Since every vertex has, lots of connected
vertices. So we didn't consider that one.
Because the huge graphs that we find in practice are, are sparse.
And a binary heap is going to be much faster.
And if you really have a performance-critical situation, it's
worthwhile to do a four way heap. That's the, basic bottom line in the
running time of Prim's algorithm.