The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

Loading...

From the course by Stanford University

Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming

615 ratings

The primary topics in this part of the specialization are: greedy algorithms (scheduling, minimum spanning trees, clustering, Huffman codes) and dynamic programming (knapsack, sequence alignment, optimal search trees).

From the lesson

Week 1

Two motivating applications; selected review; introduction to greedy algorithms; a scheduling application; Prim's MST algorithm.

- Tim RoughgardenProfessor

Computer Science

So the third and final issue to think through is we need to make sure that we

pay the piper, that we keep these N variance maintained.

We know that if they're satisfied than we have this great way of finding the best

edge in each iteration, we just do an extractment But how do we make sure that

these N variance stay maintained throughout the algorithm?

So, to get a feel for the issues that arise in maintaining of the invariants,

in specific, invariant number two, and also to make sure we're all on the same

page with respect to the definition of key value of the vertices in the heap.

Let's go through an example. So in this example, I've drawn in the

picture a graph that has six vertices. in effect we've already run three

iterations of Prim's Algorithm, so four of the six vertices are already in

capital X, the remaining two vertices V and W are not yet an X, they're in V

minus X. So, for five of the edges, I've given

them a cost labeled in blue. For the other edges,

it's not relevant for this question what their edge costs are.

So you don't have to worry about it. So, the question is the following.

So given our semantics of how we define keys for vertices that are not in X, so

in this case the vertices V and W. What are their current key values

supposed to be? So those are the first two numbers I want

you to tell me. What's the current key value of V and W?

And then secondly, after we run one more iteration of Prim's algorithm.

Then what is the new key value of the vertex W supposed to be?

So the correct answer is the fourth one. Let's see why,

so first let's remember the semantics of keys.

What's the key supposed to be? It's supposed to be amongst, all the

edges. That on the one hand, are incidents to

the vertex. And on the other hand, are crossing the

cuts. It's the cheapest cost of any of those

edges. So, for the node V, there's four incident

edges with costs one, two, four, and five.

The one is not crossing the cut, the two, four, and five are crossing the cut.

The cheapest of those is two. So, that's why V's current key value is

two. For the node V, the node W, it has two

incident edges, a one and a ten. .

The one is not crossing the cut. The ten is.

It's the only candidate crossing the cut, so its key value is ten.

So the third part of the question says, what about when we execute one more

iteration of Prim's algorithm? So, what is Prim's algorithm going to do?

Well, it's going to move the edge with the smallest key from the right hand side

to the left hand side. V has a key value of two, w has a key

value of ten, so, V is going to be the one that gets moved from the right hand

side to the left hand side. So, once that happens, we now have a new

set capital X with a fifth vertex, V is now a member, so the new value of X is

everything except for the vertex W Now, the key point is that, as we've changed

the set capital X, the frontier has changed.

The current cut has changed. So of course, it's a different set of

edges that are crossing this new cut. Some have disappeared, and some are newly

crossing it. The ones that have disappeared are the

two and the four and the five. Anything between the vertex that got

moved that was already spanning, going to the left hand side has now been sucked

inside of capital X. On the other hand, the edge VW which was

previously buried internal to V minus X, with one of it's endpoints being pulled

to the left hand side. It is now crossing the cut.

So why do we care well the point is W's T value has now changed it use to have only

one incident edge crossing the cut the other across ten now with a new cut it

has two incident edges both the one and the ten are crossing the cut.

The cheapest of those two edges is of course the edges of cost one and that now

determines its key value its dropped from ten to one.

So the take away from this quiz is that well, on the one hand, having our heap

set up to maintain these two invariants is great, because a simple extract min

allows us to implement the previous brute force search in Prim's algorithm.

On the other hand, the extractions screws things up.

So it messes up the semantics of our key values.

We may, may need to recompute keys for the vertices.

So in this next slide I'm going to show you the piece of pseudocode you'd use to

recompute keys in the light of an evolving frontier.

Fortunately, restoring in varient number two after an extract min is not so

painful the reason being is that the damages done by an extract min are local.

More specifically, let's think about what are the edges that might be crossing the

cut now that were not previously crossing the cut?

Well the only vertex whose membership in these sets has changed is V so they have

to be edges that are incident to V. If the other end point was already in X

then we don't care this edge has just been sucked into X, we never have to

worry about it. But if the other end points, so if this

edge is incident to v if the other end point w is not an x.

Then with V being pulled over the, the left hand side.

Now this edge spans the frontier when previously it did not.

So the edges we care about are incident to V with the other N point outside of X.

And so our plan is just the obvious one, which is for each dangerous vertex.

Each vertex incident to V where the other endpoint W is not an X.

We just follow to the other endpoint W, and we just recompute its key, and we

just do that for all of the relevant W's. So that recomputation necessary is not

difficult, there's basically two cases. So this other end point W now it has one

extra candidate edge crossing the cut. Namely the one that's also incident on V.

The vertex that just moved. So I did this new edge VW is the cheapest

local candidate for W, or it's not. And we just take the smaller of those two

options. So that completes the high level

description of how you maintain invariants one and two throughout this

heap-based implementation of Prim's algorithms.

So each iteration, you do an extract min, from the extract min you run the

pseudocode to restore invariant number two, and you're good to go for the next

iteration. So for those of you who want not just the

conceptual understanding of this implementation, but really want to get

down to the any degree. You want to dot all the I's and cross all the T's. A

subtle point you might want to think through is how it is you implement this

deletion from a heap. The issue is, is deletion from a heap is

generally from a given position. And so here I'm only talking about

deleting a vertex from a heap, that doesn't quite tight check.

Really what you want to see is delete the vertex at position I from a heap.

So really pulling this off, the natural way to do it is have some

additional bookkeeping to remember which vertex is at which position in the heap.

So again, for the detail-oriented amongst you that's something to think through,

but this is the complete conceptual description of the algorithm.

Let's now move on to the final running time analysis.

So the first claim is that, the non-trivial work of this algorithm all

takes place via heap operations. That is, it suffices to just count the

number of heap operations, each of which we know is done in logarithmic time.

Okay, so let's count up all of the heap

operations. One thing we already talked about,

but I'll mention it here again for completeness is we do a bunch of inserts

just to initialize the heap in a pre-processing step.

So after we initialize, we move on to the main while loop.

Remember, there's exactly N minus one iterations of that while loop.

And in each one, we extract min exactly once.

So these were the easy steps. What you should be concerned about.

Are, the, heap operations, the deletions and re-insertions that are triggered by

needing to decrease the key avertices that are not in X.

Indeed, in a single iteration of Prim's algorithm, in a single move of a vertex

inside of capital X, can necessitate a large number of heap operations.

So, it's important to think, to count these operations in the right way, namely

in a edge-centric manner and the claim is that a single edge of the graph is only

going to trigger a single decrease key pair of.

Operations a single insertion deletion combo.

We can even pinpoint the moment in time at which we're going to have this inser,

this deletion and reinsertion. It's going to be when the first of the

endpoints, so either V or W, the first iteration at which one of those gets

sucked into the left-hand side capital X, that's going to trigger the

insert-delete, potentially for the other endpoint.

When the second endpoint gets sucked into the left-hand side, you don't care,

because the other endpoint has already been taken out of the heap,

there's no need to maintain its key. So that means that the number of heap

operations is almost twice the number of vertices plus almost twice the number of

edges. We're again going to use this fact that

the input graph is connected and therefore the number of edges is

asymptotically at least the number of vertices.

So we can say that the number of heap operations is at most a constant factor

times the number of edges, M. As we've discussed every heap operation

runs in time logarithmic in the number of objects in the heap so that's going to be

Log N in this case so we get an overall running time of O of M times Log N.

So this is now a quite impressive running time for the really quite non-trivial

minimum cost spanning tree problem. Of course we'd love to do even better.

If we could shave off the Log N factor and be linear in the input, that would be

even more awesome. But we gotta feel pretty good about this

running time. Right?

This is only a Log N factor slower than what it takes to read the input.

This is the same kind of running time we're getting for sorting.

So this actually puts the minimum spanning tree problem into the class of

four free primitives. If you have a graph and it fits in the

main memory of your computer, this algorithm is so fast.

Maybe you don't even know why you care about the minimum spinning shaver graph.

Why not do it? It's basically cost-less.

That's how fast this algorithm is.

Coursera provides universal access to the world’s best education,
partnering with top universities and organizations to offer courses online.