0:00

In this video, and then in the next, we're going to develop, from scratch, an algorithm for the all-pairs shortest path problem, rather than relying on the aforementioned reduction to the single-source version of the problem. So, in this video, we're going to develop the relevant optimal substructure lemma. The next video will culminate in the dynamic programming algorithm. It is a famous one: the Floyd-Warshall algorithm. Before developing the optimal substructure, let me just tell you a little bit about where we're going. The optimal substructure we're about to identify will lead to a dynamic programming solution via the usual recipe, which we've now seen over, and over, and over again. It's going to solve the all-pairs shortest paths problem in O(n cubed) time. So we get a bound independent of the sparsity of the graph: n cubed, even if the graph has negative edge lengths. This algorithm is known as the Floyd-Warshall algorithm.

Â 0:50

Last video we discussed the performance that you get by running a single-source shortest path subroutine n times, and the Floyd-Warshall algorithm is a better solution in many situations. So, first of all, for graphs that have general edge lengths, that is, edge lengths that are allowed to be negative, Floyd-Warshall is looking pretty good. Remember, the previous solution was to run the Bellman-Ford algorithm n times. We can't use Dijkstra's algorithm because that's not generally correct with negative edge costs. And if you run the Bellman-Ford algorithm n times, you get a running time of O(n squared times m). So even in the best case for Bellman-Ford, sparse graphs, we're doing just as well with Floyd-Warshall; we've got a cubic running time. And we're doing quite a bit better for dense graphs: there, n times Bellman-Ford will give us n to the fourth, a quartic running time, whereas here we're getting cubic.

Now, for graphs with nonnegative edge costs, reducing to the single-source problem is actually a pretty good solution, because Dijkstra is so fast. So if you run Dijkstra n times, your running time is going to be O(n times m times log n). So for sparse graphs you'd really want to use Dijkstra n times, rather than Floyd-Warshall, because you get a running time roughly quadratic in n, as opposed to the cubic running time we're going to get here. For dense graphs, it's not so clear. If you run Dijkstra n times, you're going to get a running time which is ballpark cubic, so that's going to be roughly the same as Floyd-Warshall, and you'd expect them to be comparable. It's not clear which would be better in practice. If you had to choose between those two solutions for dense graphs, you'd really want to code them both up and pick whichever one seemed to do better in your domain.

One place you see the Floyd-Warshall algorithm used quite a bit is for computing the transitive closure of a binary relation. In graph-theoretic language, you can think of the transitive closure problem as computing all-pairs reachability. So that's a special case of all-pairs shortest paths where you just want to know whether the shortest path distance is finite or infinite. For each pair of vertices, you want to know: does there exist a path from the one vertex to the other or not? And if all you care about is the transitive closure problem, if all you care about is one bit of information for each pair of nodes, connected or not, as opposed to more generally what the shortest path length is, then you can do some optimizations on the Floyd-Warshall algorithm to speed up the constant factors. But I'll leave it up to you to read up on the web about that if you're interested.
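To make the transitive-closure connection concrete, here is one possible sketch of the boolean specialization just described, with my own function and variable names. Instead of tracking shortest path lengths with min and plus, we track a single reachability bit per pair of vertices (vertices labeled 0 through n-1 here for convenience):

```python
def transitive_closure(n, edges):
    """All-pairs reachability: reach[i][j] is True iff some path goes i -> j.
    n: number of vertices, labeled 0..n-1; edges: list of (u, v) pairs."""
    reach = [[False] * n for _ in range(n)]
    for v in range(n):
        reach[v][v] = True           # every vertex reaches itself (empty path)
    for u, v in edges:
        reach[u][v] = True           # direct edges
    # Allow vertex k as an intermediate node, for k = 0, 1, ..., n-1.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:          # constant-factor optimization: skip the
                for j in range(n):   # inner scan when i can't even reach k
                    if reach[k][j]:
                        reach[i][j] = True
    return reach
```

The `if reach[i][k]` guard is one example of the constant-factor speedups mentioned above: it skips the innermost loop entirely for rows that cannot reach k.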

Â 3:02

Now, I realize that this cubic running time is maybe not super impressive. In general, as we've been talking about dynamic programming algorithms, we've been starting to see, for the first time in these courses, a proliferation of algorithms which, you know, while polynomial time, while way better than exponential time, you wouldn't really call blazingly fast. We've seen a lot of quadratic running times. Now we're seeing some cubic running times. So that's kind of a bummer, but I am still teaching you the state of the art. As we mentioned in the last video, it remains an open question to this day whether there are significantly subcubic algorithms that solve the all-pairs shortest paths problem. So now, let's formalize the optimal substructure in all-pairs shortest paths which is exploited by the Floyd-Warshall algorithm.

A quick digression first, though. At the beginning of the class, I hope that, you know, if I had shown you this Floyd-Warshall algorithm, you would have said: wow, that's a really elegant algorithm, how could I have ever come up with this? Whereas now that you're approaching the black-belt level in dynamic programming, and you see just how mechanically this famous algorithm is going to fall out of the really quite easy optimal substructure, I hope you say: how could I not have come up with this algorithm? Now, I don't mean to take any credit away from the solutions of Floyd and Warshall. There is one really great idea in how they phrase this optimal substructure. Remember, we talked about this before the Bellman-Ford algorithm: it can be tricky to apply dynamic programming to graph problems because there isn't really an ordering in the input. You just get this bag of vertices and this bag of edges, and it's often unclear how to define bigger and smaller subproblems.

Â 4:39

So the really nice solution in the Bellman-Ford algorithm was to introduce this extra parameter i. i was a budget on the number of edges, or roughly equivalently the number of vertices, that you are allowed to use in a path between the origin and destination for a given subproblem. This naturally induces an ordering on subproblems: the bigger the edge budget, the bigger the subproblem. We're going to do something similar, but even a little bit more stringent, in the Floyd-Warshall solution. We're again going to restrict the number of vertices that can appear in the middle of the path between a given origin and destination in a subproblem. But more than just the number of the vertices we're allowed to use, we're going to restrict exactly which vertices, the identities of the vertices, that the algorithm is permitted to use to get from the given origin to the given destination.

Â 6:12

So let me start now setting up the optimal substructure lemma. In contrast to the Bellman-Ford case, I'm only going to prove the optimal substructure lemma for input graphs that have no negative cycle. We'll address the case of negative cycles after we're done with the algorithm. Suppose for now there isn't one. So what, then, are the subproblems going to be? Well, it's going to be quite analogous to Bellman-Ford. In Bellman-Ford, we were solving a single-source shortest path problem, so we needed to compute something for each destination. So that gave us a linear number of subproblems, and then, for a given destination, we had this parameter i controlling the edge budget. So that was another linear factor, so we had a quadratic number of subproblems. Here it's going to be the same, except we also have to range over our own origin. So we're going to get a cubic number of subproblems: specifically, a choice for the origin, that's n choices; a choice for the destination, that's n more choices; plus which prefix of vertices we're allowed to use, so that's still another n choices. So cubic total. So now we focus on an optimal solution to this subproblem. By which I mean: amongst all paths which begin at the vertex i, end at the vertex j, and, strictly in between i and j, contain as internal nodes only vertices from 1 through k, look at the one with the shortest length. And because we're looking at the case of no negative cycles, we can assume that this path has no cycles; it's a cycle-free path.

So, rather than state the conclusion of the optimal substructure lemma right now, let me just make sure you understand what one of these subproblems is. So, let's suppose that the origin I is the vertex that we've arbitrarily labeled seventeen, the destination J is the vertex we've labeled ten, and let's suppose K at the moment is only five. Consider the following graph; imagine this is a little piece of some much bigger graph. Now, I hope it's clear what the shortest path is between seventeen and ten: it's the bottom two-hop path, and that path has total length minus twenty. On the other hand, for the subproblem we're dealing with, K equals five. So what that means is we have this extra constraint on which vertices we can use in the middle of our paths. Now, to be clear, this constraint, at most five, can't apply to the origin or the destination, right? I is 17, J is 10, both of those are bigger than five, but we don't care. Obviously, any path from I to J has to include both the vertices I and J. So the constraint only applies to vertices in the middle of the path. But this bottom two-hop path unfortunately makes use of the node seven. So that path does not obey the constraint, seven is not at most five, so that path is disallowed; we cannot use it. Therefore, the shortest path from seventeen to ten that makes use of only the first five labeled vertices as intermediate ones would be the top three-hop path, which has the bigger length of three. Okay, so I hope it's clear what a subproblem corresponds to. It corresponds to a choice of I, the origin; that's again just some vertex labeled something from one to N. Similarly, there's a choice of the destination, some other vertex J, which is labeled something from one to N. And then there's a choice of the bound K, which governs which vertices you're permitted to use internal to your path. So, again, the constraint doesn't apply to the endpoints, just the vertices in between the endpoints. And in the subproblem, the choice of K says you can only use vertices one through K internal to your paths. So, hopefully that makes sense; let's move on to the full statement of the optimal substructure lemma.
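To pin down the subproblem definition, here is a hedged brute-force sketch (my own hypothetical function name, vertices labeled 1 through n) that computes a subproblem's value directly, by enumerating every cycle-free path from i to j whose internal nodes are restricted to 1 through k. It's exponential time, so it's only for sanity-checking the definition on tiny examples, not an algorithm:

```python
from itertools import permutations

def restricted_shortest(dist, i, j, k):
    """Shortest i -> j path length using only vertices 1..k as internal nodes,
    by brute force over all cycle-free paths. dist[u][v] is the edge length
    from u to v (float('inf') if there is no such edge)."""
    allowed = [v for v in range(1, k + 1) if v not in (i, j)]
    best = dist[i][j]  # the direct edge, if any: the path with no internal nodes
    # Try every ordered, repetition-free choice of internal nodes from 1..k.
    for r in range(1, len(allowed) + 1):
        for middle in permutations(allowed, r):
            path = (i,) + middle + (j,)
            length = sum(dist[u][v] for u, v in zip(path, path[1:]))
            best = min(best, length)
    return best
```

On a small analogue of the example above, with a cheap two-hop path through a high-labeled vertex, raising k from 3 to 4 unlocks that path and drops the answer, mirroring how the disallowed node seven excluded the length minus-twenty path.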

Â 9:42

All right, so this is actually going to be one of the simpler optimal substructure lemmas that we've seen in a while. We're back to the glory days of only two cases. The first is trivial: if the shortest path from I to J using only vertices one through K as intermediaries doesn't even bother to use the vertex K, well, then it had better be a shortest path from I to J that uses internal nodes only from one through K minus one. Now, it's in case two that it's going to be apparent what a great idea ordering the vertices and looking at prefixes of the vertices is when you're considering the all-pairs version of the shortest path problem. Suppose this path P from I to J does indeed use the vertex K in the middle. Well, then we can think of the path P as comprising two subpaths: the first path, P1, which starts at I and goes to the vertex K, and then a path P2, which starts at K and goes to the destination J. Now here's what's cool. On this path P, the internal nodes between I and J all lie in one through K. Moreover, this path P, remember, is cycle-free, so the vertex K appears exactly once, not more than once. So, therefore, if we split the path P into these two parts, P1 and P2, then internal to P1, strictly in between I and K, there are only vertices between one and K minus one. Similarly, strictly between K and J in P2, there are only vertices between one and K minus one. Thus both of these paths, P1 and P2, can be thought of as feasible solutions to smaller subproblems, subproblems with an even tighter budget of K minus one on the internal nodes that are allowed. And these aren't just feasible solutions for smaller subproblems, they're optimal solutions as well. So, how slick is that? The maximum-indexed internal node just splits this shortest path into two shortest paths for smaller subproblems. And, you know what? At this point, I am so confident in your facility with dynamic programming, I am not going to even bother to prove this statement. You are totally capable of proving this yourself. The argument is really quite similar to Bellman-Ford; I encourage you to do it as an exercise.

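Getting slightly ahead of the next video, the two cases of the lemma translate almost mechanically into code: for each subproblem, the answer is the better of "skip vertex k" and "split at vertex k." Here is a minimal sketch of where that leads, with my own variable names, vertices labeled 0 through n-1, and the lemma's hypothesis of no negative cycles assumed:

```python
def floyd_warshall(n, edges):
    """All-pairs shortest path lengths on vertices 0..n-1. edges is a list of
    (u, v, length) triples; negative lengths are fine, negative cycles are not.
    After iteration k, A[i][j] holds the shortest i -> j length using only
    vertices 0..k as internal nodes (the lemma's subproblem)."""
    INF = float('inf')
    A = [[INF] * n for _ in range(n)]
    for v in range(n):
        A[v][v] = 0                       # base case: the empty path
    for u, v, length in edges:
        A[u][v] = min(A[u][v], length)    # base case: keep cheapest direct edge
    for k in range(n):                    # grow the prefix of allowed internal nodes
        for i in range(n):
            for j in range(n):
                # Case 1: best path avoids k -> A[i][j] unchanged.
                # Case 2: path uses k once -> splits into P1 (i->k) + P2 (k->j).
                A[i][j] = min(A[i][j], A[i][k] + A[k][j])
    return A
```

Note how directly the loop body mirrors the lemma: the two arguments to `min` are exactly case one and case two, and the outer loop over k is the ordering on subproblems that the prefix idea buys us.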

Â