0:00

In this video and the next, we're going to revisit the famous traveling salesman problem. When we first talked about the TSP, it was bad news: the context was NP-completeness. We discussed Edmonds' conjecture, the widely believed claim that there is no polynomial-time algorithm for solving the TSP. But let's talk about some good news: the fact that you can do better than naive brute-force search. In fact, it's going to be another neat application of the dynamic programming algorithm design paradigm.

So let me remind you briefly about the traveling salesman problem. The input is very simple: just a complete undirected graph in which each of the n choose 2 edges has a non-negative cost. The responsibility of the algorithm is to figure out the minimum-cost way of visiting each vertex exactly once. That is, you're supposed to output a tour, a permutation of the vertices, that minimizes the sum of the costs of the corresponding n edges. For example, in this four-vertex network, the minimum-cost tour has overall cost 13.

You could of course solve the problem using brute-force search; the running time of brute-force search would be n factorial. This would allow you to solve problems with, say, 12, 13, maybe 14 vertices. In this video and the next, we'll develop a dynamic programming algorithm that solves the TSP.
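To make the brute-force baseline concrete, here is a minimal sketch of exhaustive search over tours. This is my own illustration, not code from the lecture; the cost matrix is a made-up example of a complete graph.

```python
from itertools import permutations

def brute_force_tsp(cost):
    """Try every tour of vertices 0..n-1 and return the cheapest total cost.

    cost[u][v] is the edge cost in a complete undirected graph.
    Fixing vertex 0 as the start skips rotations of the same tour,
    so this tries (n-1)! permutations.
    """
    n = len(cost)
    best = float("inf")
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)  # close the cycle back at vertex 0
        total = sum(cost[tour[k]][tour[k + 1]] for k in range(n))
        best = min(best, total)
    return best

# A made-up symmetric 4-vertex instance:
cost = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
print(brute_force_tsp(cost))  # 7, via the tour 0-1-2-3-0
```

Even fixing the starting vertex, the (n-1)! permutations are what limit this approach to a dozen or so vertices.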

Now, of course, TSP is NP-complete, so we have to temper our expectations: we're not expecting a polynomial-time algorithm. But this dynamic programming algorithm will run quite a bit faster than brute-force search. The running time will be big-O of n^2 times 2^n. 2^n is of course exponential, but it's quite a bit better than n factorial; n factorial is more ballpark n^n. If you look at Stirling's approximation, you'll see it's really n over a constant, raised to the n, but still, that's much, much bigger than a constant 2 raised to the n. To make it more concrete, you could run this dynamic programming algorithm for values of n probably pushing 30 or so.
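To put rough numbers on that comparison, here is a quick sanity check of the arithmetic (my own back-of-the-envelope figures, not from the lecture):

```python
import math

# Brute force near its practical limit, n = 14: about 8.7 * 10^10 tours.
print(math.factorial(14))    # 87178291200

# The dynamic program at n = 30: n^2 * 2^n is under 10^12 elementary steps.
print(30**2 * 2**30)         # 966367641600

# Brute force at n = 30 would be astronomically worse: 30! is about 2.65 * 10^32.
print(math.factorial(30) > 10**32)   # True
```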

So I realize these are still pretty absurdly small problem sizes compared to the size of the arrays that we can sort, or compared to the size of the graphs on which we can compute strongly connected components or shortest paths, but that's how it goes with NP-complete problems: you have to work pretty hard even to solve modest-sized instances. So at least this dynamic programming algorithm proves the point that even for NP-complete problems, there are opportunities to improve over brute-force search, and the programmer toolbox I've already equipped you with is sufficient to make some of those improvements.

The traveling salesman problem in particular has been a benchmark problem for the development of tools in optimization, and people have worked very hard to solve very large instances of it, even with tens of thousands of cities, sometimes over 100,000 cities. But this often represents years of work. As I said earlier, there are some really nice popular accounts of the traveling salesman problem, books out there; check them out if you want to know more about the state of the art for solving large TSP instances.

Since we're going to pursue a dynamic programming approach to the TSP, that means we should be looking for optimal substructure: a way in which an optimal solution must necessarily be composed of an optimal solution to a smaller subproblem, extended in some easy way to the original problem.

So what could that look like for TSP? Of all the problems that we've tackled using dynamic programming, I think the one which seems the most like TSP is single-source shortest paths. So think of a tour not as a cycle exactly, but as a path from some vertex, let's say vertex number 1, looping all the way back to itself, with, of course, the constraint that it should visit each other vertex exactly once en route.

We want to minimize the overall cost of this path from 1 back to itself, and that sounds a lot like wanting to minimize the length of a path from some source to some destination. I'm sure you'll recall that our dynamic programming algorithm for the single-source shortest path problem was Bellman-Ford.

So what did the subproblems look like in the Bellman-Ford algorithm? Well, the cool idea there was to measure subproblem size using an edge budget, so a canonical subproblem in Bellman-Ford was to compute the length of a shortest path, from the given source vertex to some destination vertex v, that has i edges or less.

So, by analogy, we could think here about subproblems where we want the shortest path from the starting vertex, number 1, to some other vertex j, that uses at most i edges. So precisely, let's define subproblems as follows.

Given a choice of i, representing the number of edges you're allowed to use, and a destination j, and assuming the vertices are just named from 1 to n, so that a generic destination j is an integer from 1 to n, let's denote by capital L sub i,j the length of a shortest path from the starting vertex 1 to this destination j using at most i edges. So suppose we try to set up a dynamic programming algorithm using these subproblems.
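As a concrete sketch, these subproblems can be filled in exactly the way Bellman-Ford fills them in. The code below is my own illustration of that recurrence, with vertices indexed from 0 and a hypothetical cost matrix `c`:

```python
INF = float("inf")

def shortest_paths_with_budget(c):
    """L[i][j] = length of a shortest path from vertex 0 to vertex j
    using at most i edges (the Bellman-Ford-style subproblems)."""
    n = len(c)
    # Base case i = 0: with zero edges you can only stay at the source.
    L = [[0 if j == 0 else INF for j in range(n)]]
    for i in range(1, n + 1):
        prev = L[-1]
        row = []
        for j in range(n):
            best = prev[j]  # "at most i" includes "at most i-1": stop early
            for k in range(n):
                # ...or arrive at j via a final hop (k, j).
                if k != j and prev[k] + c[k][j] < best:
                    best = prev[k] + c[k][j]
            row.append(best)
        L.append(row)
    return L

# Triangle with one heavy edge: the 2-edge detour 0-1-2 beats the direct edge.
print(shortest_paths_with_budget([[0, 1, 4], [1, 0, 1], [4, 1, 0]])[2])  # [0, 1, 2]
```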

What do you think? Can we use these to get a polynomial-time algorithm for the TSP? All right, so the correct answer is C.

This proposed collection of subproblems does not yield a polynomial-time algorithm; that is, D is not the correct answer. That would be surprising, right? TSP is an NP-complete problem, so if this gave a polynomial-time algorithm, we could all report directly to the Clay Institute for a million-dollar cash prize. Now, there are a lot of students in this class, but at least we'd each get 20 bucks or so out of the deal. So that's not the case; this is not going to be a polynomial-time algorithm.

What's the problem? Well, the problem is also not A; it's not that there are too many subproblems. We only have O(n) choices for i and O(n) choices for j, so there's only a quadratic number of subproblems. Just like in Bellman-Ford, that's totally reasonable. We also can correctly compute the value of larger subproblems from smaller ones; it's really exactly the same as what we were doing in Bellman-Ford. The problem is that our semantics are incorrect: by solving these subproblems, we don't actually get an optimal solution to the traveling salesman instance that we started with.

So what were we hoping was going to be the case? Well, in our dynamic programming algorithms thus far, after we solved all of the subproblems, the answer to the original question was just lying there waiting for us in the biggest subproblem. Here, the biggest subproblem would correspond to i equaling n, where you're allowed to use up to n edges in your path. The issue with these subproblems is that they only specify an upper bound i on the number of edges you're allowed to use; they do not enforce that you have to use your full budget of i edges. So when we look at these biggest subproblems, sure, the shortest paths could use as many as n edges if they wanted, but in general they won't. They'll use many fewer than n edges; they'll skip tons of the vertices, and that's not a traveling salesman tour. A traveling salesman tour has to visit every vertex, and that is not enforced by these subproblems.

But that problem doesn't seem so hard to fix. Let's just insist, in each subproblem, that instead of using at most i edges, the shortest path has to use exactly i edges.

While this set of subproblems is not quite as misguided as the previous one, the answer remains the same: it's still C. Of course, we still don't get a polynomial-time algorithm; we're not claiming to prove P = NP. The number of subproblems hasn't changed; it's still quadratic. And if you think about it, you can still efficiently solve bigger subproblems, those with a bigger edge budget, given the solutions to all of the smaller subproblems. So what's the issue? The issue is that our semantics are still incorrect: just because you solve all of these subproblems doesn't mean you can extract from them the cost of a minimum-cost traveling salesman tour.

The problem, briefly, is that we don't enforce the constraint that you can't visit a vertex more than once. What were we hoping would be true? We were hoping that when we look at the biggest subproblem, that is, when i is equal to n, so we're looking at paths that have exactly n edges, and when we take the destination j to be 1, the same as the origin, that shortest path with n edges from 1 back to itself would be a tour, and therefore the minimum-cost traveling salesman tour.

But that need not be the case. Just because this path has n edges and it starts at 1 and ends at 1 doesn't mean it's a tour. It might, for example, visit vertices 7 and 23 twice, and as a result never visit vertices 14 or 29 at all. For this reason, the number computed by the largest subproblem, when i = n and j = 1, can be much, much smaller than the true answer for the minimum-cost traveling salesman tour.

A good algorithm designer is nothing if not tenacious. So let's just recognize the flaw in this proposal, that we're not enforcing the fact that you can't have repeat visits to a vertex, and let's change the subproblems again to explicitly disallow repeated visits to a vertex. So precisely, let's index the subproblems in exactly the same way as before: there's going to be one for each budget i and one for each destination j. For a given i and a given j, the value of the subproblem is now defined as the length of a shortest path beginning at 1, ending at j, using exactly i edges, with the additional constraint that no repeated vertices are allowed.

The one exception is that if j = 1, then of course you're allowed to have 1 at the beginning and 1 at the end; but for the internal vertices of the path, we do not allow repeats. So what do you think? Does this refined collection of subproblems allow us to get a polynomial-time algorithm for the traveling salesman problem?

So today's quiz is more subtle than the past couple, and the correct answer has switched from C to B.

It's still the case that there is a quadratic number of subproblems, and it's still the case that we don't expect a polynomial-time algorithm, but with this different definition of subproblems, they do capture the TSP. Specifically, look at the biggest subproblem: take i equal to n and j equal to 1. The responsibility of that subproblem is to compute the shortest path from 1 to 1 that has exactly n edges and no vertices repeated internally. That is exactly the original problem that we started with; that is exactly TSP.

The issue is that you cannot efficiently solve larger subproblems, those with a larger edge budget, given the solutions to the smaller subproblems. The smaller subproblems are just not very useful for solving the bigger ones, and the reason is a little bit subtle.

Now, the hope would be that, like in all our previous dynamic programming algorithms, you can just formulate a recurrence which tells you how to solve your subproblems given the solutions to smaller ones. And there's even a natural guess for what you'd hope the recurrence might be.

Recurrences generally follow from a thought experiment about what optimal solutions have to look like. So you focus on a particular subproblem, with a given destination j and a given budget i on the number of edges, and you say: well, let's think about an optimal solution. What is that? It's a path that starts at 1, ends at j, has exactly i edges, has no repeated vertices, and, amongst all such paths, has minimum length. And it's natural to proceed by analogy with Bellman-Ford and say, well, wouldn't it be cool if a little birdie told me what the penultimate vertex k was on the shortest path from 1 to j.

If I knew that the second-to-last vertex on the shortest path was k, well then, surely, the length of this path would be the length of an optimal path from 1 to k, using exactly i-1 edges and no repeated vertices, with the final hop (k, j) concatenated at the end. Now, of course, you don't know what the second-to-last vertex k is. But no big deal.

As usual in dynamic programming, we're just going to try all of the possibilities. So we can encode this brute-force search as a minimum over the sensible choices for k. Clearly, k should not be equal to j; it should be some other vertex that precedes j. And, ignoring some base cases, let's also preclude k from being 1, the starting vertex.

And of course, pictorially, what we have in mind is taking some shortest path from 1 to k, which we'd look up, previously computed in some smaller subproblem, and then concatenating onto it a final hop that goes from k to j.
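Written out, the hoped-for recurrence from this thought experiment would read as follows (my transcription of the verbal description, with c_{kj} denoting the cost of edge (k, j)):

```latex
L_{i,j} = \min_{k \neq 1,\; k \neq j} \left( L_{i-1,k} + c_{kj} \right)
```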

Well, hey, that sounds pretty good, right? It kind of sounds like maybe this is the key ingredient of a polynomial-time algorithm for TSP. So what's the problem? Well, the problem is that we've defined our subproblems to disallow repeated visits to a vertex. So when we compute a subproblem, we'd better make sure we're respecting that constraint, that no repeated visits are allowed.

And if what we have in mind is concatenating a shortest path from 1 to k with no repeats with this final hop from k to j, well, this might result in a repeated visit to j. In particular, for all we know, the shortest path from 1 to k that uses exactly i-1 edges and has no repeated vertices might well go through j, in which case concatenating the final hop from k to j results in a second visit to j.
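To see this failure concretely, here is a small self-contained experiment (my own construction, not from the lecture). On a hypothetical 4-vertex instance where vertex 3 is expensive to reach, the candidate recurrence settles on the walk 0-1-2-1-0, which revisits vertex 1 and never visits vertex 3 at all, so it reports a value far below the true tour cost:

```python
from itertools import permutations

INF = float("inf")

# Hypothetical instance: vertices 0, 1, 2 are mutually cheap to connect,
# while vertex 3 costs 100 to enter or leave.
c = [
    [0,   1,   1,   100],
    [1,   0,   1,   100],
    [1,   1,   0,   100],
    [100, 100, 100, 0],
]
n = len(c)

def true_tsp():
    """Correct answer by exhaustive search over genuine tours."""
    best = INF
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        best = min(best, sum(c[tour[k]][tour[k + 1]] for k in range(n)))
    return best

def flawed_dp():
    """The natural-but-wrong recurrence: L[i][j] = min over k (not the
    start, not j) of L[i-1][k] + c[k][j]. Nothing stops the implied path
    from revisiting a vertex, so it really optimizes over walks, not tours."""
    L = [INF] + [c[0][j] for j in range(1, n)]   # i = 1 base case
    for i in range(2, n + 1):
        L = [min(L[k] + c[k][j] for k in range(1, n) if k != j)
             for j in range(n)]
    return L[0]   # i = n edges, destination back at the start

print(true_tsp())    # 202: every genuine tour must enter and leave vertex 3
print(flawed_dp())   # 4: the walk 0-1-2-1-0 skips vertex 3 entirely
```

The gap between 4 and 202 is exactly the semantic bug described above: the recurrence's smaller subproblems don't carry enough information to rule out a second visit to a vertex.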

Â