Let's turn to the analysis of our dynamic programming based heuristic for the knapsack problem. In particular, let's understand precisely how the running time of this heuristic depends on the desired accuracy epsilon. Let me jog your memory: what are the two steps of our dynamic programming based heuristic? The first thing we do is massage the item values. We round them so that they are relatively small integers. Precisely, for each item i, we define v hat i to be what you get by dividing the item value vi by m and then rounding down to the nearest integer. In step 2 we invoke our second dynamic programming solution for the knapsack problem, and we pass to it these new values, the v hats, as well as the original item sizes and the original knapsack capacity. This algorithm, as stated, is underdetermined, because I haven't told you how to set the parameter m. We do at least understand qualitatively the role of m and how it influences the accuracy running time tradeoff. Specifically, the bigger we set m, the more information we lose when we round the original values, the vi's, to the v hat i's. More loss of information should translate to less accuracy: larger m, less accuracy. On the other hand, the bigger we take m, the smaller the transformed item values, the v hats. Remember, the running time of our second dynamic programming algorithm is proportional to the largest item value, so smaller item values translate to faster running times. Summarizing: the bigger the m, the less the accuracy, but the faster the algorithm. Here's our plan for the analysis. We begin by studying the accuracy constraint. Remember the setup: the client gives us a positive parameter epsilon, and it is our responsibility to output a feasible solution whose total value is at least 1 minus epsilon times that of an optimal solution. This constraint on our algorithm's accuracy should translate to an upper bound on how large a value of m we can get away with.
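As a concrete sketch of step 1, the rounding could look like this in Python (the function name and this rendering are my own, not from the lecture):

```python
def round_values(values, m):
    """Step 1 of the heuristic: divide each original item value by the
    scaling parameter m and round down to the nearest integer."""
    return [int(v // m) for v in values]

# For example, with m = 10 the value 47 becomes floor(47 / 10) = 4.
```

The bigger the m, the smaller the integers this produces, and the faster the step 2 dynamic program runs.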
Remember, as we jack up m, we lose accuracy. So the first question we're going to study is: how large an m can we get away with while still being guaranteed to compute a solution with value at least 1 minus epsilon times that of an optimal one? Once we solve this problem and understand how large a value of m we can get away with, that is, how aggressive we're permitted to be with the scaling, we'll then evaluate the running time of the resulting algorithm for that particular choice of m. To answer this first question, how big we can take m subject to an accuracy constraint of 1 minus epsilon, it's important to understand in detail how the rounded item values, the v hats, compare to the original item values, the vi's. That's what this quiz is meant to ask you to think about. Remember how we obtain v hat i from vi: first we decrease vi to the nearest multiple of m below it, then we divide by m. So we can think of m times v hat i as what we have just after we've rounded vi down to the nearest multiple of m, and before we actually divide by m. So how did we get to this m times v hat i? We decreased vi down to the nearest multiple of m, so we decreased it by somewhere between 0 and m. That leaves m times v hat i somewhere between vi minus m and vi. So let's now assemble a few preliminaries crucial for the accuracy analysis. What did we learn from the quiz? We learned that for each item i, the rounded value v hat i, scaled up by m, so intuitively this is after we've rounded vi down to the nearest multiple of m but before we divide, lies between vi as an upper bound and vi minus m as a lower bound. Let me extract the two different inequalities here: one which upper bounds m times v hat i by vi, and one which lower bounds m times v hat i by vi minus m. So there are two inequalities which are immediate from this fact; let me go ahead and give them the names 1 and 2. We'll put these to use on the next slide.
So first of all, the value vi is lower bounded by m times the rounded value v hat i. And second, m times v hat i can be lower bounded by vi after we subtract m from it. We'll call those inequalities 1 and 2. These first two inequalities are important for us, but they don't have a lot of content; they're immediate from the definition of step 1 of our algorithm, from the way we do our rounding. But step 2 of our algorithm gives us a third inequality which is much more powerful, and it says: look, in the second step we solve a knapsack instance optimally. It's true we solve it for the wrong values, the v hats, but for these rounded values the solution we come up with is better than any other. In particular, if we measure value using the rounded values, the v hats, then the solution S output by our heuristic is even better than the solution S star that is optimal for the original problem, optimal for the vi's. That is, if we sum up the v hat i's of the items in our solution S, and we compare it to the sum of the v hat i's in any other feasible solution, in particular the solution S star optimal for the original instance, we come out ahead: our sum of values is at least as big. That's going to be our inequality number 3. These three inequalities are pretty much all we've got going for us, but fortunately they are sufficient to answer the question we asked: how large an m can we get away with, subject to guaranteeing an accuracy of 1 minus epsilon? To see this, let's just follow our nose and see what happens when we chain these three inequalities together. The third one has the most content; it really says we're optimally solving the problem with the transformed values, the v hats. So let's start out by writing down, yet again, inequality number 3.
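To make inequalities 1 and 2 concrete, here is a small numerical sanity check (a hypothetical helper of my own, not part of the lecture's algorithm):

```python
def check_rounding_bounds(values, m):
    """Verify, for each original value v, that m * v_hat lies between
    v - m and v: inequality 1 says m * v_hat <= v, and inequality 2
    says m * v_hat >= v - m."""
    for v in values:
        v_hat = v // m                 # the step 1 rounding
        assert m * v_hat <= v          # inequality 1
        assert m * v_hat >= v - m      # inequality 2
    return True
```

Both bounds hold for any positive m, simply because rounding down to a multiple of m decreases a number by at least 0 and less than m.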
So inequality 3 just says that the sum of the transformed values, the sum of the v hats, over our heuristic solution capital S is at least as large as that of any other feasible solution, in particular that of the solution S star, optimal for the original problem with the original item values. Inequality 3 is a fine starting point. Fundamentally, what we're trying to do here is lower bound the performance of the solution S spit out by our heuristic, and that's what this inequality does. The one problem is that it's justifying the performance of S in terms of the wrong data, in terms of the transformed values, the v hats, whereas what we really care about are the original values, the vi's. So we need to take this inequality 3 as a seed, and grow from it a chain of inequalities, rewriting on both the left-hand side and the right-hand side these transformed values, the v hats, in terms of the original values, the vi's. We'll use inequalities 1 and 2 to do that. As we grow the chain to the left, we want quantities that only get bigger and bigger. As we grow the chain to the right, we want quantities that only get smaller and smaller. Hopefully, when the dust settles, we'll have an approximate optimality result for our heuristic solution S. Now, if you look back at inequalities 1 and 2, you'll see that both are in terms of the quantity m times v hat i, so it's going to be convenient to multiply both sides of inequality 3 by m; of course, it's still valid if I do that. To extend the chain of inequalities on the left-hand side, we need to tack on a quantity which is only bigger. That is, we need to upper bound m times v hat i in terms of the original value vi. But inequality 1 says exactly that: m times v hat i is never bigger than the original value vi. So we simply apply inequality 1 to each item in the heuristic solution capital S, and we get an upper bound of the sum, over the items in our solution S, of the original value of each item.
So that's great: we get a lower bound on the performance of our solution, and that's exactly what we want. To grow the chain of inequalities to the right, we need a quantity which is only smaller; that is, we want to lower bound m times v hat i in terms of some function of the original value vi. Now, vi can be bigger than m times v hat i; remember, we rounded it down. But inequality 2 says it can only be bigger by at most m. So if we subtract m from vi, we get a lower bound on m times v hat i. We just apply that, term by term, to the optimal solution S star. So just by following our nose in this way, we get exactly what we want. On the left-hand side we get what we really care about, namely the total value of the solution output by our heuristic, and this is the total value with the original values, mind you, not with the transformed ones. We get a lower bound on that total value, exactly what we wanted, in terms of the best-case scenario, the total value of the optimal solution. The only blemish is that we pick up an error of at most minus m for each item i in the optimal solution S star. So let's just lower bound this crudely: the optimal solution S star can have at most n items, and we pick up a minus m at worst for each of these at most n items. So if we subtract an error term of m times n, that gives us a lower bound on the value of our solution S. Now that the dust has cleared, we see that we have a performance guarantee for our solution S in terms of that of the optimal solution S star: basically, we're off by this error term of m times n. Remember, m is the parameter that governs how aggressively we scale the item values, and it's controlling the running time accuracy trade-off. We were expecting this all along; we argued that the bigger you take m, the more information you lose.
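Written out symbolically, the chain of inequalities just described is (with S the heuristic's solution, S star an optimal solution, and n the number of items):

```latex
\sum_{i \in S} v_i
  \;\overset{(1)}{\ge}\; m \sum_{i \in S} \hat{v}_i
  \;\overset{(3)}{\ge}\; m \sum_{i \in S^*} \hat{v}_i
  \;\overset{(2)}{\ge}\; \sum_{i \in S^*} (v_i - m)
  \;\ge\; \sum_{i \in S^*} v_i \;-\; m n .
```

The label over each inequality indicates which of the three facts justifies that step.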
So the less accuracy you'd expect, and indeed, the amount by which we fall short of the optimal solution grows linearly in this parameter m. The question we're striving to answer, remember, is how big an m we can get away with and still be sure that our solution is within a 1 minus epsilon factor of the optimal one. To achieve this, we just need to make sure that the worst-case error in our solution, m times n, is never bigger than an epsilon fraction of the optimal solution's value; that is, we just set m small enough that m times n is at most epsilon times the optimal value. So this inequality is meant to tell us how to set the parameter m in our algorithm. But you'd be right to complain that on the right-hand side there's a quantity we don't actually know, right? We don't know the value of the optimal solution S star; that, after all, is what we're trying to compute, at least approximately, in the first place. So instead, what we're going to do is use a crude lower bound on the optimal solution's value. Let's make a trivial assumption: each item has size at most the knapsack capacity, that is, each item fits by itself inside the knapsack. Any item that fails this test can obviously be deleted in a preprocessing step. Then the optimal solution, if nothing else, is at least as large as the value of the maximum-value item, v max. So to satisfy the desired accuracy constraint, all we need to do is set m small enough that m times n is at most epsilon times the optimal value; of course, it's sufficient to set m even smaller, so that m times n is no more than our even smaller lower bound. So in our algorithm we just set m to be the number that equalizes m times n, the left-hand side, with our lower bound on the right-hand side, namely epsilon times v max. Notice these are all parameters known to the algorithm.
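In code, the choice of m just described might look like this (the function name is my own; it assumes items too big for the knapsack have already been deleted in preprocessing):

```python
def choose_m(values, eps):
    """Set m so that the worst-case error m * n equals eps * v_max,
    our lower bound on the optimal solution's value."""
    return eps * max(values) / len(values)

# With this choice, m * n = eps * v_max <= eps * OPT, as required.
```

Since every item fits in the knapsack by itself, the optimal value is at least v max, so this m satisfies the accuracy constraint.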
It knows, of course, how many items n there are; it knows epsilon, the parameter supplied to the algorithm by the client; and it's easy to compute v max, the maximum item value. So the algorithm is in a position to set m so that this equality holds, that is, to set m equal to epsilon times v max over n. That finally completes the description of the dynamic programming based heuristic. So we've now completed the answer to the first question we asked: how large an m can we get away with, subject to the accuracy constraint of 1 minus epsilon? We can get away with m as large as epsilon times the largest item value divided by n. And now I hope that everybody is holding their breath and crossing their fingers just a little bit, because remember, m not only governs the accuracy, it also governs the running time. The more aggressively we can scale the numbers, the larger we can take m, the faster the algorithm, and the question now is: are we permitted to take m large enough that the resulting heuristic's running time is polynomial? The running time of the heuristic is certainly dominated by step 2, where we invoke our dynamic programming algorithm, and the running time there is n squared times the maximum item value passed to that subroutine, that is, the maximum scaled value, the maximum v hat i. So the big question is: how big can these scaled item values, the v hat i's, be? Pick your favorite item i; how big can its transformed value v hat i be? Well, how do we get v hat i? We take the original value vi, we round it down, and we divide by m. So the biggest v hat i could be is the original value vi divided by m, which is at most the maximum original value v max divided by m. Now we plug in our selection for m, namely epsilon times v max over n. Very happily, the two v maxes cancel, leaving us with just n over epsilon.
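Putting the two steps together, here is a sketch of the whole heuristic in Python (my own rendering, not the lecture's code; the value-indexed dynamic program below is one standard way to implement the step 2 subroutine):

```python
def knapsack_max_value(values, sizes, capacity):
    """Exact 0/1 knapsack via the value-indexed dynamic program:
    A[i][x] = minimum total size of a subset of the first i items
    with total value exactly x (infinity if unachievable)."""
    n, total = len(values), sum(values)
    INF = float('inf')
    A = [[INF] * (total + 1) for _ in range(n + 1)]
    A[0][0] = 0
    for i in range(1, n + 1):
        v, s = values[i - 1], sizes[i - 1]
        for x in range(total + 1):
            A[i][x] = A[i - 1][x]                      # skip item i
            if x >= v and A[i - 1][x - v] + s < A[i][x]:
                A[i][x] = A[i - 1][x - v] + s          # take item i
    # Largest value achievable within the knapsack capacity.
    best = max(x for x in range(total + 1) if A[n][x] <= capacity)
    # Trace back which items realize that value.
    chosen, x = [], best
    for i in range(n, 0, -1):
        if A[i][x] != A[i - 1][x]:
            chosen.append(i - 1)
            x -= values[i - 1]
    return chosen

def knapsack_fptas(values, sizes, capacity, eps):
    """Step 1: round each value down using m = eps * v_max / n.
    Step 2: solve the rounded instance exactly."""
    m = eps * max(values) / len(values)
    v_hat = [int(v // m) for v in values]
    return knapsack_max_value(v_hat, sizes, capacity)
```

For instance, on values = [60, 100, 120], sizes = [10, 20, 30], capacity = 50 with eps = 0.1, we get m = 4, the rounded values are [15, 25, 30], and the heuristic selects items 1 and 2, which here is also the exact optimum.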
Plugging in n over epsilon as an upper bound on every transformed value passed to the step 2 dynamic programming subroutine, we get an overall running time of n cubed over epsilon. The running time does degrade with epsilon: the smaller the epsilon, the bigger the running time. Of course, you would expect that with an NP-complete problem. As you take epsilon smaller and smaller, you need to take m smaller and smaller to get the accuracy guarantee. That results in less aggressive scaling of the item values; in turn, bigger transformed values get passed to the dynamic programming subroutine, resulting in a bigger running time. Nevertheless, this does show that the full continuum of running time accuracy trade-offs is possible for the knapsack problem. You want an arbitrarily close approximation for this problem? You've got it.