Welcome to Eindhoven University of Technology and welcome to this MOOC on approximation algorithms. My name is Mark de Berg and I will be your teacher in this course. Approximation algorithms form an exciting topic in algorithms theory, and in this MOOC you will learn some of the basic techniques and concepts from this area. So let's get started. In our first lesson on approximation algorithms, we're going to discuss some of the basic concepts: in particular, what an optimization problem is, and why we actually need approximation algorithms to solve optimization problems. Then we're going to formally define what a Rho-approximation is. This Rho is the approximation ratio, which will be a very important concept in the analysis of approximation algorithms.

So let's first look at an example. Suppose you have a server farm with a lot of compute servers, and you get a number of computational tasks that you have to schedule on these machines. Of course, you want to do this in such a way that the time at which the last job, the last computational task, is finished is as early as possible.

Let's look at a different problem. Suppose you go hiking. You don't want to carry more than 20 kilograms in your backpack, and there are many items that you may want to take on your trip. So what you can do is assign a value to each of these items: how much is it worth to you to take it on your trip, how valuable is it for you? Now you want to select a subset of these items such that the total weight of the subset is at most 20 kilograms and the total value of these items is maximized.

So these are two optimization problems. In general, in an optimization problem you have many different feasible solutions. For instance, for the backpack problem, any subset of items whose total weight is at most 20 kilograms would be a valid solution.
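The two problems can be made concrete with a small Python sketch; the function names, machine names, and item weights and values here are illustrative, not taken from the lecture.

```python
# Sketch of the two optimization problems from the lecture.
# All names and numbers are illustrative.

def makespan(assignment):
    """Time at which the last machine finishes.
    `assignment` maps each machine to the list of job lengths it runs."""
    return max(sum(jobs) for jobs in assignment.values())

def backpack_value(items, chosen, capacity=20):
    """Total value of the chosen items, or None if they exceed the weight limit.
    `items` maps an item name to a (weight, value) pair."""
    weight = sum(items[name][0] for name in chosen)
    if weight > capacity:
        return None  # not a feasible solution
    return sum(items[name][1] for name in chosen)

# Load balancing: minimize the makespan.
print(makespan({"m1": [3, 5], "m2": [4, 4]}))  # -> 8

# Hiking/knapsack: maximize value subject to the 20 kg limit.
items = {"tent": (8, 10), "stove": (3, 6), "camera": (1, 9), "books": (12, 4)}
print(backpack_value(items, {"tent", "stove", "camera"}))  # -> 25
```

Any assignment of jobs to machines is a feasible solution of the first problem, and any subset of items within the weight limit is a feasible solution of the second; the optimization problem is to find the feasible solution with the best value.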
Out of all these valid solutions you now want to take the best one. That means that each of these valid solutions has a value associated with it, and you want to find the one with the optimal value. That can either be the solution with minimal value, which would be the case if the value represents some kind of cost that you have to pay, or the solution with maximal value, in case the value represents some kind of profit. In practice there are lots and lots of problems of this type: optimization problems, where you want to find either a solution that minimizes your cost or a solution that maximizes your profit.

Unfortunately, many of these problems are very difficult; in technical terms, they are NP-hard. In this course we're not going to look formally at what NP-hard means. For us it's enough to know that if a problem is NP-hard, then, assuming P is not equal to NP, which is something that almost all computer science researchers believe to be the case, you cannot get a fast algorithm. More precisely, it's impossible to come up with an algorithm that always computes the optimal solution and does so in polynomial time.

So if it's not possible to find an optimal solution in polynomial time, what can we do? Well, the first thing we could do is develop what is called a heuristic. This is simply an algorithm that gives you a valid solution whose value is, hopefully, close to the optimal value. You could implement it, test it, and observe that in practice it performs pretty well most of the time. What we're interested in in this course is something where we can prove something about the quality of our solutions, and that's where the concept of an approximation algorithm comes in. An approximation algorithm is an algorithm that comes with a guarantee on how close the value of its solution is to the optimal value. So let's try to define this a little more formally.
So an algorithm, let's call it ALG, is an approximation algorithm if the following holds. Let's define ALG(I) to be the value of the solution that the algorithm computes when it gets an input I, and let's say that OPT(I) is the optimal value for this particular instance. Because our algorithm is an approximation algorithm, ALG(I) is typically not going to be the same as OPT(I); it's going to be a little bit worse. So how can we quantify how good it is? That's where the notion of Rho-approximation comes in. ALG is a Rho-approximation algorithm, where Rho is just a number, say a 2-approximation or a 1.5-approximation, if the following holds. The definition I'm giving now is for minimization problems; later we're also going to look at what the definition would be for maximization. For minimization the criterion is that for any possible input instance I, the value of the solution that we compute is at most Rho times the value of the optimal solution: ALG(I) <= Rho * OPT(I). This value Rho is called the approximation ratio.

Obviously, the approximation ratio for a minimization problem can never be smaller than one, because it's impossible to compute a solution whose cost is smaller than the minimum possible cost. So this Rho is at least one, and if it were exactly equal to one, you would always get the optimal solution. So for an approximation, this Rho would be bigger than one, and hopefully close to one; in that case we have a very good approximation.

So let's look at an example. In the load balancing problem, where we have to distribute jobs over machines, what is a 2-approximation algorithm? Well, it's an algorithm that computes a valid distribution, so it assigns every job to one of the machines, in such a way that the total time until all jobs are finished is at most two times the optimal, the minimal possible time.
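A classic algorithm that actually achieves such a guarantee for load balancing is greedy list scheduling (Graham's rule): assign each job, in turn, to the machine that currently has the smallest load. Graham showed the resulting makespan is at most (2 - 1/m) times the optimum on m machines, so in particular it is a 2-approximation. A minimal sketch, with job lengths made up for illustration:

```python
import heapq

def list_schedule(jobs, m):
    """Greedy list scheduling: assign each job to the currently
    least-loaded of the m machines. The resulting makespan is at
    most (2 - 1/m) * OPT, so this is a 2-approximation."""
    loads = [(0, i) for i in range(m)]  # (current load, machine index)
    heapq.heapify(loads)
    assignment = [[] for _ in range(m)]
    for job in jobs:
        load, i = heapq.heappop(loads)
        assignment[i].append(job)
        heapq.heappush(loads, (load + job, i))
    return assignment, max(load for load, _ in loads)

jobs = [2, 3, 4, 6, 2, 2]
assignment, ms = list_schedule(jobs, 2)

# Two easy lower bounds on OPT: the largest single job, and the
# average load per machine. The computed makespan must respect the
# 2-approximation guarantee relative to these bounds.
lower_bound = max(max(jobs), sum(jobs) / 2)
assert ms <= 2 * lower_bound
```

Note how the quality guarantee is checked here: since OPT is hard to compute, one compares against easy lower bounds on OPT. This is also how such guarantees are typically proved, a pattern that returns throughout the course.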
If, instead of a 2-approximation, we had a 1.1-approximation, this would be even better, because then the total time it takes is at most 10 percent more than that of the optimal solution.

Let's do a little quiz to see if you understand the definition of Rho-approximation. Suppose there's a professor, Professor Smart, who has designed an approximation algorithm for load balancing, and he claims that he has proved that this is a 2-approximation. Now there is a student who says: well, this algorithm is a 1.5-approximation. So is it the case that if the professor has really proved that this is a 2-approximation, the student must be wrong? That is, if it's a 2-approximation, it cannot be a 1.5-approximation. Or could it be that both are right? That the professor has proved it is a 2-approximation, but the student who claims that it's a 1.5-approximation is also right, so that the algorithm is a 2-approximation and a 1.5-approximation at the same time. Think a little bit and try to remember the definition of Rho-approximation.

The solution in this case is answer B: it could be that both the student and the professor are right. Why is this the case? Well, remember that the definition of Rho-approximation says that the value of the algorithm's solution should be at most Rho times the optimal. So if it's a 2-approximation, then for any possible input instance, the value of the solution is at most two times the optimal value. But if it is at most two times the optimal value, it could still be that it's always at most 1.5 times the optimal value, because obviously if it's at most 1.5 times the optimal, it's also at most two times the optimal. So it's very well possible that both the student and the professor are right.

Now that we understand the definition of Rho-approximation for minimization problems, let's have a look at maximization. For maximization, what do we have to change in the definition?
Well, for maximization we want to compute something that has the largest possible value. So now, instead of requiring that ALG(I) is at most Rho times OPT(I), we require that it is at least Rho times OPT(I). That also means that this value Rho, which for a minimization problem is bigger than one, is now smaller than one. Let's look at an example. In our knapsack problem, where we want to select a subset of the items of maximum value such that the total weight is at most 20 kilograms, what is a 0.5-approximation? Well, it's an algorithm that computes a solution whose value is always at least 50 percent of the maximum possible value. If we had a 0.9-approximation, however, then we would have something that is always at least 90 percent of the optimal value, and obviously this would be much nicer.

Okay, so let's summarize. We've seen what optimization problems are and why we actually need to work with approximation algorithms: because most of the time these optimization problems are NP-hard. And we've defined the concept of Rho-approximation, which formalizes how close to the optimal value the solution that our algorithm computes is guaranteed to be. In the next lessons, we're going to see some examples of approximation algorithms, we're going to see how to analyze them, look at some special techniques that we can use to design them, and so on.
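One well-known way to get a 0.5-approximation for the knapsack problem is the classic modified greedy algorithm: fill the knapsack by value-per-weight density, and then return the better of that packing and the single most valuable item that fits. A minimal sketch, with weights and values made up for illustration:

```python
def half_approx_knapsack(items, capacity):
    """Modified greedy for 0-1 knapsack: pack items in order of
    value density, then return the better of the greedy pack and
    the single most valuable item that fits. The classic analysis
    shows this is a 0.5-approximation.
    `items` is a list of (weight, value) pairs."""
    fitting = [(w, v) for w, v in items if w <= capacity]
    # Greedy by value-per-kilogram density, highest first.
    greedy_value, remaining = 0, capacity
    for w, v in sorted(fitting, key=lambda wv: wv[1] / wv[0], reverse=True):
        if w <= remaining:
            greedy_value += v
            remaining -= w
    best_single = max((v for _, v in fitting), default=0)
    return max(greedy_value, best_single)

# Example with a 20 kg limit, as in the hiking problem.
items = [(8, 10), (3, 6), (1, 9), (12, 4), (15, 14)]
print(half_approx_knapsack(items, 20))  # -> 25
```

On this instance the optimal value is 29 (items of weight 1, 3, and 15), and the algorithm returns 25, comfortably within the 50-percent guarantee. Taking the maximum with the best single item is what makes the guarantee work: plain density-greedy alone can be arbitrarily bad.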