In fact the order of growth classifications are so important they've led to enormous amount of research in recent years and just talk briefly about that now. So there is, life is a little bit more complicated than pointed out in the last example and one problem is that the inputs can cause the performance of the algorithm to vary widely. So often we have to think about different ways of analyzing the algorithm depending on the input. So, the running time is going to be somewhere between the best case and the worst case. Best case is the lower bound on cost it. It provides something that the running time is going to be bigger than that always or not less than that and then there's the worst case which is the most difficult input. If we analyze that then we can guarantee that the running time in the algorithms not going to be bigger than that. And then in a lot of situations we might consider our input to be random. Well we need to, someway to model, what we mean by random for the problem that we're solving but there is a lot of situations where we can do that and then we have a way to predict performance even when the input might vary widely. So for example for 3-sum, it's kind of always the same. With the tilde notation, the only variability in that algorithm is the number of times the counter is incremented and that's in low order terms so it doesn't need to chew up in our analysis. For binary search it's, you might find the thing right away in which case is constant time and we can show that the average and the worst case are both lg based two(N). There's other, in another examples that be much more variability even. So, we have this different types of analysis depending on the input. And but the question is, what about the actual problem that the client is trying to solve? So we have to understand that two in order to be able to understand performance of the algorithm. And there's two approaches that are, or successful in this. One is to design for the worst case. Just to make sure that your algorithm are, always runs quickly and that's definitely ideal. Another is to, if you can't do that is to randomize and then depend on some kind of probabilistic guarantee and we'll see examples of both of these as we go through the course. Now, those kinds of considerations, you know the idea of order of growth leads to discussion of, what's called, what I call the theory of algorithms. And here our goals are, we have a problem to solve like solve the 3-sum problem and we want to know how difficult it is. We want to find the best algorithm for solving that problem. The approach that the computer scientist use for this is to try to suppress as many details as possible in the analysis. And so just analyze the running time to or within a constant factor. That's what order of growth is getting at and also I want to, not worry about the input model at all. And so we focused on worst case design and we can talk about performance of algorithms just in turn of the order of growth and it's actually possible, it's actually possible to do that in a very rigorous way that it's taught us a lot about the difficulty of solving problems. And our goal is to find an optimal algorithm where we can guarantee to within a constant factor certain performance for any input cuz we discovered the worst case but we also can have approved that didn't know algorithm could provide a better performance guarantee. I'll give a couple of easy examples of this. Now in order to do this they're, these commonly used notations called the big theta, big O and big omega notations. So the and those definitions are given here. So big theta notation is just the way to describe the order of growth. Theta(N)^2 is kind of short hand for anything N^2. It's bounded above and below by constant time N^2 and that's what we really use to classify algorithms. And then, there is big O notation which is upper bounds on performance. When we say O(N^2), we mean that it's less than some constant time N^2 as N grows. And big omega is used for lower bounds means greater than some constant time N^2 as N grows. So those three notations were able to use to classify algorithms and I'll show them in the following. So, examples from our 1-sum, 2-sum, and 3-sum are easy to articulate so our goals are to establish the difficulty of the problem and to develop an optimal algorithm. So, the 1-sum problem is 00 in the array. Well, an upper bound on the difficulty of the problem is some specific algorithm. So, for example, the brute force algorithm that looked, that looks at every array entry is a specific algorithm and it means that and that takes O(N) time. We have to look at every, it's less than a constant time N for some constant. So, the running time of the optimal algorithm has to be O(N) that is that's specific algorithm provides an upper bound on the running time of the optimal algorithm. And but in this case it's also easy to develop a lower bound, that's a proof that no algorithm can do better. Well, for 1-sum you have to examine all entries in the array. If you miss one, then that one might be zero so that means that the optimal algorithm has to have a running time at least some constant times N where we say the running time is omega of n. Now in this case, the upper bound and the lower bound match. So, doing the constant factor so, that's a proof that the brute force algorithm for 1-sum is optimal. It's running time is theta(N). It's both omega and O(N). That's, for that simple problem it was okay to get the optimal algorithm. For a more complicated problems it's going to be more difficult to get upper balance and lower balance and particularly upper balance and lower balance that match. For example let's look at 3-sum. So, upper bound for 3-sum, say our first brute force algorithm, say that the proof, was a proof that the running time of the optimal algorithm is O(N^3) but we found a better improved algorithm. Whose running time is O(N^2) lg N. So, that's a better upper bound. Lower bound well, we have to examine all entries cuz again, we might miss one that makes 3-sum = zero and that's a proof that the running time in the optimal algorithm is omega(N) but nobody knows higher or lower bound for 3-sum. So there's a gap between the upper bound and the lower bound and open problems. Is there an optimal algorithm for 3-sum? We don't know what it is. We don't even know if there's a algorithm whose running time is < O(N^2) or we don't know higher lower bound and linear. So that's an example of an open problem in the theory of algorithms we don't know how difficult it is to solve the 3-sum problem. Now, this point of view has been extremely successful in recent decades. We have a new problem, develop some algorithm, proves some lower bound. If there's a gap, we look for new algorithm that will lower the upper bound or we try to find a way to raise the lower bound. Usually it's very difficult to prove non-trivial or lower bounds. Trivial or lower bound like look at every input items is not so hard non-trivial lower bounds like for example, the proof that we're talking about for Union-find problem are much more difficult. And in the last several decades people have learned about the computational difficulty of problems by examining steadily decreasing upper bounds so the algorithms were better worst case running times for lots and lots of important problems and plenty of optimal algorithms and plenty of gaps still remain. It's a fascinating field of research that many people are engaged in. Now there is a couple of caveats on this on the context to this course. And the first one is maybe it's overly pessimistic to be focusing on the worst case. We've got data out there. We've got problems to solve. Maybe it's not worst case data and lots of fields of engineering and science. We don't focus on the worst case. The worst case for this course would be lightning to strike and it would be over so we don't plan for that. And since similar it's true for algorithms. Maybe we should be focusing on understanding prope rties of the input and finding algorithms that are efficient for that input. And the other thing is in order to really predict performance and compare algorithms we need to do a closer analysis than to within a constant factor. So we talked about the tilde notation in the big theta, big O, and big omega, omega that are used in the theory of algorithms. And really there's so much published research in the theory of algorithms that a lot of people make the mistake of interpreting the big O results that are supposed to give improved upper bounds on the difficulty of the problem as approximate models for the running time and that's really a mistake. So in this course, we're going to focus on approximate models by, you know making sure that we use the tilde notation and we'll try to give specific results for certain quantities of interest and the constant, any unspecified constant in the running time. We'll have to do with properties in the machine and in the system so they will be able to use these results to predict performance and to compare algorithms.