Running experiments is one thing, but now we're going to look at an even stronger idea: that we can develop mathematical models that describe the performance of our programs, which we can use as a check and as a better way to understand the programs. So the question is, can we write down an accurate formula for the running time of a computer program? In the 1960s, the prevailing wisdom was, "No, it's much too complicated." A computer is an extremely complicated device, and developing such a model would be much, much too difficult. But in the late '60s, Knuth showed that actually, it is something that we can do. He wrote a series of books that did so for many, many important computer algorithms.

The idea is very straightforward. We determine the set of operations that the program uses to get its job done. Then we find the cost of each of those operations. In those days, there was a manual that would tell you exactly how much time each instruction would take; nowadays, there are ways to find that out, depending on the system software and other things. Then we find the frequency of execution of each of the operations, which depends on the input and on the algorithm. Then we find the total cost by just adding up cost times frequency over all the operations. Yes, it can be complicated, but there are also ways to get useful approximations that make it quite feasible and quite useful to develop mathematical models even now, and that's what I'm going to talk about next.

Just as a warmup, let's say we want to solve the 1-sum problem: how many zeros are there in the array? There's just one for loop; we're simply counting the number of zeros in the array. If we break down the operation of this program, the first thing to note is that the number of times one instruction, count++, is executed actually depends on the input. It could be zero times, if there are no zeros, or it could be N times, if the entries are all zeros. That's a complication that, for some programs like this one, we have to take into account.

So, what's a mathematical formula for the total running time of this program? Well, in this case, we break it down into seven operations: the function call/return, variable declaration, assignment statement, the less-than compare, the equal-to compare, array access, and increment. For each of these, we can estimate the actual cost of performing the operation. Now, there's some poetic license here nowadays. Decades ago, you could state these costs with certainty, because computers wouldn't work unless things took exactly that amount of time; now there can be a little more leeway in some of these numbers, but you can still do some study and experimentation to get exact values if you need them. So that's the cost of each operation, and the rightmost column gives how many times each operation is executed: there's one function call/return, there are two variable declarations, and so forth. There are N + 1 less-than compares, N equal-to compares, and N array accesses, and for the number of increments, you have to increment i each time, and you also have to increment count, which can happen between zero and N times, so that one is variable. So, that's the cost and that's the frequency.
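As a concrete reference, here's a minimal sketch of the 1-sum program those counts refer to; the class and method names are my assumptions, not the lecture's exact code, and the comments record the frequencies just described.

```java
public class OneSum {

    // Count how many entries of a[] are zero.
    public static int count(int[] a) {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)   // n+1 less-than compares, n increments of i
            if (a[i] == 0)            // n array accesses, n equal-to compares
                count++;              // runs between 0 and n times, depending on the input
        return count;
    }
}
```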
So, the total running time is found by just multiplying cost by frequency for each operation and adding them all up, and the result is that we get cN + 26.5, where c is a number between 2 and 2.5, depending on the input. If we postulate that zeros are rare in the input, we could say it's about 2N + 26.5. That's an example of Knuth's methodology: determine the cost of each operation, determine the frequency of execution of each operation, then blend them together by multiplying and adding them up.

Now let's look at a slightly more complicated example, 2-sum. Now we want to find whether there are pairs of numbers that sum to zero, so we have a doubly nested for loop. This is a problem we know how to solve much faster, but again, our topic is studying the programs, not finding the fastest way to solve the problem. So, let's try to get a formula for the total running time of this 2-sum program. Again, it's the same kind of table, but now the frequencies of the operations are more complicated: there are N(N − 1)/2 different pairs (i, j) with 0 ≤ i < j < N, and that's the source of the expressions for the frequencies. The details of deriving these aren't that important right now, and they're tedious to get exactly right, but once you know that pair count you can work out the others. So, we can go ahead and add it all up, and now we have a more complicated expression for the running time, because the variable number of times the increment happens affects several of the terms. You could write down a formula, but what we need to do is simplify the calculation a little bit.

What we're going to do is concentrate on the part of the formula that grows the fastest, the most significant part, and ignore everything else. To do that, we use a notation known as tilde notation: we write f(N) ~ g(N) to mean that the ratio f(N)/g(N) approaches 1 as N approaches infinity. Now, it doesn't say precisely how fast that ratio approaches 1, and so forth, but it's still quite useful for the kind of calculations we're doing right now. For example, if I have a formula like (5/4)N² + (13/4)N + 53/2, the first term, as N grows, grows very much faster than the other two, so we'll just say it's ~ (5/4)N². Just to see how close that is, say N = 1,000: the expression on the left is 1,253,276.5 and the one on the right is 1,250,000. Those are within 0.3 percent of each other, and that's definitely good enough for our purposes in trying to understand the running time of a program. The idea is that if N is large, the terms we ignored are negligible, and if N is very small, everything is negligible and it doesn't matter so much. By throwing away the small terms, we can work with the formula much more easily.

So then, for 2-sum, we can postulate that the count is not large, which is often true in applications: there aren't that many pairs that sum to zero. That puts the count in a lower-order term, which eliminates our dependence on the input, so now we can say that the program takes ~ (5/4)N² nanoseconds. That's the kind of expression we can write down for the running time, with just one term, the fastest-growing term.
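For reference, here's a sketch of the 2-sum counting loop under discussion (again, the names are my assumptions); the doubly nested loop is the source of the N(N − 1)/2 frequencies in the table.

```java
public class TwoSum {

    // Count the pairs i < j with a[i] + a[j] == 0.
    public static int count(int[] a) {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)   // body executes n(n-1)/2 times
                if (a[i] + a[j] == 0)         // 2 accesses per pass: ~n^2 array accesses
                    count++;
        return count;
    }
}
```

So finally, let's do it for 3-sum.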
So now, we can apply tilde notation even when counting up the frequencies and forget about the lower-order terms, because we know we're going to multiply them by constants and add them together. Again, we assume that count is not large, to take the input out of the equation. It's all based on N choose 3 being ~ N³/6; with that, the frequencies are much easier to compute, and when we multiply and add, we get the answer immediately: ~ N³/2 nanoseconds (see the sketch at the end of this section). That's a mathematical formula that we derived just by looking at the program, and the model, built from the properties of the computer and the program, says the running time should be ~ N³/2 nanoseconds. You may remember that our empirical hypothesis said almost exactly that: 0.484 N³ nanoseconds. So, that's a very good situation, when our mathematical model is in agreement with our empirical model.

Let's look at the context of that. This is really what the scientific method is all about, developed in the 17th century and afterward by European mathematicians and scientists. The whole idea is to observe some feature of the natural world, come up with a model, a hypothesis, that's consistent with the observations or experiments, predict events using the hypothesis, verify those predictions by making further observations, and then validate and refine until the hypothesis and the observations agree. The model can come from experiments, but it can also be a mathematical model. For programs, the feature of the natural world that we're studying is the time it takes the program to run on a computer. With experiments, we fit a curve to get a formula for the running time, which is useful for predicting, but not really for explaining. With mathematical analysis, we study the algorithm itself and get a formula for the running time as a function of N. That really helps us understand what the algorithm is doing, and it's also another validation of the model. So, when we can do what we just did in this case, where we run experiments and develop a mathematical model that agree, we can have quite a bit of confidence in our understanding and in the predictions that we make based on that understanding.

Now, it is the case that we might need some advanced mathematics, and we'll talk about that later, but an advantage of mathematical models is that they apply to any computer at all. The good news is that our mathematical models are often easier to formulate than in other sciences, because the objects that we're studying, computer programs, are built from relatively few primitives, like loops and conditionals, put together in simple ways. Sometimes the math can be advanced, but usually we can say that it's easier than in other sciences. We'll take a look at that next.
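To close the loop on the 3-sum model above, here's a sketch of the triply nested loop it describes (names again assumed). The inner body executes N choose 3 ~ N³/6 times, and with three array accesses per pass, that's the source of the ~ N³/2 count that dominates the running time.

```java
public class ThreeSum {

    // Count the triples i < j < k with a[i] + a[j] + a[k] == 0.
    public static int count(int[] a) {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int k = j + 1; k < n; k++)    // body executes C(n,3) ~ n^3/6 times
                    if (a[i] + a[j] + a[k] == 0)   // 3 accesses per pass: ~n^3/2 array accesses
                        count++;
        return count;
    }
}
```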