You've learned about two ways to search for a value in a list. One of them, linear search, will work on any list, regardless of whether the lists are sorted. Binary search requires that the list is sorted but, as you've seen, it looks like it can be a lot faster than linear search. Let's poke at these two algorithms a little bit more, so that we can get a feel for how good they are and when we would use one or the other. Computers play a lot of tricks in order to make code really fast, so it's hard to know exactly what to attempt. However, we do know that there is a comparison that happens each time through the linear search while loop, and that there's an assignment statement. So let's count these comparisons and the assignment statements that happen each time through a loop. For linear search, we see that there are two that happens each iteration. One comparison and one assignment statement. Again, this is a rough, rough estimate of what goes on in code, but it turns out that this rough estimate is going to be good enough for us to conclude that binary search is just wildly faster than linear search. At least when lists are large. In binary search, we have a much simpler comparison. But we'll still count it as just one. We have one assignment statement, a second comparison, and then the result of that comparison will lead us to one assignment statement or the other. So, we have one comparison here, one assignment statement, another comparison. And then one of these is going to happen. So in binary search, we actually have sort of four steps on each iteration of the loop. Because the initialization and the if statement, at the end of each, happen only once, no matter how many items there are on the list. And there are about the same number of steps in linearly search and binary search. We're going to ignore these in our analysis. In the linear search then, we figured there was roughly 2 steps per iteration, and in binary steps there was roughly 4 steps per iteration. We're going to compare these algorithms in the worst case. That happens when the value is not in the list, for the linear search. Here's an example for linear search if my list is 1, 2, 3, 4, 5, so there are 5 items in the list and am looking for value maybe 6. Then I look here, here, here, here, and here. And then I'm done. So that looked at all 5 items. Let's make a table of the number of items and the number of iterations. If there's one item, then there's one iteration. If there's two items, there's two iterations. In general if there are k items, there are k iterations. And now for binary search. If there is one item, then on the first iteration, we said m, we do a comparison, we move either b or e. And then the loop terminates. Let's start building our table. If there is one item then there's one iteration. If there are two items, then were going to throw away one half of the first time to the loop leaving us with one item. And we know how many steps that's going to take. So if there were two items, it takes us two iterations. Here's where things start to get kind of interesting. I'm actually going to skip 3 and go right to 4. If I have 4 items in my list, then in one comparison, I'm throwing away about half. And that means that I'm, I end up with a two item list. Well, it took me one step to get to this point, so that means that I just have one more iteration than I had when I had two items in my list. This means that I can process twice as many items with just one more iteration. Using this knowledge, let's fill in a few more rows of the table. 8s twice 4, 16 is twice 8, 32 is twice 16, 64 is twice 32, and 128 is twice 64. With 128 values, we can find the index of v in this list in only 8 iterations. Linear search would take 128. Let's plot this on a graph. Our axes are the size of the list, or the number of items, and the number of iterations. If the list has 1 item in it then there is 1 iteration. 2 items, 2 iterations. 4 items, 3 iterations. 8 items and 4 iterations. When we double the size of the list, we increment the number of iterations by 1. So thinking about this backwards, if I have 128 items in my list, then after one comparison, one iteration of binary search, I end up with only 64 items left to consider. When I have 64 items left to consider, then in one iteration I'm throwing away half of them so that after that iteration I only have 32 items to consider, and so on. This is awfully powerful. Mathematicians have a name for this function. It's called the logarithm base 2 of the number of items. Here's one more way to think about logarithms. It's the number of times you can divide by 2 in order to reach 1. So, how many times can I divide by 2 by 2 in order to reach 1? Once. What about 4? I can divide 4 by 2 to get 2. And 2 by 2 to get 1. That was 2 divides. 8 is going to be 1 division by 2 you got 4. 4 divided by 2 is 2. 2 divided by 2 is 1, so that gives us 3 divisions. 16 is 4 divisions. 32 is 5 divisions. And so on. So, let's try to calculate the logarithm of 1,024. Well, 1,024 is 512 times 2. When we divide 512 by 2, we get 256. Then, 128. Then 64, then 32, then 16, then 8, then 4, then 2, and then 1. So there's one division, two, three divisions, four divisions, five divisions, six divisions, seven divisions, eight divisions, nine divisions, ten divisions. So the log base 2 of 1024 is going to produce, what did I say, one, two, three, four, five, six, seven, eight, nine, ten. Module math has a log function. This function returns the logarithm of a number with a given base. Here are base is 2, because we are dealing with binary. If I ask what the logarithm of 2 is in base 2 it tells me 1. If I ask what the logarithm of 2, of 4 in base 2 it tells me 2. 8 is 3, the logarithm is 16 and base 2 is 4. The logarithm of 32 and base 2 is 5. So let's see if we can get an impression for just how much faster binary search is by looking at the number, well that's 1 million. Let's do 1 billion. So I'm just going to delete these comments, because I'm not allowed to use them when I'm writing an integer. Now linear search on a billion items is going to take a billion iterations. What is the logarithm of 1 billion, in base 2? Just under 30. So, it's going to take 30 iterations of binary search in order to find a number in a billion items. Let's call a binary search and linear search on a long list of numbers, to see just how different these 2 are. We'll search for a value we know is not in the list. So, with 10 million items, binary search is nearly instantaneous. Linear search is noticeably slower, here we go. cProfile is a module that contains code that allows us to profile other code. To profile means to measure in some way. To see how much memory it uses, how much time it takes to run. Function run in module cProfile takes a statement to execute, and then presents profiling information. It uses something called exec, which takes an object, which can be a string or a code object, and then executes it. A code object is something we haven't encountered yet, nor do we need it because we can just use a string. Let's try calling exec. if we're, if we give it something that isn't a string, we're told that argument one must be a string, lights, or code object. Okay let's put 3 plus 4 inside quotes. There's no ever, but we still don't know exactly what's going on. Well, 1 picks a statement, perhaps exec can pick a statement as well, maybe an assignment statement. When we do the function call, indeed variable x is created. Let's try assigning a string to variable y. Remember that l refers to a list with 10 million items in it. Let's call function run in module cProfile with a call on binary search looking in that list for a value that isn't there. We see a table. There were 6 function calls. It took very little time. There was one call that took the total time of no seconds and it was on the string that we passed in. There was one call on binary search, there was one call on exec, one call on len, and one call on a method disable of the profile. That last one is completely irrelevant to our analysis. Here we see that in linear search there was one call on the function linear search. It took nearly 3 seconds to complete. There were ten million and two calls on method len. That took nearly a second just by itself. That gives us a hint that we could optimize linear search. Let's call len on l outside of the y loop, and then use variable length everywhere instead of the call on len. When we run this again and profile it, we see that we went from nearly 3 seconds for a linear search down to just over 1 and a half. That's a remarkable speedup. However, that doesn't even come close to how fast binary search is. This is all if the list is sorted and all in the worst case. In the best case, where we search for the first item in the list, linear search is also blazingly fast. Mathematical analysis of algorithms and profiling can give us significant insight into how algorithms work and why they are so good or bad. This analysis of algorithms is actually a discipline within computer science. It's a sub-field of computer science and people study this kind of thing for a living.