0:00

Now let's turn to the analysis of the deterministic selection algorithm that we discussed in the last slide, due to Blum, Floyd, Pratt, Rivest, and Tarjan. In particular, let's prove that it runs in linear time on every possible input.

Let me remind you what the algorithm is. The idea is, we just take the RSelect algorithm, but instead of choosing a pivot at random, we do quite a bit more work to choose what we hope is going to be a guaranteed pretty good pivot. So again, lines one through three are the new ChoosePivot subroutine, and it's essentially implementing a two-round knockout tournament. So first, we do the first-round matches. What does that mean? That means we take A and think of it as comprising groups of five elements: the first five elements, positions one through five, then the elements six through ten, then positions eleven through fifteen, and so on. We sort each of those groups of five using, let's say, merge sort, although it doesn't matter much. Then the winner in each of these first-round matches is the median of those five, that is, the third largest element out of the five. So we take those N over five first-round winners, the middle element of each of the sorted groups, and copy them over into a new array capital C of length N over five. And then we run the second round of our tournament, in which we elect the median of these N over five first-round winners as our final pivot, as our final winner. So, we do that by recursively calling DSelect on C, which has length N over five, asking for the median, that is, the N-over-ten-th order statistic of that array. So, we call the pivot P, and then we just proceed exactly like we did in the randomized case. That is, we partition A around the pivot, we get a first part and a second part, and we recurse on the left side or the right side as appropriate, depending on whether the pivot is less than or bigger than the element that we're looking for. So the claim is, believe it or not, that this algorithm runs in linear
time. Now, you'd be right to be a little skeptical of that claim. Certainly, you should be demanding from me some kind of mathematical argument for this linear-time claim; it's not at all clear that it's true. One reason for skepticism is that this is an unusually extravagant algorithm, in two senses, for something that's gotta run in linear time. First is its extravagant use of recursion: there are two different recursive calls, as discussed in the previous video. We have not yet seen any algorithm that makes two recursive calls and runs in linear time. The best-case scenario was always Theta of N log N for our two-recursive-call algorithms, like merge sort or quicksort. The second reason is that, outside the recursive calls, it seems like it's just kind of a lot of work as well. So, to drill down on that point and get a better understanding of how much work this algorithm is doing, the next quiz asks you to focus just on line one. So when we sort the groups of five in the input array, how long does that take? So
the correct answer to this quiz is the third answer. Maybe, given that I'm claiming that the whole algorithm takes linear time, you could have guessed that this subroutine is going to be no worse than linear time. But you should also be wondering: isn't sorting always N log N? Aren't we doing sorting here? Why isn't the N log N thing kicking in? The reason is, we're doing something much, much more modest than sorting the length-N input array; all we're sorting are these puny little subarrays that have only five elements each, and that's just not that hard. That can be done in constant time, so let me be a little more precise about it. The claim is that sorting an array with five elements takes only some constant number of operations, let's say 120. Where did this number 120 come from? Well, for example, suppose we used merge sort. If you go back to those very early lectures, we actually counted up the number of operations that merge sort needs to sort an array of length M, for some generic M: at most six times M times (log base two of M, plus one). Here M is five, so we can just plug five into our previous formula. If we plug M equals five into this formula, what do we get? We get six, times five, times (log base two of five, plus one). Who knows what log base two of five is; it's some weird number, but it's gonna be at most three. So that's at most three, and three plus one is four; multiply that by five, and again times six, and we get 120. So it's constant time to sort just one of these groups of five. Now of course, we have to do a bunch of groups of five, because there's a linear number of groups. Constant time for each, so it's gonna be linear time overall. So to be really pedantic: we do at most 120 operations per group, there are N over five different groups, we multiply those, and we get 24N operations to do all the sorting. That's obviously big O of N, so linear time for step one. So having warmed
up with step one, let's look now at the whole seven-line algorithm and see what's going on. Now, I hope you haven't forgotten the paradigm that we discussed for analyzing the running time of deterministic divide-and-conquer algorithms like this one. Namely, we're gonna develop a recurrence, and remember, a recurrence expresses the running time, the number of operations performed, in two parts. First of all, there's the work done by the recursive calls on smaller subproblems. And secondly, there's the work done locally, not in the recursive calls. So let's just go through these lines one at a time and do a running tally of how much work is done by this algorithm, both locally and by the recursive calls. The quiz was about step number one: we just argued that since it's constant time to sort each group, and there's a linear number of groups, we're gonna do linear work, theta of N, for step one. Copying these first-round winners over into their special array C is obviously linear time as well. Now, when we get to the third line, we have a recursive call, but it's a quite easy recursive call to understand: it's just recursing on an array that has size twenty percent as large as the one we started with, on the N over five elements of C. So remember the notation we use for recurrences: generally, we denote by capital T of N the running time of an algorithm, in the worst case, on inputs of a given length N. So the work here is going to be T of N over five, the running time that our algorithm has in the worst case on inputs of length N over five, cuz N over five is the length of the array that we're passing
to this recursive call. Good. Step four, partition. Well, we had videos about how to partition an array in linear time; we knew that all the way back from quicksort, so that's definitely Theta of N. Step five is constant time, so I'm not going to worry about it. And finally we get to lines six and seven. At most one of these will execute, so in either case there's one recursive call. That's fine; we know that in recurrences, when there's a recursive call, we just write capital T of whatever the input length is. So we just have to figure out what the input length here is. It was N over five in line three, so we just have to figure out what it is in line six or seven. Ah yes, now we're remembering why we didn't use recurrences when we discussed randomized quicksort and the randomized selection algorithm: it's because we don't actually know how big the recursive call is, how big the input passed to this recursive call in line six or seven is. Line three, no problem: it's guaranteed to be twenty percent of the input array, cuz that's how we defined it. But for line six or seven, the size of the input array that gets passed to the recursive call depends on how good the pivot is. It depends on the split of the array A into two parts, which depends on the choice of the pivot P. So at the moment, all we can write is T of question mark. We don't know; we don't know how much work gets done in that recursion, cause we don't know what the input size is. Let me summarize the results of this
discussion by writing down a recurrence for the DSelect algorithm. So let capital T of N denote the maximum number of operations that DSelect ever requires to terminate on an input array of length N; this is just the usual definition of T of N that we use with recurrences. What we established in our tally on the last slide is that DSelect does linear work outside the recursive calls: it does the sorting of the groups of five, it does the copying, and it does the partitioning. Each of those is linear, so all of them together is also linear. And then it does two recursive calls, one whose size we understand, one whose size we don't. So, for once, I'm not going to be sloppy, and I'm going to write out an explicit constant for the work done outside of the recursive calls: I'm going to actually write C times N for some constant C. Of course no one ever cares about base cases, but for completeness let me write it down anyways: when DSelect gets an input of only one element, it returns it; let's call that one operation for simplicity. And then the general case is what's interesting. When you're not in the base case and you have to recurse, what happens? Well, you do linear work outside of the recursive calls, so that's C times N for some constant C, where C covers the constants in all of our big thetas on the previous slide. Plus the recursive call in line three, and we know that happens on an array of size N over five; as usual, I'm not gonna worry about rounding up or rounding down, it doesn't matter. Plus our mystery recursive call on an array of unknown size. So that's where we stand, and we seem stuck because of this pesky question mark.
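To keep the seven lines straight, here's a minimal Python sketch of DSelect as described above. The function and variable names are my own, it assumes distinct elements, and for clarity it uses list comprehensions for the partition rather than an in-place partition.

```python
def dselect(a, i):
    """Return the i-th smallest element of a (1-indexed).

    A sketch of the DSelect algorithm described in the lecture;
    assumes the elements of a are distinct.
    """
    n = len(a)
    if n <= 5:                        # small base case: just sort directly
        return sorted(a)[i - 1]
    # Lines 1-2: sort each group of five, collect the first-round winners
    # (the middle element of each sorted group) into the array C.
    groups = [sorted(a[j:j + 5]) for j in range(0, n, 5)]
    c = [g[len(g) // 2] for g in groups]
    # Line 3: recursively compute the median of medians -- the pivot.
    p = dselect(c, (len(c) + 1) // 2)
    # Line 4: partition around the pivot (not in place, for clarity).
    left = [x for x in a if x < p]
    right = [x for x in a if x > p]
    # Lines 5-7: return the pivot, or recurse on the relevant side.
    if i == len(left) + 1:
        return p
    if i <= len(left):
        return dselect(left, i)
    return dselect(right, i - len(left) - 1)
```

So T of N is charged once for the call on C, whose length N over five we know, and once for the call on left or right, whose length is exactly the question mark.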

So, let's prove a lemma which is gonna replace this question mark with something we can reason with, an actual number that we can then analyze. So the upshot of this key lemma is that all of our hard work in our ChoosePivot subroutine, in lines one through three, bears fruit, in the sense that we're guaranteed to get a pretty good pivot. It may not be the median, so it may not give us a 50/50 split; if it did, we could replace the question mark with one-half times N. But it's gonna let us replace the question mark by seven-tenths times N. Now, I don't wanna lie to you, I'm gonna be honest: it's not quite 7/10 N, it's more like 7/10 N minus five; there's a little bit of additive error. But taking care of the additive error adds nothing to your conceptual understanding of this algorithm or why it works. For those of you who want a truly rigorous proof, there are posted lecture notes which go through all the gory details. In lecture I'm just gonna tell you what's sort of morally true and ignore the fact that we're gonna be off by three here and four there, and I'll be clear, when I show you the proof of this lemma, where I'm being a little bit sloppy and why it really shouldn't matter, and it doesn't. So to explain why this key lemma is true, why we get a 30-70
split or better, guaranteed, let me set up a little notation. I'm getting sick of writing N over five over and over again, so let's just give it a synonym: K. So K is the number of first-round matches that we have, the number of groups. I also want some notation to talk about the first-round winners, that is, the medians of these groups of five, the K first-round winners. So, we're gonna call X sub I the I-th smallest of those elements that win their first-round match and make it to the second round. Just to make sure the notation is clear, we can express the pivot element in terms of these X's. Remember, the pivot is the final winner: it wins not only its first-round match but also the second-round tournament. It's not merely the middle element of its own group of five; it's actually the median of the N over five middle elements, the median of the medians. That is, out of the K middle elements, it's the K-over-two order statistic, the K-over-two-th smallest. I'm saying this assuming that K is even; if K were odd, it would be a slightly different formula, as you know. So let's remember what we're trying to prove. We're trying to prove that for our proposed pivot, which is exactly this element X sub K over two, exactly the winner of this two-round knockout tournament, we definitely get a 30-70 split or better. So what that means is, there had better be at least 30 percent of the elements bigger than the pivot; that way, if we recurse on the left side, on the first part, we don't have to deal with more than 70 percent of the original elements. Similarly, there had better be at least 30 percent of the elements smaller than the pivot; that way, if we recurse on the right-hand side, we know we don't have to deal with more than 70 percent of the original input elements. So if we achieve this goal, if we prove that there's at least 30 percent on each side of X sub K over two, then we're done.
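Before seeing the proof, you can sanity-check the lemma empirically. The sketch below is mine, not from the lecture: it computes the median-of-medians pivot on random arrays of distinct elements and checks that at least roughly 30 percent of the elements land on each side, up to the small additive error mentioned above.

```python
import random

def median_of_medians(a):
    """Compute the pivot that ChoosePivot would select (a non-recursive
    sketch, sufficient for this check)."""
    # First round: middle element of each sorted group of five.
    winners = [sorted(a[j:j + 5])[len(a[j:j + 5]) // 2]
               for j in range(0, len(a), 5)]
    # Second round: the median of the first-round winners.
    return sorted(winners)[(len(winners) - 1) // 2]

random.seed(1)
n = 100
for _ in range(100):
    a = random.sample(range(10_000), n)   # distinct elements
    p = median_of_medians(a)
    smaller = sum(1 for x in a if x < p)
    bigger = sum(1 for x in a if x > p)
    # At least a 30-70 split or better, modulo the small additive error.
    assert smaller >= 0.3 * n - 5
    assert bigger >= 0.3 * n - 5
```

Of course, passing this check is no substitute for the argument itself; we still have to prove that at least 30 percent of the elements lie on each side of X sub K over two.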

That proves the key lemma, that we get a 30-70 split or better. So I'm gonna show you why this goal is true. I'm gonna introduce a thought experiment, and I'm gonna lay it out abstractly; then we'll do an example to make it more clear, and then we'll go back to the general discussion and finish the proof. So what we're gonna do is a thought experiment, for the purposes of counting how many elements of the input array are bigger than our pivot choice and how many are smaller. So in our minds, we're going to imagine taking the elements of A and rearranging them in a 2D grid. Here are the semantics of this grid. Each column will have exactly five elements and will correspond to one of the groups of five, so we'll have N over five columns, corresponding to our N over five groups in the first round of our tournament. If N is not a multiple of five, then one of these groups has size between one and four, but I'm just not gonna worry about it; that's some of the additive loss which I'm ignoring. Moreover, we're going to arrange each column so that, going from bottom to top, its entries go from smallest to largest. So this means that in this grid we have five rows, and the middle row, the third row, corresponds exactly to the middle elements, to the winners of the first-round matches. Because these middle elements, these first-round winners, are treated specially, I'm going to denote them with big squares; the other four elements of each group, two of which are smaller and two of which are bigger, are just going to be little circles. Furthermore, in this thought experiment, in our minds, we're going to arrange the columns from left to right in order of increasing value of the middle element. Now remember, I introduced the notation X sub I for the I-th smallest among the middle elements. So a different way of saying this is that the leftmost column is the group that has X1 as its middle element: among the N over five middle elements, one of the groups has the smallest middle element, and we put that group all the way on the left. So X1 is in the first column, the smallest of the first-round winners; X2 is the second smallest of the first-round winners, X3 is the third smallest, and so on. At some point we get to the median of the first-round winners, X sub K over two, and way at the right is the largest of the first-round winners. And I'm sure you remember that the median of medians, which is X sub K over two, is exactly our pivot. So this is our lucky winner. I know this is a lot to absorb, so I'm gonna go ahead and go through an example. If what I've said so far makes perfect sense, you should feel free to skip the following example; but if there are still some details you're wondering about, I'm hoping this example will make everything crystal clear. So let's suppose we have an input array; I need a slightly big one to make the grid make sense. Let's say there's an input array of twenty elements. So there's going to be the input array, which is in a totally arbitrary order; there are gonna be the sorted columns after we sort each group of five; and then I'm gonna show you the grid. So this is the input we're all gonna use. Let's now go ahead and delineate the various groups of five.
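The actual twenty numbers on the slide aren't captured in this transcript, so here's a Python sketch with a made-up twenty-element input that builds the same kind of grid: each column is a sorted group of five, the columns are ordered left to right by their middle elements, and the middle row then reads X1 through X4, with the pivot X2 sitting in it.

```python
# A hypothetical twenty-element input in arbitrary order (the actual array
# on the slide isn't captured in this transcript).
a = [11, 6, 17, 2, 20, 8, 19, 1, 14, 3, 10, 16, 5, 12, 9, 4, 18, 13, 7, 15]

# Each sorted group of five becomes one column, smallest at the bottom.
columns = [sorted(a[j:j + 5]) for j in range(0, len(a), 5)]

# Arrange columns left to right by increasing middle element, so the middle
# row reads X1, X2, X3, X4 from left to right.
columns.sort(key=lambda col: col[2])

# Print the grid, top row first; the third row from the bottom (row index 2)
# holds the first-round winners.
for row in range(4, -1, -1):
    marker = "   <- first-round winners" if row == 2 else ""
    print("  ".join(f"{col[row]:2d}" for col in columns) + marker)
```

Reading across the third row from the bottom gives the first-round winners in increasing order, exactly the X's from the notation above.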

Â