0:01

So let's review the story so far.

We've been discussing the QuickSort algorithm, and here again is its high-level description. In QuickSort you call two subroutines first, and then you make two recursive calls. The first subroutine, ChoosePivot, we haven't discussed yet at all; that'll be one of the main topics of this video. But the job of the ChoosePivot subroutine is to somehow select one of the n elements in the input array to act as a pivot element.

Now what does it mean to be a pivot? Well, that comes into play in the second subroutine, the Partition subroutine, which we did discuss quite a bit in a previous video. What Partition does is rearrange the elements in the input array so that it has the following property: the pivot p winds up in its rightful position. That is, it's to the right of all of the elements less than it, and it's to the left of all of the elements bigger than it. The stuff less than it is to the left, in some jumbled order; the stuff bigger than it is to the right, in some jumbled order. That's what's listed here as the first part and the second part of the partitioned array.

Now, once you've done this partitioning, you're good to go. You just recursively sort the first part to get those elements in the right order, you call QuickSort again to recursively sort the second part, and bingo, the entire array is sorted. You don't need a combine step, you don't need a merge step.

Recall that in a previous video we saw that the Partition subroutine can be implemented in linear time, and moreover, that it works in place, with essentially no additional storage. We also, in an optional video, formally proved the correctness of QuickSort, and that proof is independent of how you implement the ChoosePivot subroutine.

So what we're going to do now is discuss the running time of the QuickSort algorithm, and this is where the choice of the pivot is very important.
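The recap above can be sketched in code. Here is a minimal Python sketch (the function names are my own, and the first-element pivot rule is just the naive placeholder; the pivot rule is exactly what's up for discussion in this video):

```python
def choose_pivot(a, left, right):
    # Naive choice discussed below: just take the first element.
    return left

def partition(a, left, right, pivot_index):
    """Linear-time, in-place partition around a[pivot_index]."""
    a[left], a[pivot_index] = a[pivot_index], a[left]  # move pivot to front
    p = a[left]
    i = left + 1  # boundary between the "< p" zone and the "> p" zone
    for j in range(left + 1, right + 1):
        if a[j] < p:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[left], a[i - 1] = a[i - 1], a[left]  # pivot lands in its rightful position
    return i - 1

def quicksort(a, left=0, right=None):
    if right is None:
        right = len(a) - 1
    if left >= right:  # base case: zero or one element
        return
    p = choose_pivot(a, left, right)        # subroutine 1
    split = partition(a, left, right, p)    # subroutine 2
    quicksort(a, left, split - 1)           # recurse on elements < pivot
    quicksort(a, split + 1, right)          # recurse on elements > pivot
```

Note there is no merge step after the two recursive calls; once Partition has done its job, concatenation is implicit.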

Â 1:52

The key point to realize at this juncture is that we are not currently in a position to discuss the running time of the QuickSort algorithm. The reason is that we do not have enough information: the running time of QuickSort depends crucially on how you choose the pivot, on the quality of the pivot chosen.

Â 2:15

You'd be right to wonder what I mean by a pivot's quality. Basically, what I mean is that a pivot is good if it splits the partitioned array into two roughly equal-sized subproblems, and it's bad, of low quality, if we get very unbalanced subproblems. To understand both what I mean and the ramifications of having good-quality and bad-quality pivots, let's walk through a couple of quiz questions.

Â 2:39

This first quiz question is meant to explore a sort of worst-case execution of the QuickSort algorithm: what happens when you choose pivots that are very poorly suited to the particular input array?

Let me be more specific. Suppose we use the most naive ChoosePivot implementation, like the one we were discussing in the partition video. Remember, there we just pluck out the first element of the array and use that as the pivot. So suppose that's how we implement the ChoosePivot subroutine, and moreover, suppose that the input array to QuickSort is already in sorted order. For example, if it just had the numbers one through eight, it would be one, two, three, four, five, six, seven, eight, in order. My question for you is: what is the running time of this recursive QuickSort algorithm on an already sorted array, if we always use the first element of a subarray as the pivot?

Â 3:34

Okay, so this is a slightly tricky but actually very important question. The answer is the fourth one: it turns out that if you pass QuickSort an already sorted array and you're using the first element as the pivot, it runs in quadratic time. And remember, for a sorting algorithm, quadratic is bad, in the sense that we can do better. MergeSort runs in time n log n, which is much better than n squared. And if we were happy with an n squared running time, we wouldn't have to resort to these relatively exotic sorting algorithms; we could just use Insertion Sort and we'd be fine, with that same quadratic running time.

Okay, so now I owe you an explanation. Why is it that QuickSort can actually run in quadratic time in this unlucky case of being passed an already sorted input array?

Â 4:20

Well, to understand, let's think about what pivot gets chosen, what the ramifications of that pivot choice are for how the array gets partitioned, and then what the recursion looks like. Let's just think of the array as being the numbers 1 through n, in sorted order. What is going to be our pivot? Well, by definition we're choosing the first element of the array as the pivot, so the pivot is just going to be 1.

Now we're going to invoke the Partition subroutine. And if you go back to the pseudocode of the Partition subroutine, you'll notice that if we pass it an already sorted array, it's going to do essentially nothing. Okay? It's just going to advance the index j until it falls off the end of the array, and it's going to return back to us the same array that it was passed as input. So the Partition subroutine, given an already sorted array, returns an already sorted array. We have just the pivot 1 in the first position, and then the numbers 2 through n, in order, in the remaining positions.

Â 5:11

So if we draw our usual picture of what a partitioned array looks like, with everything less than the pivot to the left and everything bigger than the pivot to the right: well, since nothing is less than the pivot, the left part is going to be empty; it will not exist. And the part to the right of the pivot will have length n - 1, and moreover, it will still be sorted.

Â 5:37

So once Partition completes, we go back to the outer call of QuickSort, which then calls itself recursively twice. Now in this case, one of the recursive calls is just vacuous: there's just an empty array, there's nothing to do. So really there's only one recursive call, and it happens on a problem of size only one less. This is about the most unbalanced split we could possibly see, right, where one side has 0 elements and the other side has n - 1. Splits don't really get any worse than that.

And this is going to keep happening over, and over, and over again. We're going to recurse on the numbers 2 through n. We're going to choose the first element, the 2, as the pivot. Again we'll feed it to Partition, and we'll get back the exact same subarray that we handed it: the numbers 2 through n, in sorted order. We exclude the pivot 2 and recurse on the numbers 3 through n, a subarray of length n - 2. At the next recursion level, we recurse on an array of length n - 3, then n - 4, then n - 5, and so on. Until finally, after a recursion depth of roughly n, we get down to just the last element n, the base case kicks in, we return that, and QuickSort completes.

So that's how QuickSort is going to execute on this particular input with these particular pivot choices. So what running time does that give us?
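To see this blow-up concretely, here's a small Python experiment (my own instrumentation, not from the lecture) that counts the element comparisons made by first-element-pivot QuickSort on an already sorted input:

```python
def comparisons_on_sorted(n):
    """Count comparisons made by first-element-pivot QuickSort
    on the already sorted input [1, 2, ..., n]."""
    a = list(range(1, n + 1))
    count = 0

    def partition(left, right):
        nonlocal count
        p = a[left]  # pivot = first element of the subarray
        i = left + 1
        for j in range(left + 1, right + 1):
            count += 1          # one comparison per scanned element
            if a[j] < p:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[left], a[i - 1] = a[i - 1], a[left]
        return i - 1

    def sort(left, right):
        if left >= right:
            return
        split = partition(left, right)
        sort(left, split - 1)   # vacuous on sorted input
        sort(split + 1, right)  # subproblem of size only one less

    sort(0, n - 1)
    return count
```

On the sorted input of length n this makes (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2 comparisons, e.g. 4,950 for n = 100, matching the quadratic behavior described above.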

Â 6:50

Well, the first observation is that in each recursive call, we do have to invoke the Partition subroutine. And the Partition subroutine looks at every element in the array it is passed as input. So if we pass Partition an array of length k, it's going to do at least k operations, because it looks at each element at least once. So the running time is bounded below by the work we do in the outermost call, which is on an array of length n, plus the work we do in the second level of recursion, which is on a subarray of length n - 1, plus (n - 2) at the next level, and so on, all the way down to + 1 for the very last level of the recursion. So this is a lower bound on our running time, and this is already Theta of n squared.

Â 7:36

One easy way to see why this sum n + (n - 1) + ... + 1 leads to a bound of n squared is to just focus on the first half of the terms. The first n over 2 terms in the sum each have magnitude at least n over 2, so the sum is at least n squared over 4. It's also evident that this sum is at most n squared. So overall, the running time of QuickSort on this bad input is going to be quadratic.
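The two estimates just described, written out (standard bounds, not verbatim from the slides):

```latex
n + (n-1) + \cdots + 1 \;\ge\;
\underbrace{\tfrac{n}{2} + \cdots + \tfrac{n}{2}}_{n/2\ \text{terms}}
\;=\; \frac{n^2}{4},
\qquad
n + (n-1) + \cdots + 1 \;\le\; n \cdot n \;=\; n^2 .
```

Together these give the Theta(n squared) bound claimed above.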

Now, having understood the worst-case performance of the QuickSort algorithm, let's move on to discuss its best-case running time. We don't generally care about the best-case performance of algorithms for its own sake. The reason we want to think about QuickSort's best case is, first of all, that it'll give us better intuition for how the algorithm works. Second of all, it'll draw a line in the sand: the average-case running time certainly can't be better than the best case, so this will give us a target, what we're shooting for in our subsequent mathematical analysis.

Â 8:31

So what would the best case be? What is the highest-quality pivot we could hope for? Well again, we think of the quality of a pivot as the amount of balance it provides between the two subproblems. So ideally, we'd choose a pivot which gives us two subproblems, each of size n over 2 or less. And there's a name for the element that would give us that perfectly balanced split: it's the median element of the array, the element where exactly half of the elements are less than it and half of the elements are bigger than it. That would give us an essentially perfect 50/50 split of the input array.

So here's the question. Suppose we had some input and we ran QuickSort, and everything just worked in our favor, magically, in the best possible way. That is, in every single recursive invocation of QuickSort, on any subarray of the original input array, suppose we happened to get the median element as our pivot. That is, suppose in every single recursive call we wind up getting a perfect 50/50 split of the input array before we recurse. This question asks you to analyze the running time of the algorithm in this magical best-case scenario.

Â 9:59

That is, what running time does QuickSort require, in this magical special case, on an array of length n? As usual, we have a recurrence in two parts: there's the work that gets done by the recursive calls, and there's the work that gets done now. By assumption, we wind up picking the median as the pivot, so there are going to be two recursive calls, each of which is on an input of size at most n over 2. And we can write this precisely because the pivot equals the median.
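Written as a recurrence (my notation, matching the MergeSort analysis from earlier in the course), the best case described above is:

```latex
T(n) \;\le\; 2\,T\!\left(\tfrac{n}{2}\right) + O(n)
\quad\Longrightarrow\quad
T(n) \;=\; O(n \log n).
```

The implication follows from the master method with a = 2, b = 2, d = 1 (so a = b^d), or by the same recursion-tree argument used for MergeSort.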

Â 10:34

So that's what gets done by the two recursive calls. And how much work do we do outside of the recursive calls? Well, we have to run the ChoosePivot subroutine. And I guess, strictly speaking, I haven't said how that's implemented, but let's assume that ChoosePivot does only a linear amount of work. And then, as we've seen, the Partition subroutine only does a linear amount of work as well. So let's say O(n) for the work outside of the recursive calls.

Â 10:59

And what do we know? We know this implies, say by using the master method, or just by the exact same argument as for MergeSort, a running time bound of O(n log n). And again, something I haven't really been emphasizing, but which is true, is that we can actually write Theta(n log n). That's because in the recurrence, we know that the work done outside of the recursive calls is exactly Theta(n), okay? Partition really needs linear time, not just O(n) time; the work done outside of the recursive calls is Theta(n). That's because the Partition subroutine does indeed look at every entry in the array it is passed. We didn't really discuss this so much in the master method, but as I mentioned in passing, if you have recurrences which are tight in this sense, then the result of the master method can also be strengthened to Theta instead of just big-O. But those are just some extra details.

The upshot of this quiz is that even in the best case, even if we magically get perfect pivots throughout the entire trajectory of QuickSort, the best we can hope for is an n log n upper bound; it's not going to get any better than that. So the question is: how do we choose pivots in a principled way, so that we get this best case, or something like it, this best-case n log n running time? That's the problem we have to solve next.

So the last couple of quizzes have identified a super important question, as far as the implementation of QuickSort is concerned, which is: how are we going to choose these pivots? We now know that they have a big influence on the running time of our algorithm. It could be as bad as n squared or as good as n log n, and we really want to be on the n log n side. So the key question: how to choose pivots.

Â 13:05

Our answer is going to be: choose pivots at random. By which I mean, every time we recursively call QuickSort and we are passed some subarray of length k, among the k candidate pivot elements in the subarray, we're going to choose each one with probability 1 over k. And we're going to make a new random choice every time there's a recursive call, and we're going to see how the algorithm does.

This is our first example of a randomized algorithm: an algorithm where, if you feed it exactly the same input, it'll actually run differently on different executions. And that's because there's randomness internal to the code of the algorithm. Now, it's not necessarily intuitive that randomization should have any purpose in computation, in software design and algorithm design. But in fact, and this has been one of the real breakthroughs in algorithm design, mostly in the '70s, the realization of how important this is: the use of randomization can make algorithms more elegant, simpler, easier to code, faster, or let you solve problems that you simply could not solve as easily without it. Randomization is really one thing that should be in your toolbox as an algorithm designer. QuickSort will be the first [INAUDIBLE] application, but we'll see a couple more later in the course.
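In code, the randomized version is a one-line change to the naive sketch (again a minimal Python sketch with my own names; only the `random.randint` line differs from the first-element rule):

```python
import random

def partition(a, left, right, pivot_index):
    # Move the chosen pivot to the front, then do the usual
    # linear-time, in-place partition around it.
    a[left], a[pivot_index] = a[pivot_index], a[left]
    p = a[left]
    i = left + 1
    for j in range(left + 1, right + 1):
        if a[j] < p:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[left], a[i - 1] = a[i - 1], a[left]
    return i - 1

def quicksort(a, left=0, right=None):
    if right is None:
        right = len(a) - 1
    if left >= right:
        return
    # Randomized ChoosePivot: each of the k elements of a[left..right]
    # is selected with probability 1/k, fresh in every recursive call.
    pivot_index = random.randint(left, right)
    split = partition(a, left, right, pivot_index)
    quicksort(a, left, split - 1)
    quicksort(a, split + 1, right)
```

Running this twice on the same input can take different paths through the recursion, which is exactly the "randomness internal to the code" described above.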

Â 14:18

Now, by the end of this sequence of videos, I'll have given you a complete, rigorous argument for why this works, why, with random pivots, QuickSort always runs very quickly on average. But before moving on to anything too formal, let's develop a little bit of intuition, or at least a kind of daydream, about why on Earth this could possibly work, why on Earth it could possibly be a good idea to have randomness internal to our QuickSort implementation. Well, first, at a very high level, what would be the hope, the dream? The hope would be that, while random pivots are not going to be perfect, I mean, you're not going to just guess the median, you only have a one-in-n chance of picking the median, most choices of a pivot will be good enough.

Â 15:12

The first claim is this: in our last quiz we said, suppose we get lucky and always pick the median in every single recursive call, and we observed we'd do great, we'd get n log n running time. So now let's observe that, actually, to get the n log n running time, it's not important that we magically get the median in every single recursive call. If we get any kind of reasonable pivot, by which I mean a pivot that gives us approximately balanced subproblems, we're again going to be good. So the last quiz really wasn't particular to getting the exact median; near-medians are also fine.

To be concrete, suppose we always pick a pivot which guarantees us a split of 25-75 or better. That is, both recursive calls should be on arrays of size at most 75% of the one we started with. So precisely: if we always get a 25-75 split or better in every recursive call, I claim that the running time of QuickSort in that event will be big O of n log n, just as it was in the last quiz, where we were actually assuming something much stronger, namely that we get the median. Now this is not so obvious, the fact that 25-75 splits guarantee an n log n running time. For those of you who are feeling keen, you might want to try to prove this. You can prove it using a recursion tree argument, but because you don't have balanced subproblems, you have to work a little bit harder than you do in the cases covered by the master method.

Â 16:41

The second part of the intuition is to realize that we actually don't have to get all that lucky to get a 25-75 split. That's a pretty modest goal, and even this modest goal is enough to get n log n running time, right? So suppose our array contains the integers between 1 and 100, so it's an array of length 100. Think for a second: which of those elements is going to give us a split that's 25-75 or better? Well, if we pick any element between 26 and 75, inclusive, we'll be totally good, right? If we pick something that's at least 26, the left subproblem is going to have at least the elements 1 through 25, so it'll have at least 25% of the elements. If we pick something that's at most 75, then the right subproblem will have all of the elements 76 through 100 after we partition, so it'll also have at least 25% of the elements. So anything between 26 and 75 gives us a 75-25 split or better.

Â 17:39

But that's a full half of the elements, so getting a good-enough pivot is just like flipping a fair coin and hoping to get heads. With 50% probability, we get a split that's good enough for this n log n bound. So again, the high-level hope is that, often enough, half of the time, we get these good-enough splits, a 25-75 split or better, and that would seem to suggest that n log n running time on average is a legitimate hope.
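The 1-through-100 example can be checked directly (a quick Python sanity check of the counting argument above):

```python
n = 100
# Pivot p leaves p - 1 smaller elements and n - p larger ones;
# a 25-75 split or better needs both sides to have at least n/4 elements.
good = [p for p in range(1, n + 1)
        if p - 1 >= n // 4 and n - p >= n // 4]
print(good[0], good[-1], len(good))  # 26 75 50
```

Exactly half the elements qualify, so a uniformly random pivot is good enough with probability 1/2, the fair-coin analogy above.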

Â 18:07

So that's the high-level intuition, but if I were you, I would certainly not be content with the somewhat hand-wavy explanation I've given you so far. What I've told you is the hope, the dream, why there is at least a chance this might work. But the question remains, and I would encourage such skepticism: does this really work?

Â 18:27

And to answer that, we're going to have to do some actual mathematical analysis, and that's what I'm going to show you: a complete, rigorous analysis of the QuickSort algorithm with random pivots. And we'll see that, yes, in fact, it does really work. This highlights what's going to be a recurring theme of this course, and a recurring theme in the study and understanding of algorithms: quite often there's some fundamental problem you're trying to solve, and you come up with a novel idea. It might be brilliant, and it might suck. You have no idea. Now, obviously, you can code up the idea, run it on some concrete instances, and get a feel for whether it seems like a good idea or not. But if you really want to know, fundamentally, what makes the idea good or what makes it bad, you need to turn to mathematical analysis to give you a complete explanation. And that's exactly what we're going to do with QuickSort; we'll explain in a very deep way why it works so well.

Â 19:25

So, under no assumptions about the data, that is, for every input array of a given length, say n, the average running time of QuickSort implemented with random pivots is big O of n log n. And again, in fact, it's Theta of n log n, but we'll just focus on the big O of n log n part.

Â 19:45

So this is a very, very cool theorem about this randomized QuickSort algorithm. One thing I want to be clear about, so that you don't undersell this guarantee in your own mind: this is a worst-case guarantee with respect to the input. Notice what we say at the beginning of the theorem: for every input array of length n, all right? We make absolutely no assumptions about the data. This is a totally general-purpose sorting subroutine, which you can use wherever you want, even if you have no idea where the data is coming from, and these guarantees will still be true.

Â 20:20

This, of course, is something I held forth about at some length back in our guiding principles video, when I argued that, if you can get away with it, what you really want is general-purpose algorithms, which make no data assumptions, so they can be used over and over again in all kinds of different contexts, and which still have great guarantees. QuickSort is one of those. So basically, if you have a data set and it fits in the main memory of your machine, sorting is a for-free subroutine, in particular this QuickSort implementation. It just runs so blazingly fast, it doesn't matter what the array is; maybe you don't even know why you want to sort it, but go ahead, why not? Maybe it will make your life easier, like it did, for example, in the closest pair algorithm, for those of you who watched those two optional videos.

Â 21:04

Now, the word "average" does appear in this theorem, and as I've been harping on, this average is not over any assumptions on the data. We're certainly not assuming the input array is random in any sense; the input array can be anything. So where is the averaging coming from? The averaging comes only from the randomness which is internal to our algorithm, randomness that we put in the code ourselves, that we're responsible for.

Â 21:28

So remember, randomized algorithms have the interesting property that, even if you run them on the same input over and over again, you're going to get different executions. So the running time of a randomized algorithm can vary as you run it on the same input over and over again. The quizzes have taught us that the running time of QuickSort on a given input fluctuates anywhere between the best case of n log n and the worst case of n squared. What this theorem is telling us is that, for every possible input array, while the running time does indeed fluctuate between an upper bound of n squared and a lower bound of n log n, the best case dominates. On average it's n log n; on average it's almost as good as the best case. That's what's so amazing about QuickSort: that n squared can pop up once in a while, but it doesn't matter. You're never going to see it; you're always going to see this n log n-like behavior in randomized QuickSort.

So, for some of you, I'll see you next in a video on probability review; that's optional. For the rest of you, I'll see you in the analysis of this theorem.
