0:00

I just got the number of divide and conquer algorithms and, so far, I've been

Â short shrift to proofs of correctness. This has been a conscience decision on my

Â part. Coming up with the right divide and conquer algorithm for a problem can

Â definitely be difficult, but once you have that eureka moment and you figure out the

Â right algorithm you tend to, also, have a good understanding of why it's correct,

Â why it actually solves the problem on every possible input. Similarly when I

Â present to you a divide and conquer algorithm like, say, merge sort or

Â quicksort, I expect that many of you have a good and accurate intuition about why

Â the algorithm is correct. In contrast the running time of these developed

Â [inaudible] algorithms is often highly non-obvious. So, correctness proofs for

Â divide-and-conquer algorithms tend to simply formalize the intuition that you

Â have via a proof by induction. That's why I haven't been spending much time on them.

Â But nevertheless, I do feel like I owe you at least one rigorous correctness proof

Â for a divide-and-conquer algorithm, and we may as well do it for quicksort. So in

Â this optional video, we'll briefly review proofs by induction, and then we'll show

Â how such a proof can be used to rigorously establish the correctness of quicksort.

Â The correctness proofs for most of the other divide-and-conquer algorithms that

Â we discuss can be formalized in a similar way. So let's begin by reviewing the

Â format for proofs by induction. So, the canonical proofs by induction and the kind

Â that we'll be using here, is when you want to establish an assertion for all of the

Â positive integers in. So now it's some assertion which is parameterized by n,

Â where n is a positive integer. I know this is a little abstract, so let me just be

Â concrete about the assertion that we actually care about for quicksort. So for

Â us, the assertion P(n) is the statement that cor, quicksort is always correct on

Â inputs of length n, arrays that have n elements. So an induction proof has two

Â parts. The first part is a base case and the second part is an inductive step. For

Â the base case you have to get started so you show that at the very least your

Â assertion is true when n equals one. This is often a trivial matter and that'll be

Â the case when we establish the correctness of quick sort. Just on our rays with only

Â one element. So, the non-trivial part of a proof by induction is usually the

Â inductive step. And in the inductive step, you look at a value of n not covered by

Â the base case, so a value of n bigger than one. And you show that if the assertion

Â holds for all smaller values, small integers, then it also holds for the

Â integer n. That is, you show that for every positive integer N that's two or

Â greater, you assume that P of K holds for all K strictly less than N. And under that

Â assumption, which is called the inductive hypothesis. Under the assumption that P of

Â K holds for all K strictly less than N, you then establish that P of N holds as

Â well. So if you manage to complete both of these steps, if you prove both the base

Â case that P(1) holds, you argue that directly, and then also you argue that

Â assuming the inductive hypothesis, that the assertion holds for all smaller

Â integers, it also holds for an arbitrary integer n. Then you're done. Then in fact

Â you have proven that the assertion P then holds for every single positive integer N.

Â Right? So for any given N that you care about, the way you can derive that from

Â one and two is you just start from the base case, P of one holds. Then you apply

Â the inductive step N minus one times. And boom, you've got it. So you know that P

Â holds for the integer N that you care about as well. And that's true for

Â arbitrarily large values of N. So those are proofs by induction in general. Now

Â let's instantiate this proof format, this type of proof for establishing the

Â correctness of quicksort. So let me write again what is the assertion we care about.

Â Our definition of P(n) is gonna be that quicksort is always correct on arrays of

Â length n. And of course what we want to prove is that quicksort is correct no

Â matter what size array that you give it, that is, we want to prove that P(n) holds

Â for every single n at least one. So this is right in the wheelhouse of proofs by

Â induction. ?Kay, so that's how we're going to establish it. Now depending on the

Â order in which you're watching the videos, you may or may not have seen our

Â discussion about how you actually choose the pivot, recall that the first thing

Â Quick Sort does is choose a pivot, then it partitions the array around the pivot. So,

Â we're going establish the correctness of Quick Sort, no matter how the choose pivot

Â sub-routine gets implemented. Okay, so now matter how you choose pivots, you'll

Â always have correctness. As we, as we'll see in a different video, the choice of

Â pivot definitely has an influence on the running of Quick Sort, the correctness of

Â Quick Sort, there's no matter how you choose the pivot. So it's perceived by a

Â proof by induction. So for the base case when n equals one, this is a fairly

Â trivial statement. Right? So, then we're just talking about inputs that have only

Â one element. Every such array is already sorted. Quicksorts, in the bai, when n

Â equals one just returns the input array. It doesn't do anything, and that is indeed

Â the sort of array that it returns. So, by the rather trivial argument we had

Â directly proven that p of one holds. We've proven the rather unimpressive statement

Â that quicksort always correctly sorts one element arrays. Okay? No big deal. So,

Â let's move on to the inductive step. So in the inductive step we have to fix an

Â arbitrary value of N that's at least two. A value of N not covered by the base case.

Â So let's fix some value of N, that leaves two. Now what are we trying to prove?

Â We're trying to prove that Quick Sort always correctly sorts every input array

Â of length N. So we also have to fix an arbitrary such input. So let's make sure

Â we're all clear on what it is we need to show, what do you show in an inductive

Â step. Assuming that PFK holds. For all smaller values, all smaller integers, then

Â P of N holds as well. And remember this is the inductive hypothesis. So in the

Â context of quicksort, we're assuming that quicksort never makes a mistake on any

Â input array that has length strictly smaller than n. And now we just have to

Â show it never makes a mistake on array, input arrays that have size exactly n. So

Â this is the point in the proof where we actually delve into how Quick Sort is

Â implemented to argue correctness. So recall what the first step of Quick Sort

Â is, it picks some pivot arbitrarily, we don't know how, we don't care how. And

Â then it partitions the array around this pivot element p. Now as we argued in the

Â video where we discussed the partition sub routine, at the conclusion of that sub

Â routine, the array has been rearranged into the following format. The pipit is

Â wherever it is, everything to the left of the pipit is less than the pipit, and

Â everything bigger than the pipit is greater than the pipit. Alright, this is

Â where how things stand at the conclusion of the partitioning sub routine. So let's

Â call this stuff less than the pipit the first part of the partition array, and the

Â stuff bigger than the pipit, the second part of the partition array. And recall

Â our observation from the overview video that the pivot winds up in its correct

Â position. Right, where would the pivot be? Where is any element suppose to be in the

Â final sorted array? What's suppose to be to the right of everything less than it,

Â and to the left of everything bigger than it? And that's exactly where this

Â partitioning subroutine deposits the pivot element peak. So now to imply the

Â inductive hypothesis, which you'll recall is a hypothesis about how quick sort

Â operates on smaller sub arrays. Let's call the length of the first part in the second

Â part of the partition [inaudible] K1 and K2 respectively. Now, crucially, both k1

Â and k2 are strictly less than n. Both of these two parts have lengths strictly less

Â than that of the given input array a. That's because the pivot in particular is

Â excluded from both of those two parts. So, their gonna have, at most n minus one

Â [inaudible]. That means that we can apply the inductive hypothesis, which says that

Â the quicksort never makes a mistake on an array that has size strictly less than n.

Â That implies that our two recursive calls to quickstart, the one to the first part

Â and the one to the second part don't make mistakes. They're guaranteed to sort those

Â sub arrays correctly by the inductive hypothesis. And to be very precise, what

Â we're using to argue that the [inaudible] are correct, are P of K1 and P of K2. Or P

Â is the assertion that [inaudible] is always correct on a [inaudible]. K1 and

Â K2. And we know that both of these statements are true because k1 and k2 are

Â less th, are both less than n and because of the inductive hypothesis. So what's the

Â upshot? The upshot is, quicksort's gonna be correct. And so the first recursive

Â call puts all of the elements that are less than the pivot in the correct

Â relative order. Next comes the pivot, which is bigger than all of that stuff in

Â the first part and less than all the stuff in the second part, and then the second

Â recursive call correctly orders all of the elements in the second part. So with those

Â three things pasted together, we have a sorted version of the input array and

Â since this array was an arbitrary one, of link N. That establishes the assertion P

Â of N and since n was arbitrary, that establishes the inductive and completes

Â the proof of correctness of quick sort for an arbitrary method of choosing the pivot

Â element. [sound]

Â