Today, we're going to talk about recursion; when a function calls itself. It's a little bit mind-bending to think about first, but it's actually a powerful and useful programming technique. Let's begin by talking about some mathematical foundations of recursion. So, first of all, just what is recursion? Well, as I mentioned, it's when something is specified in terms of itself. Here's an example of a recursive image. So, why should we learn recursion? Well, it represents a new way of thinking that's really fundamental in Computer Science. But not only that, as I mentioned, it provides a powerful programming paradigm. It helps us reason about computation, about the correctness of our programs, and it really gives some really deep insights into the nature of computation. You'll see some examples of that in today's lecture. There's a lot of computational artifacts that are just naturally self-referential. This is an example of a tree where a tree is defined in terms of a node, and its got subtrees which are also trees. So, you're used to using a file system where you have folders, and the folders contain folders. The concept of a folder refers to itself. We'll see fractal graphical patterns and we've seen some of those before, and then there is so-called divide and conquer algorithms where we solve problems by breaking them into smaller parts and using recursion. So, just as a simple starting example, we're going look at a recursive program to convert an integer to binary. So, the idea behind a recursive program is, so you want a function of a positive integer N. We have two components. So, the first is what's called the "Base case" where if N is small, we just return a value. The second one is called the "Reduction step." If we assume that the function works for smaller values of its argument, figure out a way to use the function to compute the return value for N argument that we have. Here's an example of this integer binary conversion program. So, this is going to be a static method or function that returns a string and takes an integer N as argument, and we'll assume that that's a positive integer and bigger or equal to 1. So, if N = 1, then we just return the string 1. So, that's the base case. Otherwise, the way we can solve this problem is to use the function to convert the integer N/2 to a string and then add on to that the last bit. Now, this automatically gets converted to a string because that pluses and concatenation operator between two strings in Java. So, we convert the N/2 to a string, and then we append the last bit. That's it, that's the whole program. So, if we take a main that takes an integer from the command line and then call convert N, that's our function, and very simple function for converting an integer from decimal of minor decimal that we typed into binary or whatever internal representation the computer wants to use. So, that's our example. We'll look at this one in more detail in just a second. So, one of the interesting things about this is just convincing ourselves that the method is correct. That's really what recursion buys us, is a reasonable way to convince ourself that the method is correct. The technique that we use is called "Mathematical induction," and that's the technique for mathematical proofs that parallels our recursion. So, let's look in more detail. You learned mathematical induction somewhere in school. We'll quickly review it. So, to prove a statement involving an integer N, first, you prove a base case. You prove it for some specific values of N. Then the induction step is to assume that it's all true for all integers less than N and use that fact to prove it for N. So, here's a simple example. Some of the first odd integers is N squared. So, that's true for N = 1 and 1 = 1. So, now we assume that we'll do the induction step. The nth odd integer is 2N-1. So, we're going to let that sum of the first N integers be this variable TsubN, 1 + 3 + 5, the first N odd integers. So, we're going to assume that TsubN - 1 = N - 1 squared. If we do that and then we add 2N - 1, then it's just a little simple algebra to prove that it's N squared. That's mathematical induction to show that the sum of first N odd integers is N squared. Now here's an alternate graphic proof if you still don't believe it. But really, the point is for you to check this idea of proving a statement by induction. Prove it for the base case, and then in the induction step, assume it's true for small N, and then use that to prove it for N. So, that's how we can prove a recursive program correct. So, that's what we laid out for recursion, and this is what we layout for induction. They're entirely parallel, intentionally. So, if we have a recursive program like our program that's supposed to convert an integer N to binary string that represents it, we can do the correctness proof by induction. So, we want to prove that convert computes the binary representation of N. To do that, it works for N = 1 and returns the string 1. Then the induction step is if N is even and you add a 0 to the end of it, N is 2(N/2). So, that's going to work. If it works for N/2, it's going to work for N. If it's odd, you add a 1. Again, if it works for N/2, it's going to work for N. That's a proof that this programs works. Now we're not going to always do formal induction proofs, but we will informally think about convincing ourselves that recursive programs work in this way. Now, what actually happens when we make a recursive call? That's also worth thinking about. Now we're not going to go into this in detail, but still, the idea is that in modern programming systems, any function call involves certain mechanics. In a sense, because those mechanics exist, we really get recursion for free. So let's look at that in a little more detail. So, any time a function is called, the system has to save the values of all variables and where the call came from somewhere. That's what we assumed to happen when a function gets called. When we get back from the function, everything should be the same. Then, it has to initialize the values of the argument variables, transfer control to the function, and then when the function is done, it has to restore all that environment and assign the return value where the function call was and then transfer control back to the calling code. Because systems do these operations for any function call, it doesn't matter if the function calls itself. So, let's look at what happens with convert. We call it with 26. So, calling convert with an argument of 26 results in calling itself with an argument of 13. So, that's a call for 13. When 13 remembers, it has to remember to substitute the string that it gets for that call of 13 and then append the zero. That's the way that the function call mechanism works. It doesn't really matter at each level that the function is calling itself. So, 13 calls 6, 6 calls 3, 3 calls 1. Then 1 finally returns to the 1. What's that mean? That means where the call convert 1 came, we put the value 1. So, now we're going to return 1 + 1, so that's the binary representation of 3, which is just 1, 1, and that's what we put for convert three. So 6, we concatenate those two strings, replace the call to convert 6 with that, and we get the minor representation of 6 there. Then we append to 1. That's the minor station of 13, and then finally put that together to get the binary representation of 26 which gets printed out. We get recursion for free in modern systems. Now, there's plenty of possible bugs with recursion that it's worthwhile to point out right at the beginning. Debugging recursive programs can be a bit of a challenge. There really are loops, and we'll talk about that some more. So, one bad thing to do would be to not have a base case. So, a bad of N calls bad of N - one. What's going to happen here is that it's just going to keep calling itself, we'll talk about that what happens just a second. Or a similar thing is if you maybe you have a base case, but you didn't really check your program fully to make sure that it has a convergence guarantee. So, in this case, if you call this function with N = 2, then it's going to call itself with N = 2 again, and get in the loop calling itself again. This happens with induction too, but with recursion the computers checking what we do. Both of these cases lead to infinite recursive loops, and that's bad things, bad news. In old systems it would really create havoc, and it would be difficult to stop the computer from dealing with this, and so you have to check it on your system just like when we looked with what happens when you have an infinite loop, and a for, or while, you get an infinite loop with situations like this, and you better know how to stop it. Nowadays, you don't have to pull the plug, maybe control C will do the job for you. We make that requirement that we have to have a convergence guarantee, just so we can convince ourselves that it stops. It's not so clear always what goes on. This is a classic example called the "Collapse sequence." This is very simply defined integer sequence. It's a function of N. If N is 1, stop. If N is even, divide by 2. If N is odd, multiply by 3 and add 1. That's what the sequence looks like. So, one thing about this is that it's not always getting smaller. If you multiply by 3 and add 1, that's definitely not getting smaller. Now, you could write a recursive program for printing out the collapse sequence. If N = 1, you're done. If it's even, you call it for N/2, and otherwise you call it for 3N + 1. So, it's a fine recursive programs. Now the question is, can we convince ourselves whether it stops or not? Do it for 7 and you get that sequence. The amazing fact is that actually, nobody knows whether this function terminates for all values of N. A lot of people have spent a lot of time. Mathematicians trying to understand whether or not this thing always terminates. So, simple program with recursion can lead to behavior that nobody even knows what it is. Usually, what we do is ensure termination by only calling for small N, and we're certainly going to do that for all the recursive programs that we write in this course. That's the Internet cartoon now related to the Collatz Conjecture. So, eventually, your friends will stop calling to see if you want to hang out. You can try it for a relatively small value and get pretty long sequence.