Let's continue our discussion of functions. Next, I'm going to tell you about a kind of function that returns a simple true or false value. We can return anything we like from a function. We can return strings or numbers. But sometimes we just want to know if something is true or not. We call that a Boolean function. So that's the kind of function you might want to use, say, at the beginning of an if statement to test if something is true. You might want to just call a function to test it. So as an example, we're going to write a short program that checks to see if a DNA sequence contains an in-frame stop codon. So what does that mean? If you know your genetic code, you know that the way that we translate DNA into proteins is we read bases, the DNA letters themselves, three at a time. And each triplet, or codon, corresponds to one of the 20 amino acids, except for three special triplets that are called stop codons. So sometimes we want to look at a sequence and see if there's a stop codon in that sequence. And when we say in frame, we mean starting from either position zero, one, or two. Those are our three frames. Since there are three bases per codon, we can start at any of three different positions and get different codons throughout our sequence. So we want to walk through our sequence three bases at a time, asking for each triplet or codon: is this a stop codon? So let's write a program that does that. We'll call our program has_stop, and we'll save it as has_stop.py, because it's a Python program. So in our file, we'll of course start with our usual header to say that this is a Python program. We'll define the function that checks for a stop codon. And then we'll get our input. Remember, we use raw_input for Python 2 and the function input for Python 3. So we're going to ask the user to enter the DNA sequence. And of course, later we'll show how you might read that sequence from a file instead.
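To make the idea of reading frames concrete, here is a minimal sketch that lists the codons read from each of the three frames. The helper name codons_in_frame and the example sequence are made up for illustration; they are not part of the program described in the lecture.

```python
# Hypothetical helper (not from the lecture): list the complete codons
# read from a DNA string starting at a given frame (0, 1, or 2).
def codons_in_frame(dna, frame):
    # Step through the sequence three bases at a time. Stopping at
    # len(dna) - 2 keeps only complete triplets.
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]

dna = "atgagcggccggct"  # made-up example sequence
for frame in (0, 1, 2):
    print(frame, codons_in_frame(dna, frame))
```

Each starting position gives a different set of triplets, which is why a stop codon can be present in one frame and absent in another.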
And then we're going to check, in an if statement, if the DNA has a stop codon. And if it does, we'll print out a message saying the input sequence has a stop codon. So we write if has_stop_codon(dna), and that's doing a Boolean test on our DNA sequence. And if that's true, we'll print out a statement saying that the sequence has an in-frame stop codon. Otherwise, we'll print out that it does not. All right, so we haven't written our actual function yet, but that's just an example of how we'd use it. Let's go on and look at how we'd write this function. So we're going to define our function using the special def keyword. The function's called has_stop_codon, so we write that in. It has one argument, dna, which is a string with a DNA sequence in it. And then we start defining our function using the colon. We write out a documentation string in quotes. So we'll say this function checks if a given DNA sequence has an in-frame stop codon. All right, so by default, we want to say that it does not. So we'll define the return value, which is going to be called stop_codon_found, and we'll set that to False. If we don't find one, it will stay False. If we do find one, we're going to change that value to True, and we'll return that True value. All right, there are three stop codons. In almost all living organisms these stop codons are the triplets tga, tag, and taa, so we'll put those in a list called stop_codons. There are a couple of strange bacteria that don't use the same genetic code, but we'll just ignore that for now. So, let's look through our DNA sequence. We'll do that with a for loop that walks through our DNA sequence, starting from the beginning and stepping through three bases at a time. So we say for i in range, from zero to the length of our DNA sequence, which we get by calling the built-in len function, and we'll step through three at a time. So that's what this syntax shows you: starting at position zero, going until the end of the sequence.
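The stepping just described can be checked by listing the indices that range produces. This is only a small sketch; the 14-base sequence is a made-up example.

```python
dna = "atgagcggccggct"  # made-up 14-base example sequence

# range(start, stop, step): start at 0, stop before len(dna), step by 3.
starts = list(range(0, len(dna), 3))
print(starts)  # [0, 3, 6, 9, 12]
```

The last index, 12, begins an incomplete triplet of only two bases. That is harmless here, since a two-letter string can never match a three-letter stop codon.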
The end is the length of the sequence, and we increment i by three each time. So now we know where we're going to start. We start at position zero. So on the first iteration of the loop, we'll look at the first position in our sequence, or the zero position, and we'll get the codon. So the codon is going to be dna, that's our string, from position i to i + 3. On the first iteration, that'll be 0 to 3, which, remember, doesn't include the last position. So it'll be 0, 1, and 2, which are the first three positions. And we're going to call the lower method on our string, too, just to make sure that if someone has given us a DNA sequence with a mixture of upper and lowercase letters, or maybe all uppercase, we convert it to lowercase, so we can compare it to our little table of stop codons. So then we just ask. Now we've got our codon, the one we were looking at, and we ask if that codon appears in our little list called stop_codons, which, remember, has the three stop codons in it. And if it does, then we've found a stop codon. So we reset our variable, stop_codon_found, to be True. And now we're done; we don't really have to keep looking, because this is a Boolean function. We just want to know whether there are any stop codons in frame. So as soon as we find one we can stop, so we'll call break, which means break out of the loop that we're in. We don't have to keep walking through the rest of the sequence, and we'll just return stop_codon_found. So that's our function. Now, if we don't find any stop codons, the stop_codon_found variable will retain the value False. We'll never reset it, so the function will return False. So sometimes you may want to define a default value for a function's argument. We just gave an example where we started at position zero, the very first base of the sequence. We want that to be our default. But we may want to allow the user to start at position one or two.
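Putting the steps above together, the function might look like the following sketch. It follows the description given so far, but the exact wording of the printed messages and the example sequence are assumptions, and the sequence is hard-coded here rather than read from the user.

```python
def has_stop_codon(dna):
    """This function checks if a given DNA sequence has an in-frame stop codon."""
    stop_codon_found = False              # default answer: no stop codon
    stop_codons = ["tga", "tag", "taa"]
    # Walk through the sequence three bases at a time, starting at 0.
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3].lower()      # lowercase so "TGA" matches "tga"
        if codon in stop_codons:
            stop_codon_found = True
            break                         # one hit is enough; leave the loop
    return stop_codon_found

dna = "atgtaaggc"  # made-up sequence; the lecture reads it from the user
if has_stop_codon(dna):
    print("Input sequence has an in frame stop codon.")
else:
    print("Input sequence has no in frame stop codons.")
```

With this made-up sequence, the second triplet is taa, so the first message is printed.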
That's because there are three possible reading frames, as we call them, in a DNA sequence. So let's make a slight change to our function and allow the user to specify what reading frame it will be in. So, here we've got pretty much the same function. Almost all the lines are the same. But notice in our def statement we wrote has_stop_codon(dna, frame). So now there are two variables, or arguments, to the function. One is the DNA sequence, as before, and the other is the reading frame. And the other change to our function here is that when we loop through our DNA sequence, instead of going from 0 to the end of the sequence, we go from frame to the end of the sequence. Now if we pass in the value 0, this will be exactly the same as what we did before. But we might want to pass in the value 1 or 2, and then it will start at position 1 or 2. Otherwise it'll work just like before. So let's try this out. Let's set our DNA sequence to be a short DNA sequence like the one I'm showing here. Notice there is a stop codon in it, but at the second position. There's a TGA. So we call has_stop_codon on dna, starting in frame zero. It's going to say False. Oh, wait a minute. There is a stop codon in there. Why did it return False? That's because we started at position zero. We looked at the first triplet, which is atg. And then we moved over by three bases and looked at the second triplet, which is agc. So we didn't find any stop codons in that reading frame. Suppose we instead start at position one, in reading frame one. So we'll call has_stop_codon with dna and give it the argument one. Now we'll find that TGA that starts in the second position, which, remember, is at index one. Now, we'd like the has_stop_codon function to check for in-frame stop codons using frame zero by default. So we'd like to call it like this: has_stop_codon(dna). We'd like to be able to do that without specifying the second argument, and just assume the frame is zero.
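The two-argument version walked through above might be sketched like this. The example sequence is an assumption, chosen only so that it begins with atg and agc as described; the sequence shown in the lecture is not reproduced in this transcript.

```python
def has_stop_codon(dna, frame):
    """This function checks if a given DNA sequence has an in-frame stop codon."""
    stop_codon_found = False
    stop_codons = ["tga", "tag", "taa"]
    # The only change from before: start at `frame` instead of 0.
    for i in range(frame, len(dna), 3):
        codon = dna[i:i + 3].lower()
        if codon in stop_codons:
            stop_codon_found = True
            break
    return stop_codon_found

dna = "atgagcggccggct"  # assumed example: tga starts at index 1
print(has_stop_codon(dna, 0))  # False: frame 0 reads atg, agc, ggc, cgg, ct
print(has_stop_codon(dna, 1))  # True: frame 1 starts with tga
```

With this definition, calling has_stop_codon(dna) with a single argument raises a TypeError, because frame has no default value yet.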
Okay, to use zero as the default value of a function's parameter, we have to specify that when we define the function. So here I've given you the same function again. But now when I define the function, I want to say the reading frame equals zero by default. And I do that by typing def has_stop_codon and listing the two variables: dna, and then I write frame=0, which means that frame will get the value zero if the user doesn't provide it. So everything is the same as before, only frame has a default value. So now let's try to execute the function using this DNA sequence here, which is slightly different from the one before. Notice this DNA sequence starts with a-a-a, and then the second codon, t-g-a, is the stop codon. So we can call has_stop_codon on dna, and it will say True, because there is a stop codon in reading frame zero. But if we call has_stop_codon in frame one, which we do by specifying that second argument, because one is not the default value, then it returns False.
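A sketch of the version with a default reading frame follows. As before, the example sequence is an assumption: it was chosen to start with aaa followed by tga, matching the description, but it is not necessarily the sequence shown in the lecture.

```python
def has_stop_codon(dna, frame=0):
    """This function checks if a given DNA sequence has an in-frame stop codon."""
    stop_codon_found = False
    stop_codons = ["tga", "tag", "taa"]
    # frame is 0 unless the caller passes a second argument.
    for i in range(frame, len(dna), 3):
        codon = dna[i:i + 3].lower()
        if codon in stop_codons:
            stop_codon_found = True
            break
    return stop_codon_found

dna = "aaatgagcggccggct"  # assumed example: aaa, then the stop codon tga
print(has_stop_codon(dna))     # True: frame 0 reads aaa, tga, ...
print(has_stop_codon(dna, 1))  # False: frame 1 reads aat, gag, cgg, ccg, gct
```

The one-argument call and has_stop_codon(dna, 0) behave identically, so existing code that passes the frame explicitly keeps working.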