In this lecture, I will gradually introduce you to writing your own programs. You'll learn about basic data types and the operations that you can apply to them. By the end of this lecture you'll be able to write a Python program to compute a DNA sequence's GC percentage.

So let's start the Python interpreter. Just type python in a terminal window, and now let's write our first program: print hello world. What happens? The Python interpreter says "Hello world!" back to you.

First I'll talk about two basic data types, numbers and strings. You can use Python as a calculator. For instance, let's do a simple mathematical operation, an addition: type 5+5 and the Python interpreter writes back 10. Then let's try something more complicated, 10.5-2*3, and Python immediately gives the answer, 4.5. Let's raise something to a power, 10 to the power of 2. Notice here the two stars that you use to compute powers. This is obviously 100. You can also do a floor division, which just discards the fractional part of the division. Or you can compute the remainder of the division by using the percent operator, so 17 % 3 gives 2. Now, the order of operations in Python is the same as in math, so multiplication takes precedence over addition or subtraction. The answer to 5 * 3 + 2 will be 17, of course.

Numbers have different types. You can have integer numbers, and to find out the type of a number, if you are unsure, you can use the built-in function type. type(5) will give you the answer type 'int', so this is an integer. You can also have real numbers, which also have fractional parts. So type(3.5) will return type 'float'. In computing, we call real numbers floats. You can also do division. 12/5, what happened here? Why did the Python interpreter say 2? Well, this would only happen in Python 2.
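The calculator examples so far can be collected into one short script. This is only a sketch, assuming Python 3 (the interpreter in the lecture behaves like Python 2 for 12/5); the variable names are made up for illustration, and `//` is the floor-division operator.

```python
# Minimal sketch, assuming Python 3. Variable names are illustrative only.
total = 5 + 5            # addition
mixed = 10.5 - 2 * 3     # multiplication is done before subtraction
power = 10 ** 2          # two stars raise to a power
floor_div = 17 // 3      # floor division discards the fractional part
remainder = 17 % 3       # percent gives the remainder of the division
precedence = 5 * 3 + 2   # same order of operations as in math

true_div = 12 / 5        # in Python 3, / keeps the fractional part
int_div = 12 // 5        # the value Python 2 printed for 12/5

print(total, mixed, power, floor_div, remainder, precedence)
print(type(5).__name__, type(3.5).__name__)   # int float
print(true_div, int_div)
```

Running it under Python 3 shows the difference between `/` and `//` directly, which is the point the division example is making.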
In Python 3, 12/5 would actually give the real result, but with Python 2 we have to tell the computer that 12 is a float, that is, a real number that we want to divide by 5 and not an integer. So now the correct answer is 2.4. You can do this in a different way, for instance by specifying that 12 is a real number by adding a .0 after the number 12 and then writing the division. The answer is also correct in this case. You can also have complex numbers in Python, but these are rarely used. Anyway, it's good to know that if you need them, you can use them. So type(3+2j), where j in Python marks the imaginary part, will give you a type of 'complex'. You can do all sorts of operations with complex numbers, like raising to the power of 2, and Python will do this easily for you.

So, let's learn about strings now. Steven, what can you tell us about strings?

>> Strings are a really useful data structure in Python, especially when we're dealing with biological data sets. DNA sequences are really nothing but strings of the letters A, C, G and T. Protein sequences are also strings of letters, with a 20-letter alphabet instead of a 4-letter alphabet. So if you're doing Python programming for biological sequence analysis, you're going to use a lot of strings. A string in Python is just a series of letters surrounded by quotes. You can type a single quote, atg, and another single quote, and Python will interpret that for you and return the string 'atg'. You can also use double quotes. So we can do the exact same thing: we can type "atg", and Python again will return the same string. However, suppose we make a longer phrase like this: This is a codon, isn't it? and put single quotes around it. If you give that to the interpreter, you're going to get a strange thing: an error message, invalid syntax, and you'll see a tiny caret, a little up-arrow, under the letter t in isn't.
Well, what happened there is that, reading it as English, you would just type isn't it as you did, maybe not thinking that the single quote you're putting there actually matches the single quote used at the beginning of the line. Single quotes have to come in pairs and double quotes have to come in pairs. As soon as you put in the single quote after the letters i-s-n, that finished off the first string, and then you typed the letter t without anything in front of it, and Python doesn't like that. So you can't do that. If you want to have something like a single quote inside your string, you can use double quotes and type your string as "This is a codon, isn't it?" with double quotes around it. Python will read along your string until it gets to the second double quote before it closes things off. So that's perfectly legal, and Python will return the whole string for you. On the other hand, and this is true not just for quotes but for other special characters, you can use the backslash character as an escape character, which means it tells Python: look, whatever is coming next, treat it as a character and don't try to interpret it. So we could type that same string in single quotes, 'This is a codon, isn\'t it?', if we put a backslash in front of the single quote in the word isn't. And that would work too.

Strings can also span multiple lines. It's often the case with DNA sequences that they are very long. We certainly can't put them on one line. So if we want to have them spanning multiple lines, we can use this special triple quote trick, where we type in triple quotes and then we start typing our string and just keep typing across multiple lines. Finally we put in another triple quote, and that closes off the string. So here we could type in a DNA sequence, and then a second DNA sequence, hitting Return in between.
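The quoting rules above can be tried out in a few lines. This is a minimal sketch, assuming Python 3; the variable names and the two short DNA sequences are made up for illustration and are not the ones typed in the lecture.

```python
# Minimal sketch, assuming Python 3. Names and sequences are illustrative only.

# A single quote inside the text: wrap the string in double quotes...
double_quoted = "This is a codon, isn't it?"
# ...or keep single quotes and escape the inner quote with a backslash.
escaped = 'This is a codon, isn\'t it?'
same = double_quoted == escaped   # both spellings give the same string

# Triple quotes let a string span several lines.
dna = """
atgacg
ggtcat
"""

print(same)
print(repr(dna))          # repr shows each line break as \n
print(dna.count("\n"))    # number of newline characters in the string
```

Using `repr` here mimics what the interpreter shows when a string is typed at the prompt, so the line breaks become visible.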
And Python will interpret this as one single string. You'll notice that the string it returns at the bottom starts off with \n, and you see several \n's in the middle of the string. That's a special character that signifies a new line. So what you actually did there, if you typed that whole string in, was say to Python: here's a string, and in this string I not only have letters, but I also have this special character, the newline character, that's part of my string. So that's also a character, and you can include it in strings as well. That \n is an example of an escape character. Escape is a term we use in computer science to mean that we're doing something special for a particular character, and in Python, backslash is our special escape character. So if we want to indicate a special character, one that we can't normally print or type, we use backslash for several of these. For example, \n is a newline character. \t is a tab character. If you want to type the backslash itself you can type \\, because obviously you have to have some way to type in a backslash. And if you wanted to stick a double quote, say, in the middle of your string, you could do \". So now I'm going to let Ella tell you more about how you might print strings out a little more nicely.

>> If you type strings at the prompt in Python, you saw that the Python interpreter gives them back to you: it prints them. But sometimes they might not be printed nicely. As you said, sometimes they can contain escape characters like \n. But Python has a built-in function called print that you can use so that these strings are printed more nicely. Let's see what it does. Let's take the same string spanning multiple lines that we typed at the prompt before, and enclose it in a print function. You can also use a backslash after the first triple quote. This tells Python not to put a new line there.
So we want the printout to start from the first sequence, at the head of dna1, and what happens? Now the string looks really nice, no more \n's.

So let's talk a little bit about basic string operators. We saw that with numbers you can do additions, subtractions, raising to a power, and all kinds of other operations. But what can you do with strings? We can concatenate strings with the plus sign, we can replicate strings, or copy them, with the multiplication sign, and we can test whether a certain character is or is not in a given string by using the keywords in and not in. Let's take a few examples. Let's add two strings and see what Python gives us. Let's add the string atg and another string made up of a's, t's, g's and c's, and we can see that Python puts them all together. Let's try copying a string. In this case, we want to copy the string atg three times, and Python will spit it back to us three times. Or we can test if a string is contained in another string, and Python will say yes, this is true, in the case of atg contained in the bigger string starting with atg. Or sometimes a character is not included in the string. In this case, let's test if the character n is included in a string containing only a's, t's, and g's. Of course this is false.
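The print function and the string operators just described can be sketched together as follows. This assumes Python 3, where print is called with parentheses; the sequences and variable names are made-up examples, not the ones shown in the lecture.

```python
# Minimal sketch, assuming Python 3. Sequences and names are illustrative only.

# The backslash right after the opening triple quote suppresses the
# leading newline, so the printed text starts on the first sequence line.
dna = """\
atgacgggat
ccgtaa
"""
print(dna)   # line breaks are shown as real new lines, not as \n

joined = "atg" + "ctgaac"            # + concatenates two strings
tripled = "atg" * 3                  # * repeats a string
has_atg = "atg" in "atggccggcgta"    # substring test
has_n = "n" in "atgtgtgtgtatg"       # no n in this string
lacks_n = "n" not in "atgtgtgtgtatg"

print(joined)
print(tripled)
print(has_atg, has_n, lacks_n)
```

The `in` and `not in` tests return the Boolean values True and False, which is what the interpreter prints for the last two examples in the lecture.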