[MUSIC] In a lot of the examples we've been looking at, both in our assignments and in some of the text on our slides, we've been using format strings that we haven't really explained very well. In this lecture, I'd like to explain how to make those format strings. They enable us to create text strings using variables whose values we don't know when the program is being written, but which we find out at run time. We might want to use these strings as a label on a button, for example, or as output to the user. A lot of the time we don't know what the value will be, and we'd like to construct the string nonetheless. So at a high level, what we're going to do is explain how to construct text using variables and formatting rules. We'll start with the syntax and the paradigm of the C language, and in a future lecture we'll talk about moving that syntax into the object-oriented way of doing things. That way you know where it came from, you'll understand what's happening when you see the C-based syntax in Objective-C, and you'll understand how Objective-C got to where it is, with respect to the NSString object in particular. So do you remember seeing code that looked a little bit like this? We had a timeOfDay variable of type pointer to char being assigned a string like "morning", and then a printf statement with the format string "Good %s\n" followed by timeOfDay. This would output something like: Good morning. I'd like to walk through each of the components of this code example in turn, to make sure you understand what's happening here and to explain for the first time what's happening with the format strings.
This will provide a good foundation for constructing your own strings in your code, and in particular a better foundation for doing it in an object-oriented way going forward. So what's going on here exactly? Let's start with the first piece, which is the definition of a variable called timeOfDay. Its type is char *, a pointer to a character. It's interesting because it's a pointer to a single character, and this is where we get into a particular convention of C. The name of the variable is timeOfDay, and then there's this piece over on the right, which ends up being shorthand for allocating and initializing some memory. We use double quotes to enclose what we call a string, meaning a string of characters. That double-quote notation tells the compiler to set aside some memory in the same place where global variables live. In this case, it sets aside just enough memory for the letters m, o, r, n, i, n, g and then the number 0. So writing down the text "morning" is an indication to the compiler to reserve enough memory for the word morning plus a 0 at the end. What that syntax gives back is the address of that memory, and we set the timeOfDay variable equal to that address. We demonstrate that with our notation over on the right side of the slide. The memory holding morning has an address, and timeOfDay is assigned that address. I don't specify the address; I just indicate that it's some address and that the two are the same. We say timeOfDay points to the string morning. Using the box-and-line notation, it's sometimes a little easier to draw that same thing like this.
The variable timeOfDay is a pointer, and when we represent it visually we draw an arrow to the memory that it points to. So timeOfDay points to a string, morning. As we said, the double quotes are a special way of allocating memory. They reserve a permanent chunk of memory outside of the stack, just like global variables, and initialize it with those characters. Notably, they put a 0 at the end, even though that zero doesn't show up between the double quotes. That's just a convention. Let's make sure we understand where these variables live as a result. The pointer timeOfDay is actually on the stack. So when the stack frame that holds the local variable timeOfDay goes away, the pointer goes away with it. But morning stays in memory for as long as the program runs, because it was set aside, zero included, during the compile process. Now, timeOfDay is a pointer to a single character, a single char, but morning consists of seven characters and then this hidden zero. This reflects the convention in the C language that a string is a sequence of characters in memory that you follow by adding one to the pointer until you get to a zero. It's a very low-level way of thinking about a string. It's tightly tied to the specifics of memory, and it requires that zero to actually be there. If the zero isn't there and you try to output this as a string, you'll just keep reading characters through memory, which could contain random stuff, until you happen to get to a zero. It's very low-level and very implementation-specific, with a lot of detail about how memory is laid out. That's a characteristic of the C language, and it's one that goes away when we move to the object-oriented version. So the character pointer timeOfDay actually just points to an m, and by convention we continue reading memory until we get to a zero.
The string of characters up to the zero, then, is the string. There's never actually a type called String in C. Overall, the double-quote assignment is shorthand for four things: reserving enough memory for the letters, putting those letters into that memory, adding a zero after them, and setting timeOfDay equal to the location of that memory. The second line is a function call, and here's where we see formatting happen with dynamic data. In this case we happen to know what timeOfDay is, but in general, at the line that calls printf, which stands for formatted printing, we don't necessarily know what timeOfDay is equal to. We'd still like to format a string that includes its value. The function we're calling is printf, and there are two parameters. The first parameter is also a string, just like morning. It's bounded by two double quotes, and in the middle are a bunch of characters: Good %s\n. These are just characters. They form a string and they're passed as the first parameter. So just as with morning, this syntax sets aside some space in memory, initializes it with these characters, puts a zero at the end, and yields a pointer. The first parameter to printf is a pointer to char, and the second parameter is also a pointer to char. It's just that we happened to pass a variable name instead of putting "morning" there directly. This first parameter is called the format string, and the printf function interprets it in a very specific way. It's a string that has tokens in it, and these tokens get replaced by data. In this case, the %s token gets replaced by whatever timeOfDay points to.
So you'll see that in the output of this line there's no %s; those two characters have been replaced with seven characters. The tokens describe the types of the data coming in the subsequent parameters to printf. In this case, %s means string, or more specifically a pointer to char, which is interpreted as a sequence of characters in memory ending in zero that should be copied into the string that starts with Good. This is basically how you create strings from data in a program: you use variables and you use format tokens. The format token you use depends on the type of the data that you're giving to the format string. Here are some examples. %d is what you put in your format string if you want to insert an integer; printf translates that number into characters. %f is the token for a double or other floating-point number. %p is the token you use if you want the value of a pointer itself, rather than what the pointer points at. So in the example above, if instead of Good %s we had written Good %p, printf wouldn't have copied whatever timeOfDay points to. It would instead have taken the actual address stored in timeOfDay and put that into the string. %s knows to follow the pointer and copy whatever a char * points to, up until that final zero. There are some others. %ld is the code for putting a long integer into your string. %5d is the %d we saw before, with a 5 between the % and the d as a formatting parameter. It adds formatting detail to the token that's already there. It's still an int, because it ends with a d, but when printf translates that int into characters, it makes sure the int takes up at least five spaces.
And if the int is small enough that it doesn't need five spaces, printf adds space characters in front of it, so there'll be a larger gap between Good and the integer. If you put 05 instead of 5, giving %05d, that's an instruction to translate the int into your string and make sure it takes up at least five spaces, but to pad it with zeroes instead of spaces. The example at the bottom is a screen capture of what happens when you translate the integer 1 into a string using the format token %05d. It makes sure the space reserved for that integer is five characters wide, and where it needs extra characters it adds zeros instead of spaces, so you end up with 00001. There are other ones; it goes on and on. If you put -5 in front of the d, giving %-5d, that's still an integer because of the d, and it still takes up at least five character spaces, padded with space characters if needed. But instead of aligning the 1 to the right side, it aligns the 1 to the left side and puts the spaces behind it. Because the % character in a format string indicates that what follows is a token to be interpreted as a type indicator, such as %d for an integer, you have to write two percents if you actually want a % character in your string. So %% in your format string is output as a single %, to disambiguate it from the times when you mean something like %d. If you actually wanted the characters %d to be output, you would write %%d, and those first two percents would be interpreted as a percent character. If you want an integer, you write %d, and that %d is replaced by the integer that follows. A double quote poses a similar problem in a format string, because a double quote usually marks the beginning or end of the string.
But if you actually want to output a double quote, you put a backslash before it. These are called escape characters, because they let you escape the usual meaning of the character. Finally, if all of this is overwhelming, don't worry; no one really memorizes these. There are a few common cases, like %d and %s and maybe %f, that people who use format strings know. But when you want a specific kind of formatting, people typically go to the documentation to remember exactly what the characters and the syntax are for these format tokens. If you're really hard-core, you can find all of the details of the format syntax in the formal IEEE documentation, which is located at this URL here. That's an official standard for how you specify these tokens. But usually a web search for something like printf format tokens will return someone's quick reference that's good enough, and you can always just test it out and see if it does what you want. In the examples we've looked at so far in this lecture, there's only been one token in our format string. You may want to format a string that has multiple tokens. If you do, you have to make sure there's one parameter of the correct type for each format token in your string, so multiple tokens need multiple parameters. Here's a slightly more complicated example that uses all the different things we've talked about so far. The first thing we do is define a number of variables: a pointer to char, a long integer, a floating-point number, a character, and an integer. The timeOfDay string is assigned the value morning, with the memory set aside, the characters put in place, the whole deal. The long integer a is assigned this value that starts with 134. The floating-point number b is assigned a value of 2.3.
The character c is assigned the letter y. Note that in this case we use single quotes, and we declare it as a character rather than a pointer to a character. That communicates to the language that we're not trying to represent a string, so it's not zero terminated; it's just a single character. And finally we have the integer d, whose value is 65. Then, below, our printf statements use some formatting. In the first one, our format string is Good %s, you have %ld credits left\n. That first %s indicates that there had better be a pointer to char that follows, and that's timeOfDay. Good. So timeOfDay gets interpreted as a string and gets inserted into our output. Then we have %ld, which means there had better be a long integer parameter after that, and there we pass in a, so that's good. And that \n is an escape sequence that says go to the next line rather than continuing on the same line when you output. So when we look at our output, our first line says Good morning. Okay, good: our %s was replaced by what timeOfDay pointed to. Then comma, you have 1343489823. Okay, good: our %ld was replaced by the value of a. Then credits left and the new line. Okay, good. Our second example says Your score is %f and your middle initial is \"%c\"\n. The first token is hopefully clear: %f is going to be replaced by a floating-point number, which had better be our first parameter after the format string, and it's b. Your middle initial is... okay, let's parse this. Backslash quote is an escape sequence saying that we actually want a double quote in the output. %c is then interpreted as a character to be put in there, so there had better be a parameter which is a character. That's c, the third parameter to this printf statement, and it gets substituted there. Then we have another \", so another double quote will be printed. Then we have \n, so it will go to a new line.
And then finally we have a quote, which indicates the end of the string. When we look at the output, we get what we specified. Your score is 2.300000: okay, so our %f was replaced by the value of b, and the formatting adds zeros at the end. There are ways of specifying with %f how you want the floating-point number translated into a string, which you can look up in the documentation if you don't like those zeros, for example. And your middle initial is "y". Great, that's what we wanted: an actual double quote in the output, then the value of c, which is a y, then another double quote, and then we went to the next line. Finally, in the last line, we see You have %d%% left to go\n. As you read that format string, the first token you get is %d. That's interpreted as a format token for an integer, so there had better be an integer as a parameter to printf, and d is our integer parameter. After the %d we see %%, which is replaced by a single percent character. Then we have left to go and the newline character. So in our output we see You have 65, that's the value of d, then a percent character, then left to go. It did what we wanted it to do. In summary, format strings are a structured method for composing text in a program from data that you may not know before you start your program, because the data changes as the program runs and you want to treat it abstractly as data. There are special tokens for each type of data that you might like to incorporate into your string, things like %d and %f. Each token can additionally be modified to format the data, and those modifiers typically go between the percent and the character that specifies the type, so %5d, %2f, things like that. What we've presented in this lecture is the way of doing this within the C language paradigm.
When we move to an object-oriented paradigm, in particular the NSString object, we'll see that NSString uses a lot of the same formatting tokens, and the same assignment syntax and methodology, that we see in printf. So we'll see how the language has evolved from its C roots. As you're programming, you may still see a lot of the C syntax that we described in this lecture, so it's worth talking about. But the common paradigm now, and the best practice, is to use an NSString and the format tokens that come with it. So that should be the next thing that we talk about. Thank you very much. [MUSIC]