So now we move from sort of the SQL and the contracts we need to talk to a single table, to linking multiple tables together, and this is where the real power comes, and this is where the complexity comes. So up until now, it should be real simple. When you start a company, we're going to kinda pretend we're going to start a small company to build an application to manage music. When you start a company, you're going to build an application, lots of applications need databases. And you need to design the database for your application. And often this is very much a team effort with a lot of different people with skills. Some people worry about the user, end user capabilities, some people worry about the performance, etc. And so database design, what I'll show you coming here is just the beginnings of database design, but the basics are very powerful. And it's something where smart people should take time in database design. And so it's a very collaborative process and if you ultimately look at the database design for applications, we draw these crazy pictures, right? And in these crazy pictures we are capturing, this is basically multiple tables. Table, table, table, table, and this particular database is kind of about an events, like, you know, reserving the picnic shelter at a park, right? What times, who's going to be there, who reserved it? All these other things. And what we're starting to show is that how these tables are connected together. So each of these little arrows, so this is a table, and these are the columns in the table, and then some of the columns are special columns that represent relationships with other tables. And we're not going to go into all the detail, you can eventually go into some of the detail where you see what this little arrow means and that means one and this means many etc., etc.. And this there's all this fancy stuff and so we'll talk about this eventually. But this is sort of like what we're going for, we're going for a picture of how we want our application's data to be stored in the database. And this then becomes our schema and it's the contract. So, we made one thing with two, we made one table with two columns, and that was a schema and a contract, but now we're going to make multiple tables with many columns and some columns that are specifically there to connect from one to another. So if you're working in a large application, you might walk in and see a picture like this on the wall. And you take a look, whoa, that's really important, that must be really complex! And it, for this application, it probably represents years of clever engineering to make sure that the application runs well. But this is basically how this, I don't know what it is, open mecker medical records system, I just kind of borrowed the picture. This is how the data is stored in its database. And these things can be very complex and the project I work on called Sakai, it's probably four times bigger than this. But if you zoom in on it, which you can't, but it's just a table and some columns and some connections. And then, oh yeah, there's a table, well that one's kind of complex. Here's a table, some columns, and two connections. So, yes, it looks complex on the surface, but ultimately we're just trying to figure out which tables we are going to make, what we're going to put in those tables, and then how we're going to connect the tables together. And the connections are the thing that make these things so powerful. We could just put all these data in one file, but then this thing would run like terribly slow. And so the trade-off of thinking through how your data is going to look is that when you're done, it's fast. A lot of times we don't worry too much about how fast your program's going to run. But when it comes to scanning data, especially if it's a lot of data, you think about that a lot. So the whole idea is to figure out the data that you need to represent, and then drawing a picture, and then lines between those pictures, right? And the basic rule that we're going to use is don't put the same string data in twice. So for example, if we have a column of something, don't put Chuck in twice. So that's bad. So if you have some column and you're replicating the same string data twice, that's not good. What you want to do is you want to make another table and put a Chuck over here and give a number to Chuck like 1. And then put 1, 1, so to indicate that something belongs to Chuck, which means you're modeling data at a connection between one table and another. We'll go through this in super great detail. So the basic rule is don't put the same string data in twice, use a relationship instead. And the other thing is sort of like model the real world. If you have users and tracks and Christmas tree farms and whatever, you'll have a table for the Christmas tree farms and you'll have another table for the Christmas tree types and you'll have another table for whatever. And so often you're starting to sort of build an application and so the application that our little company is going to build is an application keeping track of audio. Our company has decided that people don't use albums anymore and aren't interested in buying whole CDs of music. So we're going to sell music by the track. I think this is a great idea, and I think we're all going to get rich with this little company. And this is the user interface that I invented, or I screenshotted from another vendor's. But, whatever this is, we are going to this is the program we're going to build. We're going to make a track-making thing. Now, just looking at this user interface we see some problems right away. And often, if you just try to turn this into a spreadsheet, these things would become the problem. And the problem has to do with replicated data in columns. And so, it's like great, this is all cool, until you have the same artist name in the columns. That could be once or hundreds, because Black Sabbath has written hundreds and hundreds of tracks, and then even in the album column, that's a problem. And then it gets even worse when you're in the genre column, you're putting the string in a zillion times. And if you've been writing Python programs they kind of finish really fast, and that's because the only assignments I've given you are tiny. But when you're going to do a million or a trillion things, the difference between metal and the number 2, 2, 2, 2, 2, to replicate this is actually very significant. Because this isn't just a six, or five character string, this could be a very long string, and so in your data you have to allow for very long strings. And so these are the problems right there, all this replication. Now it turns out that we have a user interface person in the room, and the user interface person we say like, hi, we're a database expert. We took a class online and we know that you're not supposed to replicate data and the user interface person says, oh no. This replicated data, we've done surveys and that's exactly what the user wants to see. And you can't argue with that. If that's what the user wants to see, that's what they see. But we still have to write an efficient application. And so that's where we start going through a data modeling exercise. Where we say, this is the kind of stuff we want to represent, and this is the kind of user interface we want to support, how can we build a really good data model? So it's not like you've got to change the user interface to make it good, you change the data model then to represent the stuff we want, and then construct it in a way that the user interface is exactly what the user wants. So the idea is you look at the data that your application is going to look, and we've got all these columns, and you say is this column represent a thing in the real world? Or is it just another attribute of a thing? So there's like a thing and then attributes of the thing, or two things. And so you've got to go across all of the columns, and so that's what we're going to do, we are going to look at all the columns and we are arguing. So right now imagine we're in a conference room, we're sitting around a table. And we're going to draw this picture, and this picture looks like the thing, it's got little boxes and lines, right? We're going to draw a boxes and lines picture, and we have a blank screen right now. So the first question, actually, in these meetings is commonly where to start. Because you've got a lot of stuff. In our example, it's really simple. And it turns out that it's not so critical where you start. Even if you start at the wrong place, eventually you're just going to have this web of information, and it's all connected, and it'll all work out if you do it right. But it does simplify the drawing of your picture if you start at the right place. And so, the way I was taught, to start at the right place, was to think about the thing that is the most essential to this application. What is the one-sentence description of this application? And in this application, it's a thing that manages tracks. It's not an album-managing thing, it's not a artist-managing thing, or a genre-managing thing, it's a track-managing thing. And we kind of see that in our user interface. We see that every line is a track. And so that makes it easy. The first table we're going to build is the track table. And once we build the track table, then we have to look at all the other things and say, look, which of these things are themselves tables, and which of these things are just attributes of track. Okay? And so it turns out that things like numbers, and like star is just a number, that's five, these are all fives, apparently I only like stuff or hate it, and then this is the number of plays. And so you go like oh, that's part of it, that's part of a track, that's part of a track. Well, and now we come to the three things that aren't part of tracks because they have the replicated data, that's the red flag that says uh-uh, not part of a track. So, somebody gets up to the board, grabs a piece of white board and they go like that's our first table, it's a track table and its going to have a rating, length, and a count. I guess there should be a title in there as well for this column here, so we have a title field, title. Okay, so now we're good, right? We've got that, right, we've got that. The question is, what's connected to that? Right? Well,what is the thing that's the next thing to draw? Well, tracks could be connected to artists, they could be connected to albums, so let's just say, kind of albums have many tracks. So, the next thing we're going to do is we're going to make an album table. And we're going to say on our white board, we're just drawing on a board right now, we're not writing code yet. That tracks belong to albums and now it's kind of easy that what albums belong to, well albums belong to like groups. Now if you're a music expert you kind of understand that we're oversimplifying things, and that's a fine argument to have when you're starting your company, but we're going to pretend the world is simple, and that tracks belong to albums, that belong to artists. So we're almost done with our meeting. So we got artists, albums, and tracks, and we've drawn a picture. And the question is, where does genre belong to? What does genre connect to? Does genre connect to an album? Does genre connect to an artist? Or does genre connect to a track? Now you might want to go to your iTunes and go to a track and change its genre. Change it to easy listening. Now if the track was, I mean if the genre was connected to the artist, I mean this is a album, sorry. It means that it would then immediately change all those to easy listening because that means that albums have an attribute of what their genre is. If it was the same for an artist, that means that all your AC/DC would be changed to easy listening. And if it was an attribute of track, only this one would be changed to easy listening. So the question is, when you go into that system and you change one thing, does it change all of these based on the album or the artist? Well, you can pause and go ahead and do that but, I'm going to tell you, it turns out if you change that it only changes one. So that actually tells you, in the meeting we're arguing about this. I think it should be here, and I think it should be here. And then you say well but if you do it this way, that means that it's going to change. Then the UI person is like no, no, no, we can't have it connect to the album because then our users will be so mad at us and blah blah blah blah blah, and we wouldn't make a lot of money. So when it's all said and done, you've argued through. It took you a lot longer to argue this last bit, this was the hard part, to say okay, genre belongs to track, it just does. And this is a simple, trivial, example of how the data model creates the features of the application. So that when we started we saw this big picture, that what's in the data model determines what the application is capable of doing. And we have to get the data model right, so that it can all be very efficient. So now that we've drawn this picture on the wall in our conference room, at some point we have to map this into a database, into a database structure. Okay? We're going to figure out how to actually map that logical picture that we just drew into a physical picture exactly how we're going to represent this in a database.