Last time, we talked about how to find the data that we need to collect to solve a particular problem. So assuming you have found that data and collected it, the next thing you need to do is store it. So in this lecture, we'll talk about how we can appropriately store the data we've collected.
I know you'd like to hear about my mom, so I'll tell you about my mom. My mom is an avid reader. She reads tons of books, and she goes to the library to take those books out. The problem she'd like to solve is that when she goes to the library to take out books, she doesn't want to take out books she's already read, unless she does that deliberately. Some of us have favorite books we like to reread every year or two. But in general, many of us would like to read new books as well. So she would like to be able to store data that will help her not take out duplicate books.
How could she possibly store that data? Well, there are a variety of different ways. She could memorize every book and author she's ever read. That's really difficult. I will say this bullet originally just said "memory," but of course, computers have memory too. What I really mean is in her brain: she could memorize everything, and that's hard to do. She could write it on paper: she could keep a list, and each time she reads a new book, she could write down whatever information she needs to make sure she doesn't get a duplicate. Usually, title and author is sufficient. She could put it in a text document. As we step from these manual things to electronic mechanisms for storing data, we usually get more robust storage. We'll talk about that soon. Or she could put it in a database, another way to store data so that we can do a variety of different operations on that data. We do need to be careful that after collecting the data, we put it in a form that's actually going to be easy for us to use, so that we can in fact use that data to solve whatever problem we're solving.
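To make the two electronic options a bit more concrete, here is a minimal sketch of keeping the same title-and-author list in a text document (a CSV file) and in a database (SQLite). This is only an illustration under assumptions: the file name `books_read.csv`, the table name `books_read`, the helper function names, and the sample books are all made up for this example and are not part of the lecture.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample data: (title, author) pairs, since the lecture notes
# that title and author are usually sufficient.
books = [
    ("Great Expectations", "Charles Dickens"),
    ("The Shining", "Stephen King"),
]


def save_to_text_file(path, records):
    """Store the list as a text document: one 'title,author' row per book."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "author"])  # header row
        writer.writerows(records)


def load_from_text_file(path):
    """Read the text document back into a list of (title, author) tuples."""
    with open(path, newline="", encoding="utf-8") as f:
        return [(row["title"], row["author"]) for row in csv.DictReader(f)]


def save_to_database(conn, records):
    """Store the same list in a database table (table name is an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books_read (title TEXT, author TEXT)"
    )
    conn.executemany(
        "INSERT INTO books_read (title, author) VALUES (?, ?)", records
    )
    conn.commit()


def load_from_database(conn):
    """Read every stored book back out, in insertion order."""
    return conn.execute(
        "SELECT title, author FROM books_read ORDER BY rowid"
    ).fetchall()


# Text document: written to a temporary directory so the sketch cleans up.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "books_read.csv"
    save_to_text_file(csv_path, books)
    from_file = load_from_text_file(csv_path)

# Database: an in-memory SQLite database, so nothing is left on disk.
conn = sqlite3.connect(":memory:")
save_to_database(conn, books)
from_db = load_from_database(conn)
```

Both versions hold exactly the same information. The difference is in what you can conveniently do with it afterward: the text file is easy to read and edit by hand, while the database lets you ask questions of the data with queries.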
There are some pretty common characteristics that we would like our data storage approach to have, whichever way we decide to store the data. The first is that we typically want the data we've collected to be searchable. We want to store it in a form that makes it easy to find a particular piece of data. So if my mom is at the library, she wants to be able to ask, "Have I read this book?" She probably wants to be able to search her list of books she's already read by title. I'm using the book example, of course, but these characteristics are true for lots of data for lots of problems, so you want to be able to search it.
You also probably want to be able to sort it. If you want to scan for books by a particular author, you would like all the books by that author to be next to each other, contiguous in your data storage, so you'd have them sorted, let's say alphabetically by author's last name. When my mom reads another book, she could just put it at the end, but you'd really want it sorted into the appropriate place, and sorting by author's last name is a reasonable thing to do.
That also ties into modifiable. Unless we're going to collect our data once and never collect it again, which for some of our examples, especially the application examples, is a bad idea, we need to be able to change what we've stored. If you have a navigation system and you say, "Well, this is what the streets looked like in 2015, so they'll probably never change," that's a bad plan. You need to actually go collect more data, so you want the way you've stored your data to be modifiable. If my mom goes and reads another book, she should be able to add that to her stored data and have it work properly, so that she can actually use the data in an efficient way.
The last one is summarizable. This is a little weaker for the book example, though not totally.
But for lots of other data, summaries matter a lot, and we'll talk about some statistics like mean and standard deviation a little later in the course. So we'd like whatever data we have to be summarizable in some way. My mom might not just be trying to avoid a duplicate. She might say, "I wonder who my favorite author is?" Now, assume that like some of us, if you start reading a book and you don't like it, there are so many books in the world that you just give up. You might store that one too, to make sure you don't take it out again. But you might want to say, "Well, who do I read the most? Who's my favorite author? Is it Charles Dickens? Is it Stephen King? Is it David Baldacci? Who is my favorite author?" So if we can summarize, we could count the number of books by each author, and that might help us as we decide which book to pick. It wouldn't necessarily keep us from picking a duplicate. We'd need some other information for that. But we might say, "Gee, I just want to get a book that I know I'm going to like." Picking your favorite author is one way to do that. To recap, in this lecture, we've discussed various ways that we could store the data that we've collected to solve a particular problem, and we've talked about important characteristics of our storage technique that tend to be common across many problems we're trying to solve with our collected data.
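As a rough illustration of the four characteristics from this lecture (searchable, sortable, modifiable, summarizable), here is a small sketch of the book list kept in memory. Everything named here is an assumption made for the example: the `Book` record, the helper functions, and the sample titles are hypothetical, and counting books per author is just one simple stand-in for "favorite author."

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """Hypothetical record: a title plus the author's first and last name."""
    title: str
    author_first: str
    author_last: str


def sort_key(book):
    """Sortable: order by author's last name, then first name, then title."""
    return (book.author_last.lower(), book.author_first.lower(), book.title.lower())


def has_read(books, title):
    """Searchable: answer 'Have I read this book?' by title, ignoring case."""
    wanted = title.strip().lower()
    return any(b.title.lower() == wanted for b in books)


def add_book(books, book):
    """Modifiable: add a newly read book and keep the list in sorted order."""
    if book not in books:  # skip exact duplicates
        books.append(book)
        books.sort(key=sort_key)


def favorite_author(books):
    """Summarizable: return (author, count) for the most-read author, or None."""
    counts = Counter(f"{b.author_first} {b.author_last}" for b in books)
    return counts.most_common(1)[0] if counts else None


# Illustrative sample data, added in the order the books were read.
books = []
for new_book in [
    Book("The Shining", "Stephen", "King"),
    Book("Oliver Twist", "Charles", "Dickens"),
    Book("Absolute Power", "David", "Baldacci"),
    Book("Great Expectations", "Charles", "Dickens"),
]:
    add_book(books, new_book)

already_read = has_read(books, "oliver twist")
most_read = favorite_author(books)
```

Re-sorting the whole list after each addition is fine for a personal reading list. For a much larger collection you would more likely insert each new record directly into its sorted position, or let a database handle the ordering.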