NoSQL databases are a hot topic in cloud computing. So what exactly are NoSQL databases, and how do we change the way we interact with our data if we are using one? Well, generalizing about exactly what a NoSQL database is, is a little bit hard. But at the most fundamental level, one approach to thinking about many NoSQL databases is to think of them as a giant hash table. In a second we'll give an example of how this can be used to help understand what is going to be an efficient or a less efficient design when interacting with a NoSQL database.

At the basic level, NoSQL databases do what they say in their title: they don't give you full SQL like you're used to with a relational database. So you can't necessarily go and do table joins easily, or other things that you would get with SQL. Now, if you don't know what SQL is, if you're not familiar with relational databases, don't worry about it too much. You'll still be able to get an understanding of the key principles behind NoSQL databases and how to work with them, design for them, and optimize for them.

There are a variety of styles of NoSQL databases. One of the common styles is a key-value store. This is really the closest thing to a pure hash map. You have keys, and you can go and look up the values associated with them. These types of databases often give you really high performance through this very simple but powerful mechanism. Another style is databases that look like Google's Bigtable. These provide some greater capabilities in terms of the query, structure, and layout of the database, and there are a number of open source alternatives that provide capabilities very similar to Bigtable. Another type is document-oriented databases. These provide much greater structure and flexibility in the values that are being stored, and in the schemas of those values.
And some of them even store pure JSON inside the values, so you can think of the values as JSON, or as object data that's very flexible. And then the final type, which we are not going to talk about much, is a graph database that's optimized for doing traversals on a graph, like finding the shortest path or finding nodes in the graph that have a specific relationship. So these are a variety of styles of NoSQL databases. We're going to focus primarily, initially, on databases that have this style and this style, and then we'll talk a little bit about MongoDB, which is a document-oriented database, so that we understand how to operate with that as well.

To help get an intuitive understanding of how NoSQL databases work, we're going to generalize a little bit and pretend NoSQL databases all look like Java hash tables. Then we'll use that to help us understand how we can improve our structuring of, and interactions with, the databases if we have this mentality. So let's assume that our database is a big hash map. Rather than parametrize this thing, I'm going to say we're going to store purchases by user. Then we can go and construct a particular hash map, or some other map structure, to store this. The keys will be the user, or the ID of the user, and the value will be a list of the purchases that that user has made. So if we wanted to go and run a query to find the purchases, we could say something like List<Purchase> p = purchasesByUser.get(user), where we pass in some particular user.

Now what happens if we want to go and search for the list of all users that have bought a particular item? Well, suddenly this query gets a lot more challenging in this map world. We have to go and iterate through all of the values to try to find a value that matches what we're looking for. So in this case, we might have to do something like for (List<Purchase> p : purchasesByUser.values()).
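Written out, the map and the two queries spoken above might look roughly like the following. This is only a sketch under some assumptions: plain strings stand in for the user IDs and purchases, and the class name, method names, and sample data are all hypothetical, not something from the lecture.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PurchasesByUserSketch {
    // Hypothetical in-memory stand-in for the database: user ID -> items that user bought.
    static Map<String, List<String>> sampleData() {
        Map<String, List<String>> purchasesByUser = new HashMap<>();
        purchasesByUser.put("alice", Arrays.asList("book", "lamp"));
        purchasesByUser.put("bob", Arrays.asList("lamp"));
        purchasesByUser.put("carol", Arrays.asList("pen"));
        return purchasesByUser;
    }

    // Fast path: a single hash lookup on the key returns every purchase for that user.
    static List<String> purchasesFor(Map<String, List<String>> purchasesByUser, String user) {
        List<String> found = purchasesByUser.get(user);
        return found == null ? new ArrayList<String>() : found;
    }

    // Slow path: visit every value in the map. This can count how many users'
    // lists contain the item, but because only the values are visited it
    // cannot say which user each list belongs to.
    static int countListsContaining(Map<String, List<String>> purchasesByUser, String item) {
        int matches = 0;
        for (List<String> ps : purchasesByUser.values()) {
            for (String p : ps) {
                if (p.equals(item)) {
                    matches++;
                    break; // one match per list is enough
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, List<String>> purchasesByUser = sampleData();
        System.out.println(purchasesFor(purchasesByUser, "alice"));
        System.out.println(countListsContaining(purchasesByUser, "lamp"));
    }
}
```

The lookup touches one key, while the nested loop touches every purchase of every user, and it still does not recover the keys.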
So now we're going to iterate over every single value, and then we're going to write another for and iterate over the purchases. We'll call this ps, so it's p in ps. And then we're going to search for purchases that match the particular one that we're looking for. And if we find them, then we're going to try to get the associated key, the user. And in this case, this actually would be hard to do. We're not iterating over the keys at the same time, so it's going to be kind of challenging. Now, there are ways we could do it. We could iterate over the keys, get the purchases for each user, and then search the purchases, which might be the better way to do it. But the challenge is, as you can see, either way it requires a lot more work. If we have to go and search for the users that meet a criterion, so suddenly we need to know which keys have values meeting a specific criterion, it becomes harder to go and get this data out. Whereas if we just want to know what purchases a user made, this map is really fast.

But the moment that we want to know which users meet a particular criterion, we have to do a lot of work. And in this case we're not even doing the work correctly. We'd have to go and iterate through the keys and get the list of purchases for each key. For each purchase in that list, we'd have to check whether it met the criterion that we were looking for, and if it did, we'd have to store the key in another list, and then finally return it. And that's a whole lot more work than a simple lookup on a hash table, where we would just provide the key and immediately get all the values back.

So how can we get around this problem? Well, in relational databases, normally we're trying to do something that's called normalization; that is, we don't want to have duplicate data in our database.
We want the representation of some data item to exist only once in the database; we don't want to have duplicates of that data over and over. But in this case, we can do something to vastly improve the rate at which we can query things. If we want to be able to query by particular purchases too, if we want to be able to find all of the users that made a particular purchase, we can do some denormalization. That is, we can do some duplication to make this faster. So we can have a second map, and we can call this usersByPurchase. We would go and construct this map and populate it in some way. Then if we want to know which users have purchased a particular item, we could write List<User> buyers = usersByPurchase.get(purchase), passing in the particular purchase that we were looking for. That would instantly give us all the users that had bought that particular item. And we still have the ability to go in the other direction. If we wanted to say List<Purchase> theyBought, we can still do this and say purchasesByUser.get(user).

So in this case, we can do either query very efficiently. If we want to know all the people that bought some particular item, which we're going to call a purchase, we can simply provide that key and we'll get back the list of those users. If we want to know all the purchases for a particular user, we can provide that key. So by duplicating the data, we've got the purchases in here as values, and then we've got the purchases over here as keys. Similarly, we've got the users over here as keys, and then down here as values. So in a sense, we're duplicating the data in order to make our queries faster. We're able to look things up more easily, rather than having to go and scan through all of the values and look for them.
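To make the comparison concrete, here is a rough sketch that puts the full scan over the single map next to the lookup in a second, duplicated map. It assumes strings for both users and purchases; the class name, the helper names buyersByScan and invert, and the sample data are hypothetical and only there for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UsersByPurchaseSketch {
    // Full scan over the single map: walk every key, check that user's
    // purchases, and collect the keys whose lists contain the item.
    static List<String> buyersByScan(Map<String, List<String>> purchasesByUser, String item) {
        List<String> buyers = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : purchasesByUser.entrySet()) {
            if (entry.getValue().contains(item)) {
                buyers.add(entry.getKey());
            }
        }
        return buyers;
    }

    // Build the duplicated, inverted map: item -> users who bought it.
    static Map<String, List<String>> invert(Map<String, List<String>> purchasesByUser) {
        Map<String, List<String>> usersByPurchase = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : purchasesByUser.entrySet()) {
            String user = entry.getKey();
            for (String item : entry.getValue()) {
                List<String> users = usersByPurchase.computeIfAbsent(item, k -> new ArrayList<>());
                if (!users.contains(user)) { // list each buyer once per item
                    users.add(user);
                }
            }
        }
        return usersByPurchase;
    }

    public static void main(String[] args) {
        Map<String, List<String>> purchasesByUser = new HashMap<>();
        purchasesByUser.put("alice", Arrays.asList("book", "lamp"));
        purchasesByUser.put("bob", Arrays.asList("lamp"));

        Map<String, List<String>> usersByPurchase = invert(purchasesByUser);

        // Both queries are now single key lookups.
        System.out.println(usersByPurchase.get("lamp"));
        System.out.println(purchasesByUser.get("alice"));
        // The scan finds the same buyers, but touches every entry to do so.
        System.out.println(buyersByScan(purchasesByUser, "lamp"));
    }
}
```

Since a HashMap does not guarantee iteration order, the buyers may come back in any order from either approach. The difference is how much data each query has to touch.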
But in the process, we're duplicating this data, and so we're going to have to do a little bit more work whenever we go and update the purchases for a particular user. So a user buys something; we're now going to have to update it in two places. The downside is that our data isn't normalized anymore. We now have duplication, so when we go and try to update, we have to update in more than one place. And if we want to go and do some type of restructuring of our database, we're going to have to go and update multiple places and the relationships between that data. But the positive side is that we can get the query results really quickly. So if you think from this mentality, this gives you a bit of an intuitive sense of how NoSQL databases work: on some level, they operate like hash tables. Even though many of them give you additional features and capabilities, this is one of the primitive ways that one can conceptualize how to improve the performance of the.
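The extra work on the write side can be sketched as a small wrapper that keeps both maps in step. This is an illustration only: the class PurchaseStoreSketch and its methods are hypothetical names, strings stand in for users and purchases, and a real NoSQL store would perform these as two separate writes rather than two in-memory map updates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PurchaseStoreSketch {
    // The same facts stored twice, keyed two different ways.
    private final Map<String, List<String>> purchasesByUser = new HashMap<>();
    private final Map<String, List<String>> usersByPurchase = new HashMap<>();

    // One logical event (a user buys an item) becomes two updates.
    // If either update is skipped, the two maps disagree.
    void recordPurchase(String user, String item) {
        purchasesByUser.computeIfAbsent(user, k -> new ArrayList<>()).add(item);

        List<String> buyers = usersByPurchase.computeIfAbsent(item, k -> new ArrayList<>());
        if (!buyers.contains(user)) { // list each buyer once per item
            buyers.add(user);
        }
    }

    // Both reads stay single key lookups.
    List<String> purchasesFor(String user) {
        return purchasesByUser.getOrDefault(user, Collections.<String>emptyList());
    }

    List<String> buyersOf(String item) {
        return usersByPurchase.getOrDefault(item, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        PurchaseStoreSketch store = new PurchaseStoreSketch();
        store.recordPurchase("alice", "lamp");
        store.recordPurchase("bob", "lamp");
        store.recordPurchase("alice", "book");
        System.out.println(store.purchasesFor("alice"));
        System.out.println(store.buyersOf("lamp"));
    }
}
```

Routing every write through one method like recordPurchase is one way to keep the duplicated copies from drifting apart. The cost is that each write does twice the work.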