[MUSIC] Okay, so we talked about physical data independence and we talked about algebraic optimisation. I wanna talk about another kind of data independence, which is Logical Data Independence. And so, we argued that physical data independence was this ability to insulate applications and protect applications from changes in the physical organization of the data. All right? So things were rearranged on disk, we don't wanna have to rewrite all the code in the application. And this is what databases provide and relational databases in particular do a great job of providing this. But if you go back to first paper, and even the quote I gave you, he talks about insulating applications from the changes to the internal representation, but also insulating applications from changes to some forms of external representation. And what he means by external is things like adding a column to a table. So this isn't an internal shuffling of the bits on the disk. It's actually a logic change to the table, there's more data there than there was before. But if you think about it if your code doesn't care about that new column you shouldn't have to rewrite it just because there is a new column. Okay. So the ability to provide this logical data independence is provided by this concept of views, and all relational databases have this concept. Somewhat surprisingly, I find it to be somewhat underused in practice. If you're using them right now, if you know what they are, great. If you're using them, even better. If you use databases but have never heard of views, then this is a great time to learn about them. Okay? So, what is a view? A view is just a query with a name. So, I write a query, I give it a name and I put it in the database. Now, I can then access that view as if it was a table in the underlying database itself, as if it was a physical table. So, why can we do this? Well, I talked about this notion of algebraic closure before. Right. And this is exactly what it powers, what allows us to do this. So we know that every query returns a table, right. We take tables on the input. We do some manipulation of them and we produce tables. So we say that the language is algebraically closed. And so any result of a view will always be something that we can then add other queries on. So we can stack queries on top of queries on top of queries on top of queries, okay. So why might we want to do this? So one reason is to protect the underlying data. You can assign permissions to tables. So for example, if you only want a particular user to see data associated with their account, you can write a view that filters everything out except for their account and then grant them access to that view. The most direct benefit of this is that it allows you to expose data according to a logical organization that makes sense for the user. So even things as simple as hiding some join. If you decide to reorganize your data into two tables requiring the programmers use joins to link them back up again, you can simply write a view that hides that join and let everyone access the result of it. Now, maybe this may sound expensive but the cool trick here is that because of this algebraic closure what happens is the user's query gets composed with your query that defines the view and the whole thing gets sent as one big block to the database for evaluation. So the database simply doesn't care whether it came as a view and then your user query or whether it came all as one, all is one query directly from the programmer. It's gonna optimize it the exact same way. So there's nothing but a benefit here. So let's see an example. So given the schema purchase and product, define a view called store price with two columns store and price, that has this definition. So select store and select price from purchase and product where the product IDs are equal. [BLANK_AUDIO This is a little funny because we didn't put pids here, so this is a little bit wrong. So assume that each one of these has a, let's assume that this is pid and assume that this is also pid and then it matches the query down here, okay? So this result is now like a new table and just like I said a second ago, you've now hidden the join from the users. And so, complexities like these column names, perhaps, you can insulate your users from and you can name them whatever you want. And so, this allows you to put this, this is what this logical data independence means, is that, no matter how I want to logically organize my tables, I can expose a different perspective on the data than I wanna have myself. And so this separates the people who are administering the data from the ones you're actually accessing it, okay. Logical data independence key idea. [MUSIC]