[MUSIC] Okay, last time we talked about algebraic optimization. And I argued that all three of these expressions, without going into a lot of detail. But I argued that all three of these were equivalent and they differed only in the order in which things were evaluated. Here you evaluate this join first and this join second. And in this expression you evaluate this join first and this join second, and here you sort of find all possible combinations of tuples and then filter that. So if you don't understand exactly what's going on in these expressions that's okay, you're not going to know that yet. We'll talk about it, in fact, in this segment I think, but the takeaway here is that there is three equivalent expressions and we don't necessarily which one is the fastest one to evaluate? But the database can figure this out, and does every time you write a query okay. And that's this notion of algebraic optimization. We don't, even if you're familiar with databases you may or may not be familiar with relational algebra, which should be strange because I've argued that it's the hallmark of databases and totally fundamental. So, why don't we think about programming databases in terms of writing relational algebra expressions? Well, another good idea, another key idea that's associated with a relational database is this notion of declarative languages. What we mean by declarative languages is that you specify the answer that you want, but you do not specify anything about how to get it. And so a relational algebra expression actually does specify an order, right, as I showed on this slide. Here's three different expressions that's indicating exactly which order to do every operation. That means that if you write an expression like this, you're instructing the computer, look do it in this particular order. So these declarative languages say look, we're just going to describe the properties that must be true of the result and we're gonna let the database figure out the right order in which to do this. So here's a quick example, so imagine you have two tables, one is order with three columns, order, date and account and another table with item with two columns, order and part. The semantics here is that this column indicates which order that item should be associated with okay. And so, if you want to say find all orders from today along with the items ordered, then you may write this query and if you've seen SQL plenty of times before, bear with me. And if you haven't, then pay attention. So select star give me all possible columns from the table order and all possible columns from the table item But I only want record such that this condition is true, where the order column from the order table matches the order column from the item table right? And, further, I only want orders from today where o.date = today(). So this is just conditions expressed over the results without any kind of idea of how to actually get this answer. So what automatically happens is that this query is translated into a relational algebra expression along the lines of what we've already seen. You know, here I've sort of just done a cartoon where you can say, scan the item table, scan the order table, select the record such that date equals today and then perform the join. Find all the records in order that correspond to, for each record in order find the corresponding records and item that match on order. Okay? So, this is happening every time you run a query, again. So, the SQL is the what, not the how. Given another example, there's three columns, product, purchase and customer. And the underlying in here we haven't talked about, but this is indicating what makes this record unique. And so here, the PID make the product unique. The CID makes the customer unique and the combination of PID and CID makes the purchase unique okay. So here's another SQL query, we say, select distinct product name. Why do I know it's the product name? Cuz I see an x here and I see an x here. And, the customer name, and I know it's the customer name cuz I see z here and I see a z here. This is an alias for the relation customer. From these two or three tables, Where the product ID in the product table matches the product ID in the purchase table and the customer ID and the purchase table matches the customer ID in the customer table. And that's a typo, it looks like that should be z. And then we want, but now we want only the products for which the price is greater than 100, and we only want the customers whose city is Seattle. All right, so what does this say in English? Well, if I need combinations of products and customers, you need print combinations of products and customers where the customer is in Seattle, and they pay for a product worth more than a 100 okay? So it's clear what we want but it's unclear how to get it. It's kind of a complicated query. [MUSIC]