In this lesson, we're going to talk about how we can migrate our schema using the aggregation framework. In a previous lesson, we saw how we can use mongoimport to import data into MongoDB. We also saw how we can use some advanced functionality of mongoimport to specify the type that we want each field to have. However, there are certain things that mongoimport does not support. For example, if we want to change the shape of our documents while we are importing them, or if we want to add derived or calculated fields to our documents, mongoimport unfortunately does not support these types of transformations. But have no fear, because the aggregation framework fills in these gaps. The aggregation framework is going to help us change the shape of our documents, as well as add computed fields to them.
For this lesson, we're going to start with the retail collection that we imported in a previous lesson using mongoimport. The documents look like the ones here to the left. Fortunately for us, we were already able to import these documents with the data types that we want, which saves us some additional work like parsing a string into an integer or parsing a string into an ISODate. However, our application would perform much better if each document reflected an entire invoice, rather than each document representing one ordered item of an invoice. So in order to produce this type of document, we are going to need to apply a transformation.
So of course, as always, we're going to go ahead and import some dependencies. Here, I'm going to create a little helper function that's going to make it a lot easier for us to print out our documents. And then I can go ahead and connect to my MongoDB free tier cluster. You'll notice that I'm using my free tier cluster instead of the course cluster on Atlas.
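As a rough sketch of the setup just described, assuming the lesson is followed in Python with the pymongo driver: the helper names `format_doc`, `print_doc` and `get_collection`, the sample values, and the field names `InvoiceNo`, `StockCode`, `Description`, `Quantity` and `UnitPrice` are illustrative assumptions, not taken from the lesson itself.

```python
import json
from datetime import datetime


def format_doc(doc):
    """Render one document as indented JSON; default=str covers dates and ObjectIds."""
    return json.dumps(doc, indent=2, default=str)


def print_doc(doc):
    """Small helper that makes documents easier to read when printed."""
    print(format_doc(doc))


def get_collection(uri, db_name, coll_name):
    """Return a collection handle. Needs the pymongo package and a reachable cluster."""
    from pymongo import MongoClient  # imported lazily so the helpers above work without it
    return MongoClient(uri)[db_name][coll_name]


# Hypothetical document in the flat, one-item-per-document shape (assumed field names).
sample = {
    "InvoiceNo": "A1",
    "StockCode": "S100",
    "Description": "EXAMPLE ITEM",
    "Quantity": 12,
    "UnitPrice": 6.95,
    "InvoiceDate": datetime(2010, 12, 1, 8, 26),
    "CustomerID": 17850,
    "Country": "United Kingdom",
}

print_doc(sample)
```

To connect, `get_collection` would be called with your own connection string, which in this lesson points at a free tier cluster rather than the course cluster.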
This is because during this exercise we're actually going to be writing data to MongoDB, which is not allowed on the course cluster. I also want to point out that the retail data set that we're using here is not the same one that we used in the mongoimport lesson. This is only part of that data set, and the reason is that the free tier cluster does not allow us to spill to disk when we are applying transformations on lots and lots of data, unlike our paid Atlas cluster for the course. However, we still wanted you to be able to perform this exercise, so we have uploaded this reduced data set as a handout. You can download this handout, import it, and follow along.
So here, I'm going to go ahead and create an object for our reduced collection. And here's what the aggregation stage is going to look like in order for us to apply the transformation that we just talked about. What we really care to do is to put all the items that belong together under one invoice. And we're going to do that with the help of the $group stage. Note that we are not just grouping by invoice number but also by $CustomerID and $Country. Knowing this data, we know that these fields should have the same value for a given invoice. However, I want to make sure we catch any cases where they do not. A little more on how we do this later in the lesson. As for the invoice date, it is possible that an order could have been added onto an invoice later, so we're going to go ahead and take the latest date. Alternatively, I could have accumulated all the different dates and made this field an array field.
So that will give us the grouping that we want. But our documents aren't going to have as nice a shape as we would like. So to get around this, we are going to use $project. This will allow us to move some fields around and rename them accordingly. And here's where we put the two stages together and execute the pipeline.
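The two stages described above might look roughly like this when written as Python dictionaries for pymongo. This is a sketch under assumptions: the lesson names `CustomerID`, `Country`, `InvoiceDate` and `Items`, while `InvoiceNo`, `StockCode`, `Description`, `Quantity`, `UnitPrice` and the collection handle `retail_col` are assumed names.

```python
# Group all rows of one invoice together. CustomerID and Country are part of
# the key so that an invoice with inconsistent values yields more than one group.
group_stage = {
    "$group": {
        "_id": {
            "InvoiceNo": "$InvoiceNo",
            "CustomerID": "$CustomerID",
            "Country": "$Country",
        },
        # Keep the latest date seen for the invoice.
        "InvoiceDate": {"$max": "$InvoiceDate"},
        # Collect each ordered item into an array.
        "Items": {
            "$push": {
                "StockCode": "$StockCode",
                "Description": "$Description",
                "Quantity": "$Quantity",
                "UnitPrice": "$UnitPrice",
            }
        },
    }
}

# Flatten the compound _id: the invoice number becomes _id, and the other
# two key fields move back to the top level of the document.
project_stage = {
    "$project": {
        "_id": "$_id.InvoiceNo",
        "InvoiceDate": 1,
        "CustomerID": "$_id.CustomerID",
        "Country": "$_id.Country",
        "Items": 1,
    }
}

pipeline = [group_stage, project_stage]

# With a pymongo collection handle (hypothetical name retail_col):
# for doc in retail_col.aggregate(pipeline):
#     print(doc)
```

Swapping `$max` for `$push` on `InvoiceDate` would give the alternative mentioned above, an array of all dates per invoice.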
And here, we can take a peek at what one of these documents looks like. So now, our order items are nicely embedded inside an array field called Items. For those familiar with relational databases, we basically took two tables associated by a one-to-N relationship and put the N side of the relationship inside an array field.
Let's look at how we can add fields to these documents as well. Let's imagine that it's important for our application to prioritize large orders by dollar value. So in order to find those orders, we need to calculate the total value of each order. We can easily do this with an aggregation pipeline over our new documents. What we would really like to do is multiply the unit price by the quantity for each item, then sum all of those up and add the result as a field to the document. And here's a pipeline stage that'll do that. Here, we have an $addFields stage, where we name the field TotalPrice. Then, using the $reduce operator, we're able to iterate over the $Items array, setting the initial value of our reduce to zero dollars and zero cents, and adding to that running total the quantity multiplied by the unit price of each item. And then it's as simple as adding that stage to our pipeline and looking at an example document. And now you can see that in this document we had 12 items at a unit price of $6.95, which comes to a total price of $83.40.
Now that we have our documents the way that we want, the next thing we can do is go ahead and save this output to a new collection with the $out stage. And here I'm specifying that the new collection's name will be orders. It's important to note here that $out does not append to or modify an existing collection, but rather will either create the collection if it doesn't exist or overwrite it with these new results if it does exist. We can go ahead and create this stage and then execute our pipeline.
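A minimal sketch of the `$addFields` and `$out` stages, again as Python dictionaries. The field names `Quantity` and `UnitPrice` are assumptions, and the initial value is written as a plain float for simplicity; if prices are stored as decimals, a Decimal128 initial value may be the better fit. The `total_price` function is only a plain-Python mirror of what the `$reduce` expression computes.

```python
from functools import reduce

# $reduce walks the Items array: $$value is the running total and $$this is
# the current item. Each step adds Quantity * UnitPrice to the total.
add_fields_stage = {
    "$addFields": {
        "TotalPrice": {
            "$reduce": {
                "input": "$Items",
                "initialValue": 0.0,  # assumption: float instead of a decimal type
                "in": {
                    "$add": [
                        "$$value",
                        {"$multiply": ["$$this.Quantity", "$$this.UnitPrice"]},
                    ]
                },
            }
        }
    }
}

# $out creates the orders collection, or replaces its contents if it exists.
out_stage = {"$out": "orders"}


def total_price(items):
    """Plain-Python equivalent of the $reduce expression above."""
    return reduce(
        lambda value, this: value + this["Quantity"] * this["UnitPrice"],
        items,
        0.0,
    )


# The example from the lesson: 12 items at a unit price of 6.95.
print(f"{total_price([{'Quantity': 12, 'UnitPrice': 6.95}]):.2f}")  # prints 83.40
```

Both stages would be appended to the grouping pipeline, with `$out` as the last stage.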
And now the output of this pipeline, the documents transformed in the way that we want, is saved in the orders collection. So now we can very easily go ahead and access this new orders collection. Do a find one, and here you can see we have the Items array, and here you can see the total price.
Now, earlier I said I wanted to verify that an invoice number always goes with the same $CustomerID and $Country. We did this by grouping on these three fields together. If either of these two fields had differed within an invoice, we would have gotten two documents with the same invoice number. And because _id has to be unique within a collection, the server would fail to insert documents that had the same _id. If this is not evident to you, you can actually move the InvoiceDate into _id in the grouping section. And then if we go ahead and add that stage and execute it, we now get a duplicate key error.
Let's recap what we've learned. We saw how we can shape and enhance our documents by using the aggregation framework. We saw that the aggregation framework does not allow us to update documents in place; the $out stage replaces the target collection rather than modifying it. But we also saw how we can use the $out stage to create a new permanent collection of documents through the aggregation framework.
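The duplicate-key check from this lesson can be imitated in plain Python, without a server. This is only an illustration with made-up rows and an assumed `InvoiceNo` field name: it counts how many distinct grouping keys share one invoice number, which is what would collide once the invoice number becomes `_id`.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows: invoice A1 has two items added at different times.
rows = [
    {"InvoiceNo": "A1", "CustomerID": 17850, "Country": "United Kingdom",
     "InvoiceDate": datetime(2010, 12, 1, 8, 26)},
    {"InvoiceNo": "A1", "CustomerID": 17850, "Country": "United Kingdom",
     "InvoiceDate": datetime(2010, 12, 1, 9, 2)},
    {"InvoiceNo": "B2", "CustomerID": 13047, "Country": "United Kingdom",
     "InvoiceDate": datetime(2010, 12, 1, 8, 34)},
]


def duplicate_ids(docs, group_fields):
    """Invoice numbers that would appear as _id more than once.

    group_fields must start with "InvoiceNo"; each distinct tuple of values
    stands for one document that a $group on those fields would emit.
    """
    keys = {tuple(doc[field] for field in group_fields) for doc in docs}
    counts = Counter(key[0] for key in keys)
    return sorted(invoice for invoice, n in counts.items() if n > 1)


base = ["InvoiceNo", "CustomerID", "Country"]
print(duplicate_ids(rows, base))                    # prints []
print(duplicate_ids(rows, base + ["InvoiceDate"]))  # prints ['A1']
```

An empty result for the three-field key corresponds to the pipeline writing to orders without error; the non-empty result after adding `InvoiceDate` corresponds to the duplicate key error.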