In the same way that we can do data cleaning in the aggregation framework, we can perform those same operations using the MongoDB query language. Basically, the choice at this point in the development of MongoDB is whether you want to work in a declarative framework, like the aggregation framework, where you simply specify what you'd like documents to look like, or whether you prefer to script your cleaning operations as you're probably accustomed to doing in Python or another programming language. Either works. My intention here is to give you exposure to both, so that whichever best suits your style, your needs or maybe even just your interests, you have what you need to get started. For those of you who are keen to be up to speed on a variety of options and would like to learn the aggregation framework, I strongly encourage you to dive in and wring all the value you can out of it. The aggregation framework is one of the most aggressively developed components of MongoDB because of its widespread use and utility for MongoDB's developer and data science communities.
So we've taken a look at that. Let's look at using the MongoDB query language to do the same type of thing. Here I have a script. I'm importing the MongoClient class from PyMongo and a few other utilities that I'm going to need, such as datetime and the regular expression library. The bulk of this script happens within this for loop. The idea is that, one document at a time, I'm defining updates for an individual document and then writing those updates to the database. We're going to do this with the update_one method. This is a collection-level method, and we'll be calling it on the movies collection.
Now, for this particular script, so that I don't modify any data in my movies_initial collection, I've loaded the same data into a different collection, because what we're going to be doing here is making a number of writes to the same collection in place rather than copying our updated documents to another collection. We're going to be making use of update_one, and we'll be passing it two parameters. This is the most common way of using one of the update methods in MongoDB. There is a similar method called update_many that allows you to update multiple documents at once, but in our case each document needs its own update, so we update one at a time. A little bit later on, we'll look at a way of sending a bunch of individual updates in a batch so that we have a substantially more efficient operation. But for introductory purposes, I think it makes sense to begin with updating one document at a time.
So, the first argument to update_one is a filter that selects the document we wish to update. The _id field contains a unique identifier for an individual movie in this collection. Here, this filter says, I'm interested in updating the document whose _id value equals the _id of the document I'm currently processing in this for loop. And then this variable here, update_doc, is a reference to a dictionary, or document, that defines the updates that I want to make. What I'd like to do first is run this script, and because of this print statement here, we'll be able to see exactly what updates we would make to documents in this collection once they're issued. What I mean by that is that this will print out the value of the update_doc variable that we'll be passing to update_one each time through this loop. Now, to run this, I'm going to comment out the call to update_one, because I don't actually want to make the update, I just want to see what would happen. Once we've taken a look at the output, we'll go back and talk about how the script works, okay?
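To make that dry-run pattern concrete, here is a minimal sketch of the loop, not the actual script from this lesson. The names apply_updates, build_update and dry_run are my own assumptions for illustration; the only PyMongo calls it relies on are find and update_one on a collection. A dry_run flag plays the role of commenting out the update_one call.

```python
def apply_updates(collection, build_update, dry_run=True):
    """Print each document's update and, unless dry_run is set, write it.

    collection: a PyMongo collection (anything with find() and update_one()).
    build_update: hypothetical callable that maps a document to an update dict.
    """
    written = 0
    for doc in collection.find({}):
        update_doc = build_update(doc)
        print(update_doc)  # inspect the update before committing to it
        if dry_run or not update_doc:
            continue  # dry run, or nothing to change for this document
        # Filter on _id so exactly the document being processed is updated.
        collection.update_one({"_id": doc["_id"]}, update_doc)
        written += 1
    return written
```

With PyMongo, collection would be the collection object you get from a MongoClient. Running once with dry_run=True and reading the printed updates, then again with dry_run=False, mirrors the two-step approach described above.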
So running this, let's take a look at the very first document output by this print statement, okay? Remember, what I said is that we're building up an update document, a document that defines the updates we want to make to an individual document each time through the loop, and then calling update_one to write those updates to the collection, changing the document identified by this _id value each time through the loop. All of these update documents will be composed of two keys, a $set key and a $unset key. In the MongoDB query language, for update operations, $set specifies a field that you either want to add to the document in question or want to update with a new value, and $unset removes the fields it names.
In the case of the first document, you can see the kinds of updates that we're making. These should look very familiar to you because they are the same types of updates that we were making using the aggregation framework in our $project stage. I'm writing arrays for cast, countries and directors, as well as genres and two other fields. I'm changing the field name from fullplot with a lowercase p to fullPlot with an uppercase P. I'm creating an imdb key with an embedded document, and so on. Something new that I'm doing here is deleting a number of different fields. There are a few reasons why I delete fields in each of these updates that I'll be committing to the database. The first is that I'm getting rid of fields that use the singular form of the key, so I'm writing a countries field here and deleting that old country key, and doing the same for the other four fields that required that type of change after I split their string values into arrays. The second is that these three values are being embedded in the imdb document, so their original top-level keys are no longer needed. And finally, I've defined the script to delete all keys for which the value is simply the empty string.
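As a rough sketch of how an update document of that shape could be assembled, consider the function below. It is not the script from this lesson: the function name build_update and the SPLIT_FIELDS mapping are assumptions for illustration, the mapping covers only some of the fields mentioned, and the imdb embedding is left out because its source field names aren't given here.

```python
# Hypothetical mapping from singular string fields to plural array fields.
SPLIT_FIELDS = {"country": "countries", "director": "directors", "genre": "genres"}


def build_update(doc):
    """Return a {'$set': ..., '$unset': ...} update for one movie document."""
    to_set, to_unset = {}, {}
    for key, value in doc.items():
        if value == "":
            # Empty strings carry no information, so drop the key entirely.
            to_unset[key] = ""
        elif key in SPLIT_FIELDS and isinstance(value, str):
            # "USA, France" becomes ["USA", "France"] under the plural key,
            # and the old singular key is removed.
            to_set[SPLIT_FIELDS[key]] = [part.strip() for part in value.split(",")]
            to_unset[key] = ""
        elif key == "fullplot":
            # Rename by writing the new key and deleting the old one.
            to_set["fullPlot"] = value
            to_unset[key] = ""
    # Leave out an operator that has nothing to do, since some server
    # versions reject an empty $set or $unset.
    update_doc = {}
    if to_set:
        update_doc["$set"] = to_set
    if to_unset:
        update_doc["$unset"] = to_unset
    return update_doc
```

Note that $unset takes a document whose keys are the fields to remove; the values are ignored, and the empty string is used here by convention.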
So rather than have a bunch of empty string values in the documents in my collection, I'm simply deleting those keys altogether. Remember that MongoDB has a flexible document model. It's okay to have documents with different shapes in the same collection, it makes things a little bit more efficient for us, and with the $exists operator, it's easy to identify documents that either contain or do not contain a particular key. So, all of my update documents, that is, the values that I'll be passing here as the second argument to update_one, have this form: a $set field with the specific keys and values that I'm assigning, and a $unset field naming the fields to delete from the document. Now let's talk about how this script works.
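Before that, here is a small sketch of the $exists check mentioned a moment ago. The helper name count_by_presence is hypothetical; it assumes a PyMongo collection and uses the count_documents method with an $exists filter.

```python
def count_by_presence(collection, field):
    """Return (documents that have `field`, documents that lack it)."""
    has_field = collection.count_documents({field: {"$exists": True}})
    lacks_field = collection.count_documents({field: {"$exists": False}})
    return has_field, lacks_field
```

After the cleaning script runs, a check like this on a field that was sometimes an empty string would show how many documents still carry the key and how many had it removed.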