Now that we've explored the data set a bit, at least for one field, let's take a look at how to filter to get just those movies that match a particular criterion. There are several ways we can do this. Compass supports the MongoDB query language, there is a $match stage in the aggregation framework for filtering, and there's a collection method called find, which we can use from a Python application. All of these support the same filter syntax and set of operators. The one exception is Compass, which at this time supports only a subset of the MongoDB query language, though it is a very large subset. Let's do some simple queries in the aggregation framework first. The syntax for writing filters in MongoDB is based on documents. You specify one or more keys on which you would like to filter, and then for each key you provide an expression that defines the criteria that matching documents must meet for that field. As a simple example, let's search for movies that use both Korean and English. To do that, we'll use a $match aggregation stage. Have a look at the syntax here: there is an equality match on the language key. If you're building a Python application such as a web app, it's more likely that you'll use a query language method to filter documents. For filtering, you will most commonly use the find collection method. Let's take a look at an example. Here we define a filter; note that it's just the filter portion of the match stage we just looked at. Both the MongoDB query language and the aggregation framework use exactly the same syntax and operators for expressing queries. To apply this filter, we simply pass it to the find method. Remember that find is a method on the collection class. So here, we're calling find on the movies_initial collection that we've been working with all along. find returns a cursor that enables us to iterate over the results. A cursor is simply a pointer to where we are in a list of query results.
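To make the two approaches concrete, here is a minimal sketch of the same equality filter used in a $match stage and passed to find. It is not the exact script from the video: the string value "Korean, English", the database name mflix, and the connection URI are assumptions, and the helper names find_movies, aggregate_movies, and connect are hypothetical.

```python
from pprint import pprint

# Filter document: an equality match on the "language" key.
# Assumption: the data set stores the languages as one comma-separated string.
language_filter = {"language": "Korean, English"}

# The same filter wrapped in a $match stage for the aggregation framework.
pipeline = [{"$match": language_filter}]


def find_movies(collection, filter_doc):
    """Pass the filter to find() and iterate the cursor into a list."""
    return list(collection.find(filter_doc))


def aggregate_movies(collection, stages):
    """Run the pipeline; aggregate() also returns a cursor we can iterate."""
    return list(collection.aggregate(stages))


def connect(uri="mongodb://localhost:27017"):
    """Return the movies_initial collection (URI and database name are assumed)."""
    from pymongo import MongoClient  # imported here so the sketch loads without pymongo

    return MongoClient(uri)["mflix"]["movies_initial"]
```

With a MongoDB server available, something like pprint(find_movies(connect(), language_filter)) would print the matching documents, and aggregate_movies(connect(), pipeline) should return the same set.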
Calling list here causes Python to iterate through the cursor and create a list. pprint prints the output that we'll see when we run this script. And we can see here that what we get is in fact a list, or array, and that every one of the results in this list does express a combination of Korean and English for the language field. We can do this same thing in Compass, and it uses the same syntax, or very nearly. The only difference is that Compass likes double quotes rather than single quotes. Applying that filter, we can see that the result set is reduced to 27 documents. And again, if we scroll through these results, we see that the combination of Korean and English is what's used throughout for the language field. This type of query is called an equality filter, because we're matching on exact values for a key. As I mentioned earlier, we can filter on multiple fields. To provide an example of that, let's go ahead and add an additional selector to this filter. Applying that filter, we can see that our result set is reduced from 27 to just 2, because now we're requiring that all documents returned in our result set have a combination of Korean and English as their language and a rating of unrated. Now, one last thing I'll point out about the MongoDB query language, as expressed in Compass, is that for keys, it is not necessary to use quotes. Compass will correctly interpret these as keys, and we get a slightly cleaner-looking filter as a result. Of course, in Python, it is necessary to use quotes when constructing filters, because you're really just building a dictionary that is passed to find as a filter. Now, I'd love to go further and show you how to do range queries on fields like runtime, and maybe even look at interesting combinations of genre, for example. But the fact is that this data set is a bit of a mess. Values like runtime should be integers, and language, genre, and several other fields should be arrays.
That way, we have maximum flexibility with respect to filtering this collection, both for the types of analysis we need to run and for when we build a map a little later in this course. So let's take a look at cleaning up this data set.
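As a recap of the multi-field filter described above, here is a small sketch of how a second selector is added in Python. The values "Korean, English" and "UNRATED" are assumptions about how the data set spells them, and matches is a hypothetical pure-Python stand-in that only imitates equality matching on simple values; it is not how MongoDB evaluates filters internally.

```python
# Single-field equality filter (the value is an assumed spelling).
language_filter = {"language": "Korean, English"}

# Adding a second key narrows the results: every key in the filter
# must match, so the selectors are combined with an implicit AND.
combined_filter = {"language": "Korean, English", "rating": "UNRATED"}


def matches(doc, filter_doc):
    """Illustrative only: True when every filter key equals the document's value."""
    return all(doc.get(key) == value for key, value in filter_doc.items())


# Made-up documents, used only to show the narrowing effect.
sample_docs = [
    {"title": "A", "language": "Korean, English", "rating": "UNRATED"},
    {"title": "B", "language": "Korean, English", "rating": "R"},
    {"title": "C", "language": "English", "rating": "UNRATED"},
]

by_language = [d["title"] for d in sample_docs if matches(d, language_filter)]
by_both = [d["title"] for d in sample_docs if matches(d, combined_filter)]
```

Against a real collection, the combined dictionary would simply be passed to find in place of the single-field one. Note that in Python the keys must be quoted, since the filter is an ordinary dictionary.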