Next, we'll describe aggregation functions. We have seen the first query before. SELECT count(*) simply translates to a count function. We could also say db.Drinkers.find().count(), but using count directly is more straightforward. Now, let's count the drinkers who have an address. We don't care what the address is; we just care whether it exists. This is accomplished through the $exists: true expression. Thus, if an address exists for a drinker, that drinker will be counted. Another place where we need to count is when we have an array-valued attribute, like places. If we just want the number of elements in the raw list, we find the document in db.country and take places.length, and we'll get six. However, if we want distinct values, we'll use distinct instead of find and then use length to count the number of distinct elements, in this case four. Now, MongoDB uses an internal machinery called the aggregation framework, which is modeled on the concept of data processing pipelines. That means documents enter a multi-stage pipeline that transforms them at each stage until they become an aggregated result. We have seen a similar mechanism for relational data. An aggregation pipeline starts with the aggregate primitive. The most basic pipeline stages provide filters that operate like queries, and document transformations that modify the form of the output document. The primary filter operation is $match, which is followed by a query condition, in this case that status is A. As expected, the $match operation produces a smaller number of documents to be processed at the next stage. This is usually followed by the $group operation. This operation needs to know which attribute the documents should be grouped on. In the example here, cust_id is the grouping attribute, so it is passed as a parameter to the $group function. Now notice the syntax.
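A pipeline of the kind described here can be sketched as follows. The pipeline is written as Python data, the form a driver such as PyMongo accepts in collection.aggregate(pipeline), and a small helper emulates the two stages in plain Python so the example runs without a server. The sample documents, the output field name total, and the helper run_pipeline are assumptions made for illustration, not part of the lecture.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names follow the lecture
# (cust_id, amount, status), the values are made up.
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]

# Filter first, then group. "total" is an assumed name for the aggregate.
pipeline = [
    {"$match": {"status": "A"}},
    {"$group": {"_id": "$cust_id", "total": {"$sum": "$amount"}}},
]


def run_pipeline(docs, stages):
    """Toy emulation: only equality $match and $group with a single $sum."""
    for stage in stages:
        if "$match" in stage:
            cond = stage["$match"]
            docs = [d for d in docs
                    if all(d.get(k) == v for k, v in cond.items())]
        elif "$group" in stage:
            spec = stage["$group"]
            key_field = spec["_id"].lstrip("$")      # "$cust_id" -> "cust_id"
            out_field, reducer = next(
                (k, v) for k, v in spec.items() if k != "_id")
            sum_field = reducer["$sum"].lstrip("$")  # "$amount" -> "amount"
            totals = defaultdict(int)
            for d in docs:
                totals[d[key_field]] += d[sum_field]
            docs = [{"_id": k, out_field: v} for k, v in totals.items()]
    return docs


result = run_pipeline(orders, pipeline)
print(result)
```

The order with status D is dropped by the filter before grouping, so only three documents reach the $group stage.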
_id: "$cust_id" says that the grouped data will have an _id attribute, and its value will be picked from the cust_id attribute from the previous stage of computation. Thus, the $ before cust_id tells the system that cust_id is a known variable in the system and not a constant. The $group operation also needs a reducer, which is a function that operates on an attribute to produce an aggregate result. In this case, the reduce function is $sum, which operates on the amount attribute from the previous stage. Like $cust_id, we use $amount to indicate that it's a variable. As we saw in the relational case, data can be partitioned into chunks on the same machine or on different machines. These chunks are called shards. The aggregation pipeline of MongoDB can operate on a sharded collection. The grouping operation in MongoDB can accept multiple attributes, like the four shown here. Also shown is a post-grouping directive to sort on the basis of two attributes. The first is a computed count variable, sorted in ascending order; the 1 designates ascending order. The next sorting attribute is secondary. That means if two groups have the same value for count, they'll be further sorted based on the category value, but this time the order is descending because of the -1 directive. In course two we saw Solr, a text search engine from Apache. MongoDB has a built-in text search engine, which can be invoked through the same aggregation framework we saw before. Imagine that the MongoDB documents in this case are really text documents placed in a collection called articles. In this case, the $match directive of the aggregate function must be told, through $text, that it's going to perform a text function on the articles corpus. The actual text function is $search. We set search terms like "Hillary Democrat" such that having either term in a document will satisfy the search requirement. As is the case with any text engine, a search returns a list of documents, each with a score.
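As a rough sketch, a text search pipeline of this kind could be written as the Python data below, in the form a driver such as PyMongo accepts in collection.aggregate(pipeline). The $sort and $project stages are the ones the lecture turns to next. The sample articles and the helper matches_any_term are assumptions for illustration; the helper only mimics the "either term matches" behavior and does not reproduce MongoDB's scoring.

```python
# Text search pipeline: match on search terms, sort by relevance score,
# output only the title.
text_pipeline = [
    {"$match": {"$text": {"$search": "Hillary Democrat"}}},
    {"$sort": {"score": {"$meta": "textScore"}}},
    {"$project": {"title": 1, "_id": 0}},
]

# Hypothetical articles, made up for this example.
articles = [
    {"_id": 1, "title": "Primary results",
     "body": "Hillary leads the Democrat primary"},
    {"_id": 2, "title": "Weather report",
     "body": "Rain expected over the weekend"},
    {"_id": 3, "title": "Party platform",
     "body": "The Democrat platform was published"},
]


def matches_any_term(doc, search):
    """True if the body contains at least one search term (case-insensitive).

    Simplification: a real $text search also stems words, drops stop words,
    and requires a text index; none of that is modeled here.
    """
    terms = {t.lower() for t in search.split()}
    words = {w.lower() for w in doc["body"].split()}
    return bool(terms & words)


search = text_pipeline[0]["$match"]["$text"]["$search"]
matched_titles = [d["title"] for d in articles if matches_any_term(d, search)]
print(matched_titles)
```

Articles 1 and 3 qualify because each contains at least one of the two terms; article 2 contains neither.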
The next task is to tell MongoDB to sort the results based on textScore. What's the $meta here? Meta stands for metadata, that is, additional information. Remember that the aggregation operations are executed in a pipeline. Any step in the pipeline can produce some extra data, or metadata, for each processed document. In this example, the metadata produced by the search function is a computed attribute called textScore. So this directive tells the system to pick up this specific metadata attribute and use it to populate the score attribute, which is then used for sorting. Finally, the $project clause does exactly what is expected. It tells the system to output only the title of each document and suppress its _id. The last item in our discussion of MongoDB is join. We have seen that join is a vital operation for data management. Interestingly, MongoDB introduced this equivalent of join only in version 3.2. Joining in MongoDB also happens in the aggregation framework. There are a few ways of expressing joins in MongoDB. We show one here that explicitly performs a join through a function called $lookup. We use an example from the MongoDB documentation. Here are two document collections, orders and inventory. Notice that the item key in orders has values abc, jkl, etc., and the sku key in inventory has comparable values abc, def, etc. So these two are joinable by value. To specify this join, one can use this query. The db.orders.aggregate declaration states that orders is the home, or local, collection. Within the aggregate, the $lookup function needs to know what to look up for each document in orders. The from attribute specifies the name of the other collection, inventory. The next two parameters are the local and foreign matching keys, which are item and sku, respectively. The last item, as, is the construction part of the join operation, which says how to structure the matched items in the result.
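The join just described can be sketched as a single $lookup stage, written here as Python data in the form a driver such as PyMongo accepts in collection.aggregate(pipeline). The output field name inventory_docs follows the MongoDB documentation example; the variable names are assumptions, and no database connection is made.

```python
# One-stage pipeline: for each document in orders, look up inventory
# documents whose sku equals the order's item.
lookup_pipeline = [
    {
        "$lookup": {
            "from": "inventory",       # the foreign collection
            "localField": "item",      # key in orders (the local collection)
            "foreignField": "sku",     # key in inventory
            "as": "inventory_docs",    # array field that holds the matches
        }
    }
]

# With a running server and a database handle db (assumed, not created
# here), this would be executed as: db.orders.aggregate(lookup_pipeline)
stage = lookup_pipeline[0]["$lookup"]
print(stage["from"], stage["localField"], stage["foreignField"], stage["as"])
```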
Now, before we show you the results, let's see what should match. The abc item in orders should match the abc in sku. Similarly, the jkl item should match the jkl in sku. Okay, but there is one more twist. Here is the actual result. The first two records show exactly what we expect. There is a new field called inventory_docs in the matching record. The third record, however, shows something interesting. Inventory has two records shown here. What do they match? They match the empty document in orders, because orders has a document whose item field is null. So it matches documents in inventory where the sku is also null, explicitly as in document 5, or implicitly as in document 6. This concludes our discussion of queries in the context of MongoDB.
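To make the null-matching twist concrete, here is a minimal plain-Python emulation of the join semantics described above. It is a sketch, not MongoDB's implementation: the function lookup and the trimmed-down sample documents are assumptions, with the document numbers chosen to follow the lecture (5 has an explicit null sku, 6 has no sku at all).

```python
# Trimmed-down stand-ins for the two collections.
orders = [
    {"_id": 1, "item": "abc"},
    {"_id": 2, "item": "jkl"},
    {"_id": 3},                   # no item: treated as null when matching
]
inventory = [
    {"_id": 1, "sku": "abc"},
    {"_id": 2, "sku": "def"},
    {"_id": 4, "sku": "jkl"},
    {"_id": 5, "sku": None},      # sku explicitly null
    {"_id": 6},                   # sku missing, i.e. implicitly null
]


def lookup(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Left outer join: every local document is kept and gains an array
    of the foreign documents whose key equals its own key."""
    joined = []
    for doc in local_docs:
        key = doc.get(local_field)          # a missing field becomes None
        matches = [f for f in foreign_docs
                   if f.get(foreign_field) == key]
        joined.append({**doc, as_field: matches})
    return joined


result = lookup(orders, inventory, "item", "sku", "inventory_docs")
for row in result:
    print(row["_id"], [m["_id"] for m in row["inventory_docs"]])
```

Every order appears in the output, and the third order picks up both inventory documents that have no usable sku.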