Let's take a moment to learn about using Accumulator Expressions with the $project stage.

Knowledge of how to use these expressions can greatly simplify our work. One important thing to keep in mind is that Accumulator Expressions within $project work over an array within the given document. They do not carry values forward to each document encountered.

Let's suppose we have a collection named example with this schema. If we perform this aggregation, this will be the result: an output document for every input document, with the average of that document's data field.
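The schema and aggregation shown on screen are not captured in this transcript. As a rough sketch, assuming each document in example holds a numeric array in a field called data, the pipeline might look like the one below. The sample documents and the output field name data_avg are made up for illustration.

```javascript
// Hypothetical documents; the real schema is not shown in the transcript.
const exampleDocs = [
  { _id: 0, data: [1, 2, 3, 4, 5] },
  { _id: 1, data: [1, 3, 5, 7, 9] },
  { _id: 2, data: [2, 4, 6, 8] },
];

// Pipeline as it could be passed to db.example.aggregate(avgPipeline).
const avgPipeline = [
  { $project: { data_avg: { $avg: "$data" } } },
];

// Plain JavaScript equivalent of what $avg does inside $project:
// one result per document, computed only from that document's own array.
const avgPerDoc = exampleDocs.map((doc) => ({
  _id: doc._id,
  data_avg: doc.data.reduce((sum, x) => sum + x, 0) / doc.data.length,
}));

console.log(avgPerDoc.map((d) => d.data_avg)); // [ 3, 5, 5 ]
```

Note that three input documents give three output documents; nothing is combined across them.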

For this lesson, we're going to explore this data set. It has the average monthly low and high temperatures for the United States, as well as monthly ice cream consumer price index and sales information. And here's what the data looks like in our collection. We can see we have the trends array with documents that contain all the information we'll need, easy enough to work with.

Let's go ahead and find the maximum and minimum values for the average high temperature. We'll explore two different methods to find the maximum. First, we'll use the $reduce expression to manually find the maximum. Before I run this, let's break it down.

Here, I'm specifying the $reduce expression. $reduce takes an array as its input argument here. For the initialValue argument, the value or accumulator we'll begin with, we're specifying negative Infinity. I hope we'll never have a monthly average high temperature of negative Infinity, but in all seriousness, we're using negative Infinity because any reasonable value we encounter should be greater.

Lastly, we'll specify the logic in the in field here. This is using the $cond conditional operator and saying, if $$this.avg_high_tmp is greater than the $$value, which is held in our accumulator, then return $$this.avg_high_tmp. Otherwise, just return the $$value back. So, compare the current value against the accumulator value and, if it's greater, we'll replace it with the value we just encountered. Otherwise, we'll just keep using our current max value.

Notice the double dollar signs. These are temporary variables defined for use only within the $reduce expression, as we mentioned in the aggregation structure and syntax lesson. $$this refers to the current element in the array. $$value refers to the accumulator value. It will do this for every element in the array.
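The on-screen code is not part of this transcript, so here is a sketch of what the $reduce stage described above could look like, followed by the same logic in plain JavaScript. The sample document and its temperature values are invented; only the names trends, avg_high_tmp and max_high come from the lesson.

```javascript
// Invented sample document; the real data set is not reproduced here.
const highTrendsDoc = {
  trends: [
    { avg_high_tmp: 40 },
    { avg_high_tmp: 85 },
    { avg_high_tmp: 60 },
  ],
};

// $reduce walks the trends array, keeping the larger of the running
// accumulator ($$value) and the current element's avg_high_tmp ($$this).
const reducePipeline = [
  {
    $project: {
      _id: 0,
      max_high: {
        $reduce: {
          input: "$trends",
          initialValue: -Infinity,
          in: {
            $cond: [
              { $gt: ["$$this.avg_high_tmp", "$$value"] },
              "$$this.avg_high_tmp",
              "$$value",
            ],
          },
        },
      },
    },
  },
];

// The same idea with Array.prototype.reduce: `value` plays the role of
// $$value and `current` plays the role of $$this.
const maxHighByReduce = highTrendsDoc.trends.reduce(
  (value, current) =>
    current.avg_high_tmp > value ? current.avg_high_tmp : value,
  -Infinity
);

console.log(maxHighByReduce); // 85
```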

Okay, let's run this. And we see the max_high was 87. Wow, that was pretty complicated. Let's look at an easier way to accomplish this. I think we can all agree that this is much simpler. We use the $max group accumulator expression to get the information we want. And again, we get max_high of 87.
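For comparison, the simpler version might be written roughly as follows. This is a sketch with invented sample values, not the exact code from the video. The path "$trends.avg_high_tmp" resolves to an array of the avg_high_tmp values from each subdocument, and $max picks the largest.

```javascript
// Invented sample document for illustration.
const maxTrendsDoc = {
  trends: [
    { avg_high_tmp: 40 },
    { avg_high_tmp: 85 },
    { avg_high_tmp: 60 },
  ],
};

// One expression replaces the whole $reduce / $cond construction.
const maxPipeline = [
  { $project: { _id: 0, max_high: { $max: "$trends.avg_high_tmp" } } },
];

// Plain JavaScript equivalent: collect the field from each element,
// then take the maximum.
const maxHigh = Math.max(...maxTrendsDoc.trends.map((t) => t.avg_high_tmp));

console.log(maxHigh); // 85
```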

Okay, let's get the minimum average temperature. Here, we use the $min accumulator expression, and we can see our minimum low was 27.
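A matching sketch for the minimum is below. The transcript does not show the field names used here, so avg_low_tmp and min_low are assumptions chosen by analogy with avg_high_tmp and max_high, and the values are invented.

```javascript
// Invented sample document; avg_low_tmp is an assumed field name.
const lowTrendsDoc = {
  trends: [
    { avg_low_tmp: 30 },
    { avg_low_tmp: 55 },
    { avg_low_tmp: 41 },
  ],
};

// Same shape as the $max stage, with $min swapped in.
const minPipeline = [
  { $project: { _id: 0, min_low: { $min: "$trends.avg_low_tmp" } } },
];

// Plain JavaScript equivalent.
const minLow = Math.min(...lowTrendsDoc.trends.map((t) => t.avg_low_tmp));

console.log(minLow); // 30
```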

All right, we now know how to use max and min. We can also calculate averages and standard deviations. Let's calculate the average consumer price index for ice cream as well as the standard deviation.

Here, we're calculating both in one pass. For the average_cpi field, we specified the $avg, average, expression, telling it to average the values in the icecream_cpi field in the $trends array. And here, the cpi_deviation is calculated almost identically, except we're using the population standard deviation. We're using $stdDevPop because we're looking at the entire set of data. However, if this was only a sample of our data, we'd use the sample standard deviation expression, $stdDevSamp.
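The stage described here could be sketched as follows, using the field names mentioned in the lesson (average_cpi, cpi_deviation, icecream_cpi). The CPI values are invented so that the arithmetic comes out to round numbers.

```javascript
// Invented CPI values chosen so the mean and deviation are whole numbers.
const cpiDoc = {
  trends: [212, 214, 214, 214, 215, 215, 217, 219].map((cpi) => ({
    icecream_cpi: cpi,
  })),
};

// Both fields are computed in a single $project stage.
const cpiPipeline = [
  {
    $project: {
      _id: 0,
      average_cpi: { $avg: "$trends.icecream_cpi" },
      cpi_deviation: { $stdDevPop: "$trends.icecream_cpi" },
    },
  },
];

// Plain JavaScript equivalent.
const cpiValues = cpiDoc.trends.map((t) => t.icecream_cpi);
const averageCpi = cpiValues.reduce((sum, x) => sum + x, 0) / cpiValues.length;

// Population standard deviation divides by n; the sample version
// ($stdDevSamp) would divide by n - 1 instead.
const squaredDeviations = cpiValues.map((x) => (x - averageCpi) ** 2);
const cpiDeviation = Math.sqrt(
  squaredDeviations.reduce((sum, x) => sum + x, 0) / cpiValues.length
);

console.log(averageCpi, cpiDeviation); // 215 2
```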

Great. We can see that the average consumer price index was 221.275 and the standard deviation was around 6.63. We could use this information to find data that is outside norms to point to areas that might need special analysis.

The last accumulator expression I'd like to show is $sum. As the name implies, $sum sums up the values of an array. We can see that the yearly_sales were 1,601 million. And that covers the Accumulator Expressions available within $project.
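A sketch of the $sum stage is below. The transcript names the output field yearly_sales but not the source field, so icecream_sales_in_millions is an assumed name, and the monthly figures are invented.

```javascript
// Invented monthly sales; icecream_sales_in_millions is an assumed field name.
const salesDoc = {
  trends: [
    { icecream_sales_in_millions: 100 },
    { icecream_sales_in_millions: 150 },
    { icecream_sales_in_millions: 200 },
  ],
};

// $sum adds up the values pulled from each element of the trends array.
const sumPipeline = [
  {
    $project: {
      _id: 0,
      yearly_sales: { $sum: "$trends.icecream_sales_in_millions" },
    },
  },
];

// Plain JavaScript equivalent.
const yearlySales = salesDoc.trends.reduce(
  (total, t) => total + t.icecream_sales_in_millions,
  0
);

console.log(yearlySales); // 450
```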

Here are a few things to keep in mind. The available Accumulator Expressions in $project are $sum, $avg, $max, $min, $stdDevPop, and $stdDevSamp. Within $project, these expressions will not carry their value forward and operate across multiple documents. For this, we need to use the $unwind stage and group accumulator expressions.
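To make the contrast concrete, here is a sketch of the $unwind plus $group approach for averaging across all documents at once. It assumes documents with a numeric array field called data, as in the earlier example collection; the sample values and the output field name overall_avg are made up.

```javascript
// Hypothetical documents with a numeric array field.
const manyDocs = [
  { _id: 0, data: [1, 2, 3] },
  { _id: 1, data: [4, 5] },
  { _id: 2, data: [6] },
];

// $unwind emits one document per array element; $group with _id: null
// then folds all of them into a single result.
const acrossDocsPipeline = [
  { $unwind: "$data" },
  { $group: { _id: null, overall_avg: { $avg: "$data" } } },
];

// Plain JavaScript equivalent: flatten every array, then average once.
const allValues = manyDocs.flatMap((doc) => doc.data);
const overallAvg =
  allValues.reduce((sum, x) => sum + x, 0) / allValues.length;

console.log(overallAvg); // 3.5
```

With $avg inside $project, the same documents would instead yield three separate averages (2, 4.5 and 6), one per document.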

For more complex calculations, it's handy to know how to use $reduce and $map.
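The lesson does not show a $map example, so the following is only one possible illustration: $map builds an array of monthly temperature ranges, and $max then picks the widest. The field avg_low_tmp, the output name max_range, the variable name month and all values are assumptions.

```javascript
// Invented sample document; avg_low_tmp is an assumed field name.
const rangeDoc = {
  trends: [
    { avg_high_tmp: 40, avg_low_tmp: 25 },
    { avg_high_tmp: 85, avg_low_tmp: 62 },
    { avg_high_tmp: 60, avg_low_tmp: 44 },
  ],
};

// $map transforms each element of trends into (high - low);
// $max then operates on the resulting array of numbers.
const mapPipeline = [
  {
    $project: {
      _id: 0,
      max_range: {
        $max: {
          $map: {
            input: "$trends",
            as: "month",
            in: {
              $subtract: ["$$month.avg_high_tmp", "$$month.avg_low_tmp"],
            },
          },
        },
      },
    },
  },
];

// Plain JavaScript equivalent.
const ranges = rangeDoc.trends.map((m) => m.avg_high_tmp - m.avg_low_tmp);
const maxRange = Math.max(...ranges);

console.log(ranges, maxRange); // [ 15, 23, 16 ] 23
```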
