Navigation

Data Aggregation with the mongo Shell

Overview

MongoDB can perform aggregation operations, such as grouping by a specified key and evaluating a total or a count for each distinct group.

Use the aggregate() method to perform a stage-based aggregation. The aggregate() method accepts as its argument an array of stages, where each stage, processed sequentially, describes a data processing step.

db.collection.aggregate( [ <stage1>, <stage2>, ... ] )

Prerequisites

The examples in this section use the restaurants collection in the test database. For instructions on populating the collection with the sample dataset, see Import Example Dataset.

In the mongo shell connected to a running mongod instance, switch to the test database.

use test

Group Documents by a Field and Calculate Count

Use the $group stage to group by a specified key. In the $group stage, specify the group by key in the _id field. $group accesses fields by the field path, which is the field name prefixed by a dollar sign $. The $group stage can use accumulators to perform calculations for each group. The following example groups the documents in the restaurants collection by the borough field and uses the $sum accumulator to count the documents for each group.

db.restaurants.aggregate(
   [
     { $group: { "_id": "$borough", "count": { $sum: 1 } } }
   ]
);

The result set consists of the following documents:

{ "_id" : "Staten Island", "count" : 969 }
{ "_id" : "Brooklyn", "count" : 6086 }
{ "_id" : "Manhattan", "count" : 10259 }
{ "_id" : "Queens", "count" : 5656 }
{ "_id" : "Bronx", "count" : 2338 }
{ "_id" : "Missing", "count" : 51 }

The _id field contains the distinct borough value, i.e., the group by key value.

Filter and Group Documents

Use the $match stage to filter documents. $match uses the MongoDB query syntax. The following pipeline uses $match to query the restaurants collection for documents with borough equal to "Queens" and cuisine equal to Brazilian. Then the $group stage groups the matching documents by the address.zipcode field and uses the $sum accumulator to calculate the count. $group accesses fields by the field path, which is the field name prefixed by a dollar sign $.

db.restaurants.aggregate(
   [
     { $match: { "borough": "Queens", "cuisine": "Brazilian" } },
     { $group: { "_id": "$address.zipcode" , "count": { $sum: 1 } } }
   ]
);

The result set consists of the following documents:

{ "_id" : "11368", "count" : 1 }
{ "_id" : "11106", "count" : 3 }
{ "_id" : "11377", "count" : 1 }
{ "_id" : "11103", "count" : 1 }
{ "_id" : "11101", "count" : 2 }

The _id field contains the distinct zipcode value, i.e., the group by key value.

Additional Information

In the MongoDB Manual, see aggregate().

In the MongoDB Manual, see also Aggregation Quick Reference, the SQL to Aggregation Mapping Chart, and Aggregation Introduction.