Navigation

$group (aggregation)

$group

Groups input documents by the specified _id expression and for each distinct grouping, outputs a document. The _id field of each output document contains the unique group by value. The output documents can also contain computed fields that hold the values of some accumulator expression.

Note

$group does not order its output documents.

The $group stage has the following prototype form:

{
$group:
{
_id: <expression>, // Group By Expression
<field1>: { <accumulator1> : <expression1> },
...
}
}
Field
Description
_id
Required. If you specify an _id value of null, or any other constant value, the $group stage calculates accumulated values for all the input documents as a whole. See example of Group by Null.
field
Optional. Computed using the accumulator operators.

The _id and the accumulator operators can accept any valid expression. For more information on expressions, see Expressions.

The <accumulator> operator must be one of the following accumulator operators:

Name
Description
Returns the result of a user-defined accumulator function.
Returns an array of unique expression values for each group. Order of the array elements is undefined.
Returns an average of numerical values. Ignores non-numeric values.

Returns a value from the first document for each group. Order is only defined if the documents are in a defined order.

Distinct from the $first array operator.

Returns a value from the last document for each group. Order is only defined if the documents are in a defined order.

Distinct from the $last array operator.

Returns the highest expression value for each group.
Returns a document created by combining the input documents for each group.
Returns the lowest expression value for each group.
Returns an array of expression values for each group.
Returns the population standard deviation of the input values.
Returns the sample standard deviation of the input values.
Returns a sum of numerical values. Ignores non-numeric values.

The $group stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $group returns an error. To allow more space for stage processing, use the allowDiskUse option to enable aggregation pipeline stages to write data to temporary files.

Tip
See also:

If a pipeline sorts and groups by the same field and the $group stage only uses the $first accumulator operator, consider adding an index on the grouped field which matches the sort order. In some cases, the $group stage can use the index to quickly find the first document of each group.

Example

If a collection named foo contains an index { x: 1, y: 1 }, the following pipeline can use that index to find the first document of each group:

db.foo.aggregate([
{
$sort:{ x : 1, y : 1 }
},
{
$group: {
_id: { x : "$x" },
y: { $first : "$y" }
}
}
])

From the mongo shell, create a sample collection named sales with the following documents:

db.sales.insertMany([
{ "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
{ "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
{ "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
{ "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
{ "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
{ "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
{ "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
{ "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
])

The following aggregation operation uses the $group stage to count the number of documents in the sales collection:

db.sales.aggregate( [
{
$group: {
_id: null,
count: { $sum: 1 }
}
}
] )

The operation returns the following result:

{ "_id" : null, "count" : 8 }

This aggregation operation is equivalent to the following SQL statement:

SELECT COUNT(*) AS count FROM sales
Tip
See also:

The following aggregation operation uses the $group stage to retrieve the distinct item values from the sales collection:

db.sales.aggregate( [ { $group : { _id : "$item" } } ] )

The operation returns the following result:

{ "_id" : "abc" }
{ "_id" : "jkl" }
{ "_id" : "def" }
{ "_id" : "xyz" }

The following aggregation operation groups documents by the item field, calculating the total sale amount per item and returning only the items with total sale amount greater than or equal to 100:

db.sales.aggregate(
[
// First Stage
{
$group :
{
_id : "$item",
totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }
}
},
// Second Stage
{
$match: { "totalSaleAmount": { $gte: 100 } }
}
]
)
First Stage:
The $group stage groups the documents by item to retrieve the distinct item values. This stage returns the totalSaleAmount for each item.
Second Stage:
The $match stage filters the resulting documents to only return items with a totalSaleAmount greater than or equal to 100.

The operation returns the following result:

{ "_id" : "abc", "totalSaleAmount" : NumberDecimal("170") }
{ "_id" : "xyz", "totalSaleAmount" : NumberDecimal("150") }
{ "_id" : "def", "totalSaleAmount" : NumberDecimal("112.5") }

This aggregation operation is equivalent to the following SQL statement:

SELECT item,
Sum(( price * quantity )) AS totalSaleAmount
FROM sales
GROUP BY item
HAVING totalSaleAmount >= 100
Tip
See also:

From the mongo shell, create a sample collection named sales with the following documents:

db.sales.insertMany([
{ "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
{ "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
{ "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
{ "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
{ "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
{ "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
{ "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
{ "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
])

The following pipeline calculates the total sales amount, average sales quantity, and sale count for each day in the year 2014:

db.sales.aggregate([
// First Stage
{
$match : { "date": { $gte: new ISODate("2014-01-01"), $lt: new ISODate("2015-01-01") } }
},
// Second Stage
{
$group : {
_id : { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
averageQuantity: { $avg: "$quantity" },
count: { $sum: 1 }
}
},
// Third Stage
{
$sort : { totalSaleAmount: -1 }
}
])
First Stage:
The $match stage filters the documents to only pass documents from the year 2014 to the next stage.
Second Stage:
The $group stage groups the documents by date and calculates the total sale amount, average quantity, and total count of the documents in each group.
Third Stage:
The $sort stage sorts the results by the total sale amount for each group in descending order.

The operation returns the following results:

{ "_id" : "2014-04-04", "totalSaleAmount" : NumberDecimal("200"), "averageQuantity" : 15, "count" : 2 }
{ "_id" : "2014-03-15", "totalSaleAmount" : NumberDecimal("50"), "averageQuantity" : 10, "count" : 1 }
{ "_id" : "2014-03-01", "totalSaleAmount" : NumberDecimal("40"), "averageQuantity" : 1.5, "count" : 2 }

This aggregation operation is equivalent to the following SQL statement:

SELECT date,
Sum(( price * quantity )) AS totalSaleAmount,
Avg(quantity) AS averageQuantity,
Count(*) AS Count
FROM sales
GROUP BY Date(date)
ORDER BY totalSaleAmount DESC
Tip
See also:

The following aggregation operation specifies a group _id of null, calculating the total sale amount, average quantity, and count of all documents in the collection.

db.sales.aggregate([
{
$group : {
_id : null,
totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
averageQuantity: { $avg: "$quantity" },
count: { $sum: 1 }
}
}
])

The operation returns the following result:

{
"_id" : null,
"totalSaleAmount" : NumberDecimal("452.5"),
"averageQuantity" : 7.875,
"count" : 8
}

This aggregation operation is equivalent to the following SQL statement:

SELECT Sum(price * quantity) AS totalSaleAmount,
Avg(quantity) AS averageQuantity,
Count(*) AS Count
FROM sales
Tip
See also:

From the mongo shell, create a sample collection named books with the following documents:

db.books.insertMany([
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])

The following aggregation operation pivots the data in the books collection to have titles grouped by authors.

db.books.aggregate([
{ $group : { _id : "$author", books: { $push: "$title" } } }
])

The operation returns the following documents:

{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }

The following aggregation operation groups documents by author:

db.books.aggregate([
// First Stage
{
$group : { _id : "$author", books: { $push: "$$ROOT" } }
},
// Second Stage
{
$addFields:
{
totalCopies : { $sum: "$books.copies" }
}
}
])
First Stage:

$group uses the $$ROOT system variable to group the entire documents by authors. This stage passes the following documents to the next stage:

{ "_id" : "Homer",
"books" :
[
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
]
},
{ "_id" : "Dante",
"books" :
[
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
]
}
Second Stage:

$addFields adds a field to the output containing the total copies of books for each author.

Note

The resulting documents must not exceed the BSON Document Size limit of 16 megabytes.

The operation returns the following documents:

{
"_id" : "Homer",
"books" :
[
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
],
"totalCopies" : 20
}
{
"_id" : "Dante",
"books" :
[
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
],
"totalCopies" : 5
}
Tip
See also:

The Aggregation with the Zip Code Data Set tutorial provides an extensive example of the $group operator in a common use case.

Give Feedback

On this page

  • Definition
  • Considerations
  • Examples
  • Additional Resources