$out (aggregation)

New in version 2.6.

Definition

$out

Takes the documents returned by the aggregation pipeline and writes them to a specified collection. The $out operator lets the aggregation framework return result sets of any size.

Syntax

Note

MongoDB 4.2 adds a new syntax structure that implements expanded functionality and flexibility around merging aggregation pipeline results into a target collection, including support for sharded collections and output modes that preserve the existing collection data.

MongoDB 4.2 continues support for the legacy $out syntax, which does not support any of the new functionality of the updated $out syntax.

The $out stage can have one of the following syntaxes:

  • New in version 4.2:
    {
       $out:
         {
           to : "<output-collection>",
           mode : "insertDocuments" | "replaceDocuments" | "replaceCollection",
           db : "<output-database>",
           uniqueKey: <document>
         }
    }
    

    The $out takes a document with the following fields:

    Field Description
    to

    Specifies the target collection.

    If the mode is replaceDocuments or insertDocuments, the target collection can only appear in the $out stage of the aggregation pipeline; i.e. the other stages in the pipeline must not access the collection.

    To output to a sharded collection:

    • The collection must already exist.
    • The mode must be replaceDocuments or insertDocuments.
    mode

    Specifies the mode for merging the aggregation pipeline output with the target collection:

    Mode Description
    replaceCollection

    $out replaces the to collection wholesale with the output from the aggregation pipeline. The to collection cannot be in a different database than the aggregation in this mode.

    Specifically, it writes documents to a temporary collection. If the to collection exists, it copies over indexes and collection options. Finally, $out renames the temporary collection to the to collection name.

    This is the same functionality as the old syntax.

    replaceDocuments

    $out writes documents directly to the to collection, performing a replacement-style update on any document in the to collection with the same uniqueKey pattern as the document from the pipeline.

    insertDocuments

    $out writes documents directly to the to collection, raising an error if a document exists in the to collection with the same uniqueKey pattern as one being written.

    To learn more about the behavior of mode, see Behaviors.

    db

    Optional. Specifies the database of the to collection. Defaults to the database against which the aggregation was run. In replaceCollection mode this must be the same database as the aggregation.
    uniqueKey

    Optional. Specifies the fields that uniquely identify a document in the to collection for replacement or merge. Defaults to the document key { _id : 1 } for an unsharded to collection or _id and all the fields of the shard key for a sharded to collection. All fields must have a value of 1. Duplicate keys are forbidden.

    If the to collection is sharded, the uniqueKey must contain all of the fields of the shard key. If the output collection does not exist, uniqueKey must be { _id : 1 }, the default uniqueKey.

    There must exist a unique index that covers the uniqueKey, for any ordering of the uniqueKey fields.

    This unique index:

    See $out Collision Detection with uniqueKey for more details.

  • New in version 2.6: The $out stage also has the following form:

    { $out: "<output-collection>" }
    

    It is an alias with the same effect as:

    {$out: {to: "<output-collection>", mode: "replaceCollection"}}
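
The equivalence between the two forms can be sketched as a small normalization step. The helper name normalizeOutSpec is hypothetical and used only for illustration; it is not a MongoDB API, and the sketch runs as plain JavaScript without a server connection.

```javascript
// Hypothetical helper (not part of MongoDB): expands the legacy string
// form of $out into the equivalent document form described above.
function normalizeOutSpec(spec) {
  if (typeof spec === "string") {
    // Legacy form: { $out: "<output-collection>" }
    return { to: spec, mode: "replaceCollection" };
  }
  // Document form: returned as a shallow copy, otherwise unchanged.
  return { ...spec };
}

// Both calls describe the same stage.
const legacyForm = normalizeOutSpec("authors");
const documentForm = normalizeOutSpec({ to: "authors", mode: "replaceCollection" });
console.log(legacyForm, documentForm);
```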
    

Behaviors

Replacing a Collection with the Aggregation Pipeline Results

The following behavior applies when specifying the replaceCollection mode (or using the older syntax).

Important

This mode is prohibited if the target collection is sharded or in a different database than the aggregation, as it can only create new unsharded collections in the aggregation database. To use a sharded collection or a collection in a different database than the aggregation, use either the replaceDocuments or insertDocuments mode.
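
As an illustration of these restrictions, the following sketch checks a $out specification before it is sent. The function checkOutMode and the shape of its target argument are assumptions made for this example, not a MongoDB API; the caller is assumed to already know whether the target collection is sharded.

```javascript
// Hypothetical pre-flight check (not a MongoDB API). Nothing here contacts a
// server: `target.sharded` is supplied by the caller.
function checkOutMode(spec, aggregationDb, target) {
  // db defaults to the database the aggregation runs against.
  const outputDb = spec.db === undefined ? aggregationDb : spec.db;
  const errors = [];
  if (spec.mode === "replaceCollection") {
    if (target.sharded) {
      errors.push("replaceCollection cannot target a sharded collection");
    }
    if (outputDb !== aggregationDb) {
      errors.push("replaceCollection cannot target a different database");
    }
  }
  return errors;
}

// A sharded target requires replaceDocuments or insertDocuments instead.
console.log(checkOutMode({ to: "seats", mode: "replaceCollection" }, "test", { sharded: true }));
```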

If the output collection does not exist, the $out operation creates a new unsharded collection in the same database as the aggregation. This collection is not visible until the aggregation completes. If the aggregation fails, MongoDB does not create the collection.

If the output collection already exists, then upon completion of the aggregation the $out stage atomically replaces the existing collection with the new results collection. Specifically, the $out operation:

  1. Creates a temp collection.
  2. Copies the indexes from the existing collection to the temp collection.
  3. Inserts the documents into the temp collection.
  4. Calls db.collection.renameCollection with dropTarget: true to rename the temp collection to the destination collection.

The $out operation does not change any indexes that existed on the previous collection. If the aggregation fails, the $out operation makes no changes to the pre-existing collection.
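
The four steps can be modeled in memory to show why a failed aggregation leaves the existing collection untouched. This is only a sketch: the database is a plain object mapping collection names to { indexes, docs }, the function name replaceCollectionSketch is hypothetical, and none of it reflects how the server stores data.

```javascript
// In-memory model for illustration only: name -> { indexes, docs }.
function replaceCollectionSketch(database, target, pipelineDocs) {
  const existing = database[target];
  // 1. Create a temp collection.
  const temp = { indexes: [], docs: [] };
  // 2. Copy the indexes from the existing collection, if there is one.
  //    Assumption: a brand-new collection has only the default _id index.
  temp.indexes = existing
    ? existing.indexes.map((ix) => ({ ...ix }))
    : [{ _id: 1 }];
  // 3. Insert the documents into the temp collection. An error thrown here
  //    happens before the rename, so database[target] is not modified.
  for (const doc of pipelineDocs) {
    if (temp.docs.some((d) => d._id === doc._id)) {
      throw new Error("duplicate _id: " + doc._id);
    }
    temp.docs.push({ ...doc });
  }
  // 4. Rename the temp collection over the target (dropTarget: true).
  database[target] = temp;
  return database;
}

const sketchDb = {
  authors: {
    indexes: [{ _id: 1 }, { books: 1 }],
    docs: [{ _id: "Virgil", books: ["Eclogues"] }],
  },
};
replaceCollectionSketch(sketchDb, "authors", [
  { _id: "Homer", books: ["The Odyssey", "Iliad"] },
]);
// sketchDb.authors now holds only the Homer document and keeps both indexes.
```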

Using $out to Replace Documents in an Existing Collection

The following behavior applies when specifying the replaceDocuments mode.

If the output collection doesn’t already exist, the $out operation creates a new unsharded collection in the db database. Otherwise, it uses the existing collection, which can be sharded. The $out stage writes each document in the aggregation pipeline result set directly to the to collection.

If an existing document has the same uniqueKey pattern as a document from the $out stage, the existing document is overwritten with a replacement-style update. If the $out stage produces multiple documents with the same uniqueKey pattern, it is indeterminate which of those multiple documents is stored for that uniqueKey pattern.

If there is no existing document with the same uniqueKey pattern, $out inserts the document instead.
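
The replace-or-insert rule can be sketched in memory as follows. The function names are hypothetical, key values are assumed to be scalars, and the sketch assumes that documents matching on uniqueKey also carry the same _id; it does not model any server-side constraint on changing _id. The data is the phonebook collection from the Examples section.

```javascript
// Builds a comparable string from the uniqueKey fields of a document.
// Hypothetical helper; scalar key values are assumed.
function keyOf(doc, uniqueKey) {
  return JSON.stringify(Object.keys(uniqueKey).map((f) => doc[f]));
}

// In-memory sketch of replaceDocuments: replace on a match, insert otherwise.
function replaceDocumentsSketch(targetDocs, pipelineDocs, uniqueKey = { _id: 1 }) {
  for (const doc of pipelineDocs) {
    const k = keyOf(doc, uniqueKey);
    const i = targetDocs.findIndex((d) => keyOf(d, uniqueKey) === k);
    if (i >= 0) {
      targetDocs[i] = { ...doc }; // replacement-style: the whole document is swapped
    } else {
      targetDocs.push({ ...doc }); // no match: insert
    }
  }
  return targetDocs;
}

const phonebookDocs = [
  { _id: "Susan", phone: "55555" },
  { _id: "Bob", phone: "12121" },
  { _id: "Anna", phone: "33344" },
];
replaceDocumentsSketch(phonebookDocs, [
  { _id: "Jake", phone: "63636" },
  { _id: "Anna", phone: "88888" },
]);
// Anna's document is replaced and Jake's is appended.
```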

Using $out to Insert Documents into an Existing Collection

The following behavior applies when specifying the insertDocuments mode.

If the output collection doesn’t already exist, the $out operation creates a new unsharded collection in the db database. Otherwise, it uses the existing collection, which can be sharded. The $out stage writes each document in the aggregation pipeline result set directly to the to collection.

If an existing document has the same uniqueKey pattern as a document from the $out stage, the pipeline raises an error and stops execution. The exact stop point is unspecified. Some inserts may have completed and will not roll back.
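
The lack of rollback can be illustrated with an in-memory sketch. The function name insertDocumentsSketch is hypothetical, and processing the documents strictly in order is an assumption of the sketch; as stated above, the real stop point is unspecified.

```javascript
// In-memory sketch of insertDocuments: any uniqueKey collision raises an error.
function insertDocumentsSketch(targetDocs, pipelineDocs, uniqueKey = { _id: 1 }) {
  const fields = Object.keys(uniqueKey);
  const sameKey = (a, b) => fields.every((f) => a[f] === b[f]);
  for (const doc of pipelineDocs) {
    if (targetDocs.some((d) => sameKey(d, doc))) {
      // Documents inserted earlier in this loop stay in targetDocs.
      throw new Error("duplicate key: " + JSON.stringify(fields.map((f) => doc[f])));
    }
    targetDocs.push({ ...doc });
  }
  return targetDocs;
}

const seatDocs = [{ _id: "Bill", seat: 12 }, { _id: "Sarah", seat: 2 }];
let collision = null;
try {
  insertDocumentsSketch(
    seatDocs,
    [{ _id: "Jake", seat: 5 }, { _id: "Beth", seat: 12 }],
    { seat: 1 }
  );
} catch (err) {
  collision = err.message;
}
// seatDocs now also contains Jake; Beth was rejected because seat 12 is taken.
```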

Index Constraints

The pipeline will fail to complete if the documents produced by the pipeline would violate any unique indexes, including the index on the _id field of the original output collection.

$out Generates the _id Field

If the _id field is not present in a document from the aggregation pipeline results at the time of writing, the $out stage generates it automatically.
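
A minimal sketch of this behavior follows. The helper names are hypothetical, and a counter-based string stands in for the identifier that MongoDB itself would generate, so that the example has no dependencies.

```javascript
// Hypothetical stand-in for the server's id generator. MongoDB generates its
// own identifier; a counter-based string is used here for illustration.
let generatedCount = 0;
function makeIdSketch() {
  generatedCount += 1;
  return "generated-" + generatedCount;
}

// Adds _id only to documents that lack the field at write time.
function withIdSketch(doc) {
  return Object.prototype.hasOwnProperty.call(doc, "_id")
    ? doc
    : { _id: makeIdSketch(), ...doc };
}

// A pipeline that drops _id (for example with { $project: { _id: 0, ... } })
// produces documents like the first one below.
const written = [{ title: "Iliad" }, { _id: 7000, title: "The Odyssey" }].map(withIdSketch);
```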

$out Collision Detection with uniqueKey

The $out stage uses the uniqueKey to check for collisions between documents, either to replace them (for replaceDocuments) or to throw an error (for insertDocuments). At plan time, MongoDB verifies that there exists an index that covers the uniqueKey, for any ordering of the uniqueKey fields, and fulfills the previously listed requirements.
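
The "any ordering" rule can be sketched as a comparison of field sets. This is not the server's planner logic: the function name is hypothetical, the index description shape ({ key, unique }) is chosen for the example, and the sketch assumes that an index covers the uniqueKey when its key fields are exactly the uniqueKey fields. Any additional index requirements are not modeled.

```javascript
// Hypothetical check. Assumption: "covers" means the index key fields are
// exactly the uniqueKey fields, in any order.
function indexSupportsUniqueKey(index, uniqueKey) {
  const indexFields = Object.keys(index.key).sort();
  const keyFields = Object.keys(uniqueKey).sort();
  // The _id index is always unique even when not flagged as such.
  const isUnique =
    index.unique === true ||
    (indexFields.length === 1 && indexFields[0] === "_id");
  return (
    isUnique &&
    indexFields.length === keyFields.length &&
    indexFields.every((f, i) => f === keyFields[i])
  );
}

// Field order in the index does not matter for the match.
console.log(indexSupportsUniqueKey({ key: { seat: 1, row: 1 }, unique: true }, { row: 1, seat: 1 }));
```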

Important

If the unique index used by uniqueKey is dropped mid-aggregation, there is no guarantee that the aggregation will be killed. If the aggregation continues, there is no guarantee that at most one document in the collection will match a uniqueKey pattern.

Other than the _id field, all fields in the uniqueKey must exist in the output documents and must not be null or an array. If one of the uniqueKey fields is or is inside an array, use $unwind to transform it.
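
The field rule above can be expressed as a small validator. The function name is hypothetical, and the sketch only looks at top-level fields; dotted paths and fields nested inside arrays are not handled.

```javascript
// Hypothetical validator mirroring the rule above: every uniqueKey field
// other than _id must be present, non-null, and not an array.
function uniqueKeyProblems(doc, uniqueKey) {
  const problems = [];
  for (const field of Object.keys(uniqueKey)) {
    if (field === "_id") continue; // _id is exempt from this rule
    const value = doc[field];
    if (value === undefined) problems.push(field + " is missing");
    else if (value === null) problems.push(field + " is null");
    else if (Array.isArray(value)) problems.push(field + " is an array");
  }
  return problems;
}

// A document whose seat field is an array would need $unwind first.
console.log(uniqueKeyProblems({ _id: "Jake", seat: [5, 6] }, { seat: 1 }));
```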

Transactions

$out is not allowed in transactions.

Views

New in version 4.2.

The $out stage is not allowed as part of a view definition.

linearizable Read Concern

Starting in MongoDB 4.2, the $out stage cannot be used in conjunction with read concern "linearizable". That is, if you specify "linearizable" read concern for db.collection.aggregate(), you cannot include the $out stage in the pipeline.
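
Together with the transaction and view restrictions above, this rule can be sketched as a simple client-side lint. The function name and the shape of the context argument are assumptions made for this example; MongoDB enforces these restrictions itself, and the sketch does not reproduce its error messages.

```javascript
// Hypothetical lint (not a MongoDB API) for the three restrictions above.
// `context` is an illustrative shape chosen for this sketch.
function outRestrictionErrors(pipeline, context) {
  const hasOut = pipeline.some((stage) =>
    Object.prototype.hasOwnProperty.call(stage, "$out")
  );
  if (!hasOut) return [];
  const errors = [];
  if (context.inTransaction) {
    errors.push("$out is not allowed in transactions");
  }
  if (context.isViewDefinition) {
    errors.push("$out is not allowed in a view definition");
  }
  if (context.readConcern === "linearizable") {
    errors.push('$out cannot be used with read concern "linearizable"');
  }
  return errors;
}

const examplePipeline = [
  { $match: { daySold: { $gt: 16 } } },
  { $out: { to: "seats", mode: "insertDocuments" } },
];
console.log(outRestrictionErrors(examplePipeline, { readConcern: "linearizable" }));
```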

Examples

replaceCollection

A collection books contains the following documents:

{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }

The following aggregation operation pivots the data in the books collection to have titles grouped by authors and then writes the results to the authors collection.

db.books.aggregate( [
                      { $group : { _id : "$author", books: { $push: "$title" } } },
                      { $out : { to : "authors", mode : "replaceCollection" } }
                  ] )

This aggregation is equivalent to:

db.books.aggregate( [
                      { $group : { _id : "$author", books: { $push: "$title" } } },
                      { $out : "authors" }
                  ] )

After the operation, the authors collection contains the following documents:

{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }

replaceDocuments

A collection phonebook holds employee names and phone numbers for a company’s employees. It contains the following documents:

{ "_id" : "Susan", "phone" : "55555" }
{ "_id" : "Bob", "phone" : "12121" }
{ "_id" : "Anna", "phone" : "33344" }

A collection employees contains the following documents:

{ "_id" : 8, "name": "Susan", "phone" : "55555", "title" : "developer", "updated" : 2016 }
{ "_id" : 9, "name": "Bob", "phone" : "12121", "title" : "developer", "updated" : 2017 }
{ "_id" : 10, "name": "Jake", "phone" : "63636", "title" : "intern", "updated" : 2018 }
{ "_id" : 11, "name": "Anna", "phone" : "88888", "title" : "lead", "updated" : 2018 }

The following aggregation pipeline:

  • Matches all employees whose updated field is greater than or equal to 2018.
  • Sorts by the document _id number.
  • Groups by name and takes the $last phone value associated with each employee.
  • Outputs the result set to phonebook using the replaceDocuments mode.

db.employees.aggregate( [
                      { $match : { updated:  { $gte : 2018 } } },
                      { $sort : { _id : 1 } },
                      { $group : { _id : "$name", phone: {$last: "$phone" } } },
                      { $out : { to : "phonebook", mode : "replaceDocuments" } }
                  ] )

uniqueKey is unspecified, so the pipeline uses the default value of { _id : 1 } to identify documents for replacement. Because the $group stage sets _id to the employee name, the Anna document from the pipeline matches the existing Anna document in phonebook. After the operation, the phonebook collection contains the following documents:

{ "_id" : "Susan", "phone" : "55555" }
{ "_id" : "Bob", "phone" : "12121" }
{ "_id" : "Anna", "phone" : "88888" }
{ "_id" : "Jake", "phone" : "63636" }

Anna's document has been replaced, and Jake's document has been added.

insertDocuments and uniqueKey

A collection tickets contains the following documents:

{_id: "Bill", seat: 12, cost: 20, daySold: 14}
{_id: "Sarah", seat: 2, cost: 25, daySold: 16}

A sharded collection seats is sharded on { seat : 1 } and has the unique index { seat : 1 }. It contains the following documents:

{_id: "Bill", seat: 12}
{_id: "Sarah", seat: 2}

Two more tickets are sold and added to the tickets collection, such that the tickets collection now holds the following documents:

{_id: "Bill", seat: 12, cost: 20, daySold: 14}
{_id: "Sarah", seat: 2, cost: 25, daySold: 16}
{_id: "Jake", seat: 5, cost:25, daySold: 17}
{_id: "Beth", seat: 8, cost: 20, daySold: 19}

The following aggregation pipeline:

  • Matches all users whose daySold field is greater than 16.
  • Projects the _id and seat to get each person and their assigned seat.
  • Outputs the result set to seats using the insertDocuments mode.

The uniqueKey of { seat : 1} makes sure that no seat gets assigned twice.

db.tickets.aggregate( [
                    { $match : { daySold : { $gt : 16 } } },
                    { $project: { seat: 1 } },
                    { $out : { to : "seats", mode : "insertDocuments",
                      uniqueKey: { seat : 1 } } }
                ] )

After the operation, the seats collection contains the following documents:

{_id: "Bill", seat: 12}
{_id: "Sarah", seat: 2}
{_id: "Jake", seat: 5}
{_id: "Beth", seat: 8}