$lookup
¶
$lookup
performs a left outer join to an unsharded collection in the
same database to filter in documents from the “joined” collection for
processing. For more information, see
$lookup. In
Atlas Data Lake, $lookup
can be used to perform a join of collections
from different databases.
Syntax¶
The $lookup
syntax is
described in the MongoDB server manual. In Data Lake, the from
field in $lookup
has the following alternate syntax to support specifying
an object containing an optional database name and a required collection name:
{ $lookup: { localField: "<fieldName>", from: <collection-to-join>|{db: <db>, coll: <collection-to-join>}, foreignField: "<fieldName>", as: "<output-array-field>", } }
from
Field Object¶
Field | Type | Description | Necessity |
---|---|---|---|
db | string | The database name. If the database name is specified, data is read from the collection in the specified database. If you specify a database name that is different from the database upon which the command is operating, all nested $lookup stages must also specify a database name. If the database name is not specified within a $lookup stage, collections in the stage inherit the database name specified in the closest parent $lookup stage if it exists or the database upon which the command is operating. | Conditional |
coll | string | The collection name. | Required |
Examples¶
Suppose there are three databases named sourceDB1
, sourceDB2
, and
sourceDB3
with the following collections:
db.orders.insert([ { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 }, { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 }, { "_id" : 3 } ])
The following examples use the $lookup aggregation stage to join documents from one collection with the documents from the collection in the other databases.
Basic Example¶
The following aggregation operation on the sourceDB1.orders
collection joins the documents from the orders
collection with the documents
from the sourceDB2.catalog
collection using the item
field from the orders
collection and the sku
field from the catalog
collection:
db.getSiblingDb("sourceDB1").orders.aggregate( { $lookup: { from: { db: "sourceDB2", coll: "catalog" }, localField: "item", foreignField: "sku", as: "inventory_docs" } } )
Nested Example¶
The following aggregation operation on the sourceDB1.orders
collection joins the documents from the orders
collection with the documents from the sourceDB2.catalog
collection and the documents from the sourceDB3.warehouses
collection using the item
field from the orders
collection, the sku
field from the catalog
collection, and the stock_item
and instock
fields from the warehouses
collection:
db.getSiblingDb(“sourceDB1”).orders.aggregate( [ { $lookup: { from: db: "sourceDB2", coll: "catalog", let: { "order_sku": "$item" }, pipeline: [ { $match: { $expr: { $eq: ["$sku", "$$order_sku"] } } }, { $lookup: { from: db: "sourceDB3", coll: "warehouses", pipeline: [ { $match: { $expr:{ $eq : ["$stock_item", "$$order_sku"] } } }, { $project : { "instock": 1, "_id": 0} } ], as: "wh" } }, { "$unwind": "$wh" }, { $project : { "description": 1, "instock": "$wh.instock", "_id": 0} } ], as: "inventory" }, }, ] )