Run Queries Against Your Data Lake

Estimated completion time: 5 minutes

You can run operations using the MongoDB Query Language (MQL), which includes most, but not all, standard server commands. To learn which MQL operations are supported, see the MQL Support documentation.

Note

The Atlas Data Lake sample datasets are read-only.

To complete this part of the tutorial, you must have completed the previous parts of this tutorial.

You must be connected to your Data Lake with the mongo shell before running the following queries.

Before running the queries, switch to the Database0 database:

use Database0

The following queries use the paths that you added to your Data Lake during deployment. If you added all the paths to the sample datasets, you can run all the queries in the following tabs. If you only added some paths, click and run the queries in the appropriate tabs for the sample dataset paths that you added.

Find the number of Airbnb offerings with 3 bedrooms and a review score rating greater than 79:

db.Collection0.aggregate([{$match: {"bedrooms" : 3, "review_scores.review_scores_rating": {$gt: 79}} },{ $count: "numProperties"}])
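To make the two pipeline stages concrete, here is a minimal plain-JavaScript sketch of what $match followed by $count computes. It runs without a database connection. The sample documents, the listings array, and the helper names matchesFilter and countMatches are hypothetical and invented for illustration. The sketch also simplifies MQL matching rules: it ignores array-valued fields, for example.

```javascript
// Hypothetical documents shaped like the Airbnb listings; all values are invented.
const listings = [
  { name: "A", bedrooms: 3, review_scores: { review_scores_rating: 95 } },
  { name: "B", bedrooms: 3, review_scores: { review_scores_rating: 79 } },
  { name: "C", bedrooms: 2, review_scores: { review_scores_rating: 99 } },
  { name: "D", bedrooms: 3, review_scores: {} },
  { name: "E", bedrooms: 3, review_scores: { review_scores_rating: 80 } },
];

// Emulates the $match stage: bedrooms equals 3 and the nested rating is greater than 79.
// A document without a numeric rating does not match, as with $gt on a missing field.
function matchesFilter(doc) {
  const rating = doc.review_scores ? doc.review_scores.review_scores_rating : undefined;
  return doc.bedrooms === 3 && typeof rating === "number" && rating > 79;
}

// Emulates the $count stage: a single document whose field holds the number of matches.
function countMatches(docs) {
  const n = docs.filter(matchesFilter).length;
  // $count emits no document at all when its input is empty.
  return n > 0 ? [{ numProperties: n }] : [];
}

const countResult = countMatches(listings);
console.log(JSON.stringify(countResult)); // prints [{"numProperties":2}]
```

Only documents A and E pass the filter. B has a rating of exactly 79, which $gt excludes, C has 2 bedrooms, and D has no rating.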

Find properties with 3 bedrooms and sort the returned documents by customer review rating. Limit the number of documents returned to 5:

db.Collection0.find({"bedrooms" : 3} ).sort({"review_scores.review_scores_rating": -1}).limit(5)
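The same find, sort, and limit request can also be written as an aggregation pipeline, which is handy if you later want to add more stages. The sketch below is one possible form, not the tutorial's own query. It assumes that the rating is nested under review_scores, as in the $match filter of the count query, and that the documents have a name field. The $project stage and the variable name topRatedPipeline are additions for illustration only.

```javascript
// Aggregation pipeline sketch of the find/sort/limit query.
// Assumption: the rating is stored at review_scores.review_scores_rating.
const topRatedPipeline = [
  { $match: { bedrooms: 3 } },                             // same filter as find()
  { $sort: { "review_scores.review_scores_rating": -1 } }, // highest rating first
  { $limit: 5 },                                           // keep at most 5 documents
  // Illustrative projection; the field names are assumptions about the dataset.
  { $project: { _id: 0, name: 1, bedrooms: 1, "review_scores.review_scores_rating": 1 } },
];

// Run the pipeline only where the shell provides a `db` object (for example, the mongo shell).
if (typeof db !== "undefined") {
  db.Collection0.aggregate(topRatedPipeline).forEach(printjson);
} else {
  // Outside the shell, print the pipeline so it can be inspected.
  console.log(JSON.stringify(topRatedPipeline, null, 2));
}
```

Stage order matters here: sorting before $limit returns the five highest-rated matches, whereas limiting first would sort an arbitrary five.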

Congratulations! You just set up an Atlas Data Lake, created a database and collections from data stored in an S3 bucket, and queried the data using MQL commands.

For more information on Atlas Data Lake, see Atlas Data Lake.

Screenshot of the Data Lake after running queries.

Note

When you dynamically generate collections from filenames, the number of collections is not accurately reported in the Data Lake view.