Atlas Data Lake

Looking for documentation for what used to be called "Atlas Data Lake"? Atlas Data Lake is now called Atlas Data Federation. To learn more about the renamed federated query engine service, see Atlas Data Federation.

About Atlas Data Lake

MongoDB Atlas Data Lake is now an analytics-optimized object storage service for extracted data. It stores flat or nested data and provides low-latency query performance.

Prerequisites

Atlas Data Lake requires an M10 or higher backup-enabled Atlas cluster with cloud backup jobs running on a specified cadence. To learn more about cloud backups, see Back Up Your Database Deployment.

Supported Types of Data Source

Atlas Data Lake supports collection snapshots from Atlas clusters as a data source for extracted data. Atlas Data Lake automatically ingests data from the snapshots, then partitions and stores the data in an analytics-optimized format. Atlas Data Lake doesn't support creating pipelines for views.

Data Storage Format and Query Support

Atlas Data Lake stores data in an analytics-oriented format that is based on open source standards and supports polymorphic data. Data is fully managed, indexed at the partition level, and balanced as it grows. Atlas Data Lake optimizes data extraction for analytic queries. When Atlas Data Lake extracts new data, it rebalances existing files to ensure consistent performance and minimize the amount of data scanned.

Atlas Data Lake stores data in a format that best fits its structure to allow for fast point queries and aggregate queries. For point queries, Atlas Data Lake's storage format improves performance by finding partitions faster. Aggregate queries scan only the columns required to provide results. Additionally, Atlas Data Lake partition indexes improve performance for aggregate queries by returning results directly from the partition index without scanning the underlying files.

Sample Uses

You can use Atlas Data Lake to:

Isolate analytical workloads from your operational cluster.
Provide a consistent view of cluster data from a snapshot for long-running aggregations using $out.
Query and compare across versions of your cluster data at different points in time.

Atlas Data Lake Regions

Atlas Data Lake provides optimized storage in the following AWS regions:

Data Lake Regions	AWS Regions
Virginia, USA	us-east-1
Oregon, USA	us-west-2
Sao Paulo, Brazil	sa-east-1
Ireland	eu-west-1
London, England	eu-west-2
Frankfurt, Germany	eu-central-1
Mumbai, India	ap-south-1
Singapore	ap-southeast-1
Sydney, Australia	ap-southeast-2

Atlas Data Lake automatically selects the region closest to your Atlas cluster for storing ingested data.

Billing

You incur Atlas Data Lake charges per GB per month based on the AWS region where the ingested data is stored. You incur Atlas Data Lake costs for the following items:

Ingestion of data from your data source
Storage on the cloud object storage

Extraction Costs

Atlas Data Lake charges you for the resources used to extract, upload, and transfer data. Atlas Data Lake charges for snapshot export operations are based on the following:

Cost per GB for snapshot extraction
Cost per hour on the AWS server for snapshot export download
Cost per GB per hour for snapshot export restore storage
Cost per IOPS per hour for snapshot export storage IOPS
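
As a rough illustration of how the four extraction components above might add up, the following Python sketch sums them for a single snapshot export. Every rate and usage figure in it is a hypothetical placeholder, not an actual Atlas price, and the function and field names are invented for this example. It also assumes each component is billed linearly with no rounding or minimums, which this page does not specify.

```python
from dataclasses import dataclass


@dataclass
class ExtractionRates:
    """Hypothetical unit prices in USD; real rates vary by AWS region."""
    per_gb_extracted: float         # cost per GB for snapshot extraction
    per_download_hour: float        # cost per hour on the AWS server for export download
    per_gb_hour_restore: float      # cost per GB per hour for export restore storage
    per_iops_hour: float            # cost per IOPS per hour for export storage IOPS


def estimate_extraction_cost(
    rates: ExtractionRates,
    gb_extracted: float,
    download_hours: float,
    restore_storage_gb: float,
    storage_iops: float,
    storage_hours: float,
) -> float:
    """Sum the four extraction components, assuming simple linear billing."""
    extraction = gb_extracted * rates.per_gb_extracted
    download = download_hours * rates.per_download_hour
    restore_storage = restore_storage_gb * storage_hours * rates.per_gb_hour_restore
    iops = storage_iops * storage_hours * rates.per_iops_hour
    return extraction + download + restore_storage + iops


# Placeholder rates chosen only to make the arithmetic easy to follow.
example_rates = ExtractionRates(
    per_gb_extracted=0.10,
    per_download_hour=0.50,
    per_gb_hour_restore=0.001,
    per_iops_hour=0.0001,
)

estimate = estimate_extraction_cost(
    example_rates,
    gb_extracted=100,
    download_hours=2,
    restore_storage_gb=200,
    storage_iops=3000,
    storage_hours=2,
)
print(f"Estimated extraction cost: ${estimate:.2f}")
```

Replace the placeholder rates with the figures from the Atlas pricing page for your region before relying on any estimate.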

Storage Costs

Atlas Data Lake charges for storing and accessing stored data are based on the following:

Cost per GB per day
Cost for every one thousand storage access requests when querying Data Lake datasets using Atlas Data Federation. Each access request corresponds to a partition of data from a Data Lake dataset that Atlas Data Federation fetches to process for a query.
Note
You can now set limits on the amount of data that Atlas Data Federation processes for your queries to control costs. To learn more, see Manage Atlas Data Federation Query Limits.
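
To make the two storage components concrete, here is a minimal Python sketch that combines a per-GB-per-day storage charge with a charge per one thousand storage access requests. The rates, the function name, and the usage figures are hypothetical. The sketch assumes access requests are prorated rather than rounded up to the next thousand, which this page does not state.

```python
def estimate_storage_cost(
    gb_stored: float,
    days: int,
    access_requests: int,
    rate_per_gb_day: float,
    rate_per_1000_requests: float,
) -> float:
    """Estimate storage plus access cost, assuming linear, prorated billing."""
    storage = gb_stored * days * rate_per_gb_day
    # Each access request corresponds to one partition fetched by
    # Atlas Data Federation while processing a query.
    access = (access_requests / 1000) * rate_per_1000_requests
    return storage + access


# Placeholder values for illustration only; not actual Atlas prices.
storage_estimate = estimate_storage_cost(
    gb_stored=500,
    days=30,
    access_requests=250_000,
    rate_per_gb_day=0.001,
    rate_per_1000_requests=0.01,
)
print(f"Estimated storage cost: ${storage_estimate:.2f}")
```

Because each access request maps to a fetched partition, queries that touch fewer partitions would lower the second term under these assumptions.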

To learn more, see the Atlas pricing page.
