
    Atlas Data Lake

    MongoDB Atlas Data Lake lets you natively query and analyze data across AWS S3 and MongoDB Atlas. You can query richly structured data stored in JSON, BSON, CSV, TSV, Avro, ORC, and Parquet formats using the mongo shell, MongoDB Compass, or any MongoDB driver, without moving or transforming the data.

    When you create a Data Lake, you grant Atlas either read-only or read-and-write access to S3 buckets in your AWS account. To access your Atlas clusters, Atlas uses your existing role-based access controls. You can view and edit the generated data storage configuration, which maps data from your S3 buckets and Atlas clusters to virtual databases and collections.

    A database user must have one of the following roles to query an Atlas Data Lake:

    Verify that you meet the following prerequisites before you create a Data Lake:

    • One or more AWS S3 buckets in the same AWS account.
    • The AWS CLI, configured to access your AWS account. Alternatively, you must have access to the AWS Management Console with permission to create IAM roles.

    Atlas Data Lake routes your Data Lake requests through one of the following regions:

    Data Lake Regions                   AWS Regions
    Northern Virginia, North America    us-east-1
    Oregon, North America               us-west-2
    Ireland, Europe                     eu-west-1
    London, Europe                      eu-west-2
    Frankfurt, Europe                   eu-central-1
    Sydney, Australia                   ap-southeast-2

    Note

    You will incur charges when running Atlas Data Lake queries. For more information, see Billing below.

    Atlas Data Lake incurs costs for the amount of data processed and returned by the service.

    Atlas charges for the total number of bytes that Data Lake processes from your AWS S3 buckets, rounded up to the nearest megabyte. Atlas charges $5.00 per TB of processed data, with a minimum of 10 MB or $0.00005 per query.
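    The processing charge described above can be estimated with a few lines of arithmetic. The sketch below is illustrative only and is not part of Atlas or any MongoDB driver: the function name estimate_processing_cost is hypothetical, and it assumes decimal units (1 MB = 1,000,000 bytes, 1 TB = 1,000,000 MB), which is the interpretation under which a 10 MB minimum works out to $0.00005 at $5.00 per TB. It also assumes that rounding and the minimum apply per query.

```python
# Rates taken from the billing text above. Decimal units are an assumption.
PRICE_PER_TB_USD = 5.00
BYTES_PER_MB = 1_000_000
MB_PER_TB = 1_000_000
MIN_BILLED_MB = 10


def estimate_processing_cost(bytes_processed: int) -> float:
    """Estimate the data-processing charge, in USD, for one query.

    Hypothetical helper: rounds the processed bytes up to the next whole
    megabyte, applies the 10 MB minimum, then prices the result per TB.
    """
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    rounded_mb = -(-bytes_processed // BYTES_PER_MB)
    billed_mb = max(rounded_mb, MIN_BILLED_MB)
    return billed_mb * PRICE_PER_TB_USD / MB_PER_TB


if __name__ == "__main__":
    for size in (1, 10_000_001, 250_000_000_000):
        print(f"{size} bytes -> ${estimate_processing_cost(size):.6f}")
```

    Under these assumptions, a query that scans a single byte is still billed for 10 MB, and a query that scans exactly 1 TB costs $5.00. This estimate covers only processed data; charges for returned data are separate.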

    You can use partitioning strategies and compression in AWS S3 to reduce the amount of data processed.

    Atlas charges for the total number of bytes returned by Data Lake. This total is the sum of the following data transfers:

    • The number of bytes transferred between Data Lake service nodes
    • The number of bytes transferred from Data Lake to the client

    Returned data is billed as outlined in the Data Transfer Fees section of the Atlas pricing page. The cost of data transfer depends on the cloud service provider's charges for same-region, region-to-region, or region-to-internet data transfer.
