Verify Your Database and Collections

Beta

The Atlas Data Lake is available as a Beta feature. The product and the corresponding documentation may change at any time during the Beta stage. For support, see Atlas Support.

Estimated completion time: 15 minutes

When you first connect to your Data Lake, Atlas creates a default storage configuration that maps data stored in your S3 buckets to a set of databases and collections. This part of the tutorial walks you through verifying your databases and collections in the default storage configuration.

Prerequisites

To complete this part of the tutorial, you will need to have completed:

Procedure

1

Click Confirm in the third box.

Screenshot: confirming databases and collections in the Atlas UI.

2

Review the default storage configuration generated by Atlas.

The storage configuration maps a set of data stores to a set of databases and collections. A store is a set of objects in an S3 bucket under a specific prefix.

The following is an example of a storage configuration generated by Atlas:

{
  "databases": [
    {
      "name": "<your-database-name>",
      "collections": [
        {
          "name": "data",
          "dataSources": [
            {
              "path": "data.json",
              "storeName": "<your-store-name>"
            }
          ]
        },
        {
          "name": "listingsAndReviews",
          "dataSources": [
            {
              "path": "listingsAndReviews.json",
              "storeName": "<your-store-name>"
            }
          ]
        },
        {
          "name": "*",
          "dataSources": [
            {
              "path": "{collectionName()}",
              "storeName": "<your-store-name>"
            }
          ]
        }
      ],
      "views": []
    }
  ],
  "stores": [
    {
      "provider": "s3",
      "bucket": "<your-s3-bucket>",
      "delimiter": "/",
      "includeTags": false,
      "name": "<your-store-name>",
      "region": "<aws-s3-region>"
    }
  ]
}

By default, the database name and the store name are both set to the name of your S3 bucket.
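
Because every dataSources entry refers to a store by name, a quick consistency check before editing can catch a mismatched storeName. The following is a minimal Python sketch, not an Atlas tool: the function name find_undefined_stores and the sample values (my-bucket, typo-store) are hypothetical, and the configuration is assumed to have been copied out of the editor as JSON and loaded into a dictionary.

```python
def find_undefined_stores(config):
    """Return (database, collection, storeName) triples whose storeName
    does not match the name of any entry in the top-level "stores" array."""
    defined = {store["name"] for store in config.get("stores", [])}
    missing = []
    for db in config.get("databases", []):
        for coll in db.get("collections", []):
            for source in coll.get("dataSources", []):
                if source["storeName"] not in defined:
                    missing.append((db["name"], coll["name"], source["storeName"]))
    return missing


# Trimmed-down configuration with hypothetical names. The second
# collection deliberately points at a store that is not defined.
example_config = {
    "databases": [
        {
            "name": "my-bucket",
            "collections": [
                {
                    "name": "data",
                    "dataSources": [
                        {"path": "data.json", "storeName": "my-bucket"}
                    ],
                },
                {
                    "name": "listingsAndReviews",
                    "dataSources": [
                        {"path": "listingsAndReviews.json", "storeName": "typo-store"}
                    ],
                },
            ],
        }
    ],
    "stores": [{"provider": "s3", "bucket": "my-bucket", "name": "my-bucket"}],
}

print(find_undefined_stores(example_config))
# prints [('my-bucket', 'listingsAndReviews', 'typo-store')]
```

An empty list means every data source points at a store that is defined in the stores array.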

Note

When you dynamically generate collections from filenames, the number of collections is not accurately reported in the Data Lake view.

3

Change the database name and store name in the stores and databases arrays.

Key                                          Type    Description                 Example
databases.name                               string  The name of your database.  sampleDB
databases.collections.dataSources.storeName  string  The name of your store.     s3store
stores.name                                  string  The name of your store.     s3store

The following example storage configuration shows updated databases and stores arrays:

{
  "databases": [
    {
      "name": "sampleDB",
      "collections": [
        {
          "name": "data",
          "dataSources": [
            {
              "path": "data.json",
              "storeName": "s3store"
            }
          ]
        },
        {
          "name": "listingsAndReviews",
          "dataSources": [
            {
              "path": "listingsAndReviews.json",
              "storeName": "s3store"
            }
          ]
        },
        {
          "name": "*",
          "dataSources": [
            {
              "path": "{collectionName()}",
              "storeName": "s3store"
            }
          ]
        }
      ],
      "views": []
    }
  ],
  "stores": [
    {
      "provider": "s3",
      "bucket": "<your-s3-bucket>",
      "delimiter": "/",
      "includeTags": false,
      "name": "s3store",
      "region": "<aws-s3-region>"
    }
  ]
}
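
The store name appears in more than one place, so renaming it by hand is easy to get half right. As an illustration only, the rename can be sketched in Python. The helper rename_database_and_store and the starting names (my-bucket) are hypothetical. The sketch assumes the configuration has been loaded into a dictionary, and it returns a modified copy rather than changing anything in Atlas.

```python
import copy


def rename_database_and_store(config, old_db, new_db, old_store, new_store):
    """Return a copy of config with the database renamed and the store
    renamed both in "stores" and in every dataSources.storeName."""
    updated = copy.deepcopy(config)
    for store in updated.get("stores", []):
        if store["name"] == old_store:
            store["name"] = new_store
    for db in updated.get("databases", []):
        if db["name"] == old_db:
            db["name"] = new_db
        for coll in db.get("collections", []):
            for source in coll.get("dataSources", []):
                if source["storeName"] == old_store:
                    source["storeName"] = new_store
    return updated


# Hypothetical default configuration, where both names equal the bucket name.
default_config = {
    "databases": [
        {
            "name": "my-bucket",
            "collections": [
                {
                    "name": "data",
                    "dataSources": [
                        {"path": "data.json", "storeName": "my-bucket"}
                    ],
                }
            ],
        }
    ],
    "stores": [{"provider": "s3", "bucket": "my-bucket", "name": "my-bucket"}],
}

renamed = rename_database_and_store(
    default_config, "my-bucket", "sampleDB", "my-bucket", "s3store"
)
print(renamed["databases"][0]["name"])  # prints sampleDB
print(renamed["stores"][0]["name"])     # prints s3store
```

The bucket field is left untouched on purpose: only the database name and the store name change, as in the example configuration above.
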
4

Click Save to save the changes to the storage configuration.

5

Verify your database and collection mapping.

  1. Connect to your Data Lake with the mongo shell.
  2. Run the following command to display the mapped database:

    show dbs

    Upon successful configuration, the mongo shell outputs the following:

    sampleDB
  3. Switch to the sampleDB database:

    use sampleDB
  4. Run the following command to display the mapped collections:

    show collections

    Upon successful configuration, the mongo shell outputs the following:

    data
    listingsAndReviews
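
The same check can be scripted instead of typed into the shell. The sketch below is one possible approach, not part of the tutorial: the helper check_mapping is hypothetical, and it only assumes a client object that offers list_database_names() and list_collection_names(), as PyMongo's MongoClient and Database do. The connection string is a placeholder that you would replace with your own.

```python
def check_mapping(client, db_name, expected_collections):
    """Return a list of problems found; an empty list means the database
    and all expected collections are visible through the client."""
    if db_name not in client.list_database_names():
        return [f"database {db_name!r} not found"]
    found = set(client[db_name].list_collection_names())
    return [
        f"collection {name!r} not found in {db_name!r}"
        for name in expected_collections
        if name not in found
    ]


# Usage with PyMongo (assumes the pymongo package is installed and that
# the placeholder below is replaced with your Data Lake connection string):
#
#   from pymongo import MongoClient
#   client = MongoClient("<your-data-lake-connection-string>")
#   print(check_mapping(client, "sampleDB", ["data", "listingsAndReviews"]))
```

The helper checks only that the expected names are present, so additional collections generated by the wildcard mapping do not count as errors.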

Next Steps

Now that you have mapped your data store to Data Lake databases and collections, you are ready to run queries. Proceed to Run Queries Against Your Data Lake.