
    Configuring Data Lake

    Beta

    The Atlas Data Lake is available as a Beta feature. The product and the corresponding documentation may change at any time during the Beta stage. For support, see Atlas Support.

    Overview

    You can configure Atlas Data Lake using the Data Lake Configuration File. The configuration file defines mappings between your data stores and Data Lake. To learn more about the configuration file, including its fields and file format, see Data Lake Configuration File.

    You can retrieve and update the Data Lake configuration by connecting a mongo shell to the Data Lake:

    1. From the Atlas UI, select Data Lake from the left-hand navigation.
    2. Click Connect for the Data Lake to which you want to connect.
    3. Click Connect with the Mongo Shell.
    4. Follow the instructions in the Connect modal. If you already have the mongo shell installed, ensure you are running at least the latest stable release of the 3.6 shell.

    Note

    Any MongoDB user in the Atlas project with the atlasAdmin role can retrieve and update the Data Lake configuration.

    Retrieve Data Lake Configuration

    Once connected to the Data Lake, you can use the following database commands to retrieve the Data Lake configuration:

    use admin
    db.runCommand( { "storageGetConfig" : 1 } )

    The command returns the current Data Lake configuration. For complete documentation on the configuration fields and file format, see Configuration File Format.

    Set or Update Data Lake Configuration

    Once connected to the Data Lake, you can use the following database commands to set or update the Data Lake configuration:

    use admin
    db.runCommand( { "storageSetConfig" : <config> } )

    Replace <config> with the Data Lake configuration document, that is, the contents of your configuration file. For complete documentation on the configuration fields and file format, see Configuration File Format.
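A common pattern is to retrieve the current configuration, change it, and write it back in one step. The following is a minimal sketch of that round trip, not an official helper: the function name `updateDataLakeConfig` is hypothetical, and it assumes the `storageGetConfig` reply carries the configuration in a `storage` field, which you should confirm against the output you see in your own shell.

```javascript
// Hypothetical helper: read the current configuration, apply a change,
// and send the result back with storageSetConfig.
// ASSUMPTION: the storageGetConfig reply holds the configuration in `storage`.
function updateDataLakeConfig(adminDb, mutate) {
  var reply = adminDb.runCommand({ storageGetConfig: 1 });
  if (reply.ok !== 1) {
    throw new Error("storageGetConfig failed: " + JSON.stringify(reply));
  }
  // Work on a deep copy so the original reply stays available for comparison.
  var config = JSON.parse(JSON.stringify(reply.storage));
  mutate(config);
  return adminDb.runCommand({ storageSetConfig: config });
}

// Usage in the mongo shell (left commented out because it needs a live connection):
// updateDataLakeConfig(db.getSiblingDB("admin"), function (config) {
//   // edit config here according to the Configuration File Format
// });
```

Passing the change as a callback keeps the read and the write together, so the configuration you submit is always based on what the Data Lake currently reports.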

    To set or update the storage configuration through the Atlas UI:

    1. Click Configuration for your Data Lake to view the Data Lake storage configuration.

    2. Make changes to your storage configuration and click Save.

    Generate Data Lake Configuration

    You can run the storageGenerateConfig command to generate a Data Lake configuration. The command returns an automatically generated configuration, which you can then modify and upload. In the automatically generated configuration, Data Lake generates a database for each store.

    As a result, the databases array in the generated configuration might be different from the databases array in your existing configuration.

    Note

    You must have the storageSetConfig privilege to run the storageGenerateConfig command. The atlasAdmin role has the storageSetConfig privilege by default.

    To generate a Data Lake configuration, connect to the Data Lake and run the following database commands:

    use admin
    db.runCommand( { "storageGenerateConfig" : 1 } )

    For complete documentation on the configuration fields and file format, see Configuration File Format.
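Because the generated databases array can differ from the one in your existing configuration, it can help to compare the two before uploading anything. The sketch below lists database names that appear in only one of the two configurations. The function names are hypothetical, and it assumes each entry in the databases array has a `name` field and that both command replies carry the configuration in a `storage` field; check both against the Configuration File Format and your own command output.

```javascript
// Hypothetical helper: collect the database names from a configuration.
// ASSUMPTION: `databases` is an array of objects that each have a `name` field.
function databaseNames(config) {
  return (config.databases || []).map(function (database) {
    return database.name;
  });
}

// Report which database names exist in only one of the two configurations.
function diffDatabaseNames(existingConfig, generatedConfig) {
  var existing = databaseNames(existingConfig);
  var generated = databaseNames(generatedConfig);
  return {
    onlyInExisting: existing.filter(function (name) {
      return generated.indexOf(name) === -1;
    }),
    onlyInGenerated: generated.filter(function (name) {
      return existing.indexOf(name) === -1;
    })
  };
}

// Usage in the mongo shell (left commented out because it needs a live connection;
// the `.storage` field on the replies is an assumption):
// var admin = db.getSiblingDB("admin");
// var current = admin.runCommand({ storageGetConfig: 1 }).storage;
// var generated = admin.runCommand({ storageGenerateConfig: 1 }).storage;
// printjson(diffDatabaseNames(current, generated));
```

Names listed under `onlyInExisting` would be lost if you uploaded the generated configuration unchanged, so review them before running storageSetConfig.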