
    Configuring Data Lake

    You configure Atlas Data Lake through the Data Lake configuration, which defines the mappings between your data stores and Data Lake. To learn more about the configuration, including its fields and format, see Data Lake Configuration.

    You can retrieve and update the Data Lake configuration by connecting a mongo shell to the Data Lake:

    1. From the Atlas UI, select Data Lake from the left-hand navigation.
    2. Click Connect for the Data Lake to which you want to connect.
    3. Click Connect with the Mongo Shell.
    4. Follow the instructions in the Connect modal. If you already have the mongo shell installed, ensure you are running at least the latest stable release of the 3.6 shell.
    Note

    Any MongoDB user in the Atlas project with the atlasAdmin role can retrieve and update the Data Lake configuration.

    Once connected to the Data Lake, you can use the following database commands to retrieve the Data Lake configuration:

    use admin
    db.runCommand( { "storageGetConfig" : 1 } )

    The command returns the current Data Lake configuration. For complete documentation on the configuration fields and format, see Configuration Format.
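    As a small illustration, the retrieval can be wrapped in a helper so that a copy of the current configuration is kept before any change. This is only a sketch: fetchStorageConfig is a hypothetical name, not a Data Lake command, and it assumes nothing about the reply beyond the standard ok field.

```javascript
// Hypothetical helper (not part of Data Lake): runs storageGetConfig against
// the admin database and returns the raw reply as a formatted JSON string.
function fetchStorageConfig(adminDb) {
  var reply = adminDb.runCommand({ storageGetConfig: 1 });
  if (!reply.ok) {
    // Surface the whole reply so the cause of the failure is visible.
    throw new Error("storageGetConfig failed: " + JSON.stringify(reply));
  }
  return JSON.stringify(reply, null, 2);
}

// In the mongo shell, once connected to the Data Lake:
//   var backup = fetchStorageConfig(db.getSiblingDB("admin"));
//   print(backup);
```

    Passing the admin database in as an argument keeps the helper independent of the shell's current database, so no use admin step is needed beforehand.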

    Once connected to the Data Lake, you can use the following database commands to set or update the Data Lake configuration:

    use admin
    db.runCommand( { "storageSetConfig" : <config> } )

    Replace <config> with the Data Lake configuration. For complete documentation on the configuration fields and format, see Configuration Format. You can validate your configuration before setting or updating the Data Lake configuration by running the storageValidateConfig command.

    To set or update the storage configuration through the Atlas UI:

    1. Click Configuration for your Data Lake to view the Data Lake storage configuration.
    2. Make changes to your storage configuration and click Save.

    To validate your Data Lake configuration, run the following commands:

    use admin
    db.runCommand( { "storageValidateConfig" : <config> } )

    Replace <config> with the Data Lake configuration. For complete documentation on the configuration fields and format, see Configuration Format.

    The command returns the following if your Data Lake configuration is valid:

    { "ok" : 1 }

    The command returns the list of errors in the errs field if your Data Lake storage configuration is invalid:

    {
      "ok" : 1,
      "errs" : [
        "<error>",
        "<error>",
        ...
      ]
    }
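    The validate-before-set workflow described above could be scripted roughly as follows. This is a sketch under stated assumptions: validateThenSet is a hypothetical helper name, errors are read from the errs field as shown above, and errmsg is the usual MongoDB field for a command-level failure.

```javascript
// Hypothetical helper: validates a configuration and applies it only when
// the validation reply carries no errors.
function validateThenSet(adminDb, config) {
  var check = adminDb.runCommand({ storageValidateConfig: config });
  var errs = check.errs || [];
  if (errs.length === 0 && !check.ok) {
    // The command itself failed (for example, missing privileges).
    errs = [check.errmsg || "storageValidateConfig failed"];
  }
  if (errs.length > 0) {
    return { applied: false, errs: errs };
  }
  // Validation passed, so it is safe to set the configuration.
  var result = adminDb.runCommand({ storageSetConfig: config });
  return { applied: Boolean(result.ok), errs: [] };
}

// In the mongo shell, with <config> replaced by your configuration:
//   var outcome = validateThenSet(db.getSiblingDB("admin"), <config>);
//   printjson(outcome);
```

    Returning the errors instead of throwing lets the caller decide whether to print them, fix the configuration, and retry.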

    You can run the storageGenerateConfig command to generate a Data Lake configuration. The command returns an automatically generated configuration, which you can then modify and upload. In the automatically generated configuration, Data Lake generates a database for each store.

    As a result, the databases array in the generated configuration might be different from the databases array in your existing configuration.

    Note

    You must have the storageSetConfig privilege to run the storageGenerateConfig command. The atlasAdmin role has the storageSetConfig privilege by default.

    To generate a Data Lake configuration, connect to the Data Lake and run the following database commands:

    use admin
    db.runCommand( { "storageGenerateConfig" : 1 } )

    For complete documentation on the configuration fields and format, see Configuration Format.
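    Because the generated databases array can differ from the existing one, it may help to compare the two before uploading a modified configuration. The following is a sketch only: diffDatabaseNames is a hypothetical helper, and it assumes each configuration object has a databases array whose entries carry a name field. Where the configuration sits inside each command reply is not shown here, so extract it yourself before calling the helper.

```javascript
// Collects the database names from a configuration object.
// Assumption: config.databases is an array of { name: ... } entries.
function databaseNames(config) {
  return (config.databases || []).map(function (d) { return d.name; });
}

// Hypothetical helper: reports database names present in only one of the
// two configurations.
function diffDatabaseNames(existing, generated) {
  var a = databaseNames(existing);
  var b = databaseNames(generated);
  return {
    onlyInExisting: a.filter(function (n) { return b.indexOf(n) === -1; }),
    onlyInGenerated: b.filter(function (n) { return a.indexOf(n) === -1; })
  };
}

// Usage with two configuration objects obtained beforehand:
//   printjson(diffDatabaseNames(existingConfig, generatedConfig));
```

    Names listed under onlyInExisting would be dropped if the generated configuration were uploaded unchanged, so those are the entries to review first.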
