Configuring Data Lake

On this page

  • Overview
  • Retrieve Data Lake Configuration
  • Set or Update Data Lake Configuration
  • Validate Data Lake Configuration
  • Generate Data Lake Configuration

You can configure Atlas Data Lake using the Data Lake configuration. The configuration defines mappings between your data stores and Data Lake. To learn more about the configuration, including its fields and format, see Data Lake Configuration.

You can retrieve and update the Data Lake configuration by connecting a mongo shell to the Data Lake. You can also update your Data Lake from the Atlas UI:

  1. From the Atlas UI, select Data Lake from the left-hand navigation.
  2. Click Configuration for the Data Lake that you want to update.
  3. Make the necessary changes to the storage configuration and click Save for the changes to take effect.

Note

Any MongoDB user in the Atlas project with the atlasAdmin role can retrieve and update the Data Lake configuration.

Once connected to the Data Lake, you can use the following database commands to retrieve the Data Lake configuration:

use admin
db.runCommand( { "storageGetConfig" : 1 } )

The command returns the current Data Lake configuration. For complete documentation on the configuration fields and format, see Configuration Format.
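
Before changing anything, it can help to keep a copy of what the command returned. The following is a minimal sketch for the mongo shell, not an official utility: `snapshotConfig` is a hypothetical helper name, and the sketch assumes only that a successful response carries `"ok" : 1`.

```javascript
// Hypothetical helper (not part of Data Lake): run storageGetConfig and
// return the whole response as pretty-printed JSON for safekeeping.
// `adminDb` is any object exposing runCommand(), such as
// db.getSiblingDB("admin") in the mongo shell.
function snapshotConfig(adminDb) {
  const res = adminDb.runCommand({ storageGetConfig: 1 });
  // Assumption: a successful response carries ok: 1.
  if (!res || res.ok !== 1) {
    throw new Error("storageGetConfig failed: " + JSON.stringify(res));
  }
  return JSON.stringify(res, null, 2);
}
```

In the shell, `print(snapshotConfig(db.getSiblingDB("admin")))` would print the snapshot so that you can save it before making changes.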

Once connected to the Data Lake, you can use the following database commands to set or update the Data Lake configuration:

use admin
db.runCommand( { "storageSetConfig" : <config> } )

Replace <config> with the Data Lake configuration. For complete documentation on the configuration fields and format, see Configuration Format. You can validate your configuration before setting or updating the Data Lake configuration by running the storageValidateConfig command.
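
The validate-before-set workflow described above can be sketched as a single helper. This is an illustrative sketch, not an official utility: `validateThenSet` is a hypothetical name, and the code assumes that an invalid configuration is reported through a non-empty `errs` array or a response whose `ok` value is not 1.

```javascript
// Hypothetical helper: validate a configuration and apply it only if
// validation reports no problems.
// `adminDb` is any object exposing runCommand(), such as
// db.getSiblingDB("admin") in the mongo shell.
function validateThenSet(adminDb, config) {
  const check = adminDb.runCommand({ storageValidateConfig: config });
  // Assumption: problems show up as a non-empty errs array or ok !== 1.
  const errs = Array.isArray(check.errs) ? check.errs : [];
  if (errs.length > 0 || check.ok !== 1) {
    return { applied: false, errs: errs };
  }
  // Validation passed, so apply the same configuration.
  const res = adminDb.runCommand({ storageSetConfig: config });
  return { applied: res.ok === 1, errs: [] };
}
```

In the shell, this might be called as `validateThenSet(db.getSiblingDB("admin"), myConfig)`, where `myConfig` stands for your own configuration document. Returning the errors instead of throwing keeps the helper easy to use interactively.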

To set or update the storage configuration through the Atlas UI:

  1. Click Configuration for your Data Lake to view the Data Lake storage configuration.

  2. Make changes to your storage configuration and click Save.

Once connected to the Data Lake, you can use the following database commands to validate your Data Lake configuration:

use admin
db.runCommand( { "storageValidateConfig" : <config> } )

Replace <config> with the Data Lake configuration. For complete documentation on the configuration fields and format, see Configuration Format.

The command returns the following if your Data Lake configuration is valid:

{ "ok" : 1 }

The command returns the list of errors in the errs field if your Data Lake storage configuration is invalid:

{
  "ok" : 0,
  "errs" : [
    "<error>",
    "<error>",
    ...
  ]
}

You can run the storageGenerateConfig command to generate a Data Lake configuration. The command returns an automatically generated configuration, which you can then modify and apply with the storageSetConfig command. In the automatically generated configuration, Data Lake generates a database for each store.

As a result, the databases array in the generated configuration might be different from the databases array in your existing configuration.

Note

You must have the storageSetConfig privilege to run the storageGenerateConfig command. The atlasAdmin role has the storageSetConfig privilege by default.

To generate a Data Lake configuration, connect to the Data Lake and run the following database commands:

use admin
db.runCommand( { "storageGenerateConfig" : 1 } )

For complete documentation on the configuration fields and format, see Configuration Format.
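
Because the generated databases array can differ from the one in your existing configuration, it may be worth comparing the two before applying anything. The sketch below is illustrative only: `diffDatabaseNames` is a hypothetical helper, and it assumes that each entry in a databases array is an object with a `name` field. Where the arrays sit inside each command response is not assumed here, so you pass them in directly.

```javascript
// Hypothetical helper: compare two databases arrays by database name.
// Assumption: each entry is an object with a string `name` field.
function diffDatabaseNames(existingDatabases, generatedDatabases) {
  const existing = new Set(existingDatabases.map((d) => d.name));
  const generated = new Set(generatedDatabases.map((d) => d.name));
  return {
    // Names you have today that the generated configuration lacks.
    onlyInExisting: [...existing].filter((n) => !generated.has(n)),
    // Names the generated configuration would introduce.
    onlyInGenerated: [...generated].filter((n) => !existing.has(n)),
  };
}
```

A non-empty `onlyInExisting` list suggests that applying the generated configuration unchanged would drop databases you currently rely on.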
