Create One Data Lake

note

Groups and projects are synonymous terms. Your {GROUP-ID} is the same as your project ID. For existing groups, your group/project ID remains the same. The resource and corresponding endpoints use the term groups.

The Atlas API uses HTTP Digest Authentication. Provide a programmatic API public key and corresponding private key as the username and password when constructing the HTTP request.

For complete documentation on configuring API access for an Atlas project, see Configure Atlas API Access.
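As a rough illustration of the Digest Authentication requirement described above, the following Python sketch builds a standard-library opener that answers Digest challenges with an API key pair. The helper name `make_digest_opener` is hypothetical, and the key values passed in are placeholders; no request is sent until the opener is actually used.

```python
import urllib.request

# Base URL as documented on this page.
BASE_URL = "https://cloud.mongodb.com/api/atlas/v1.0"


def make_digest_opener(public_key: str, private_key: str) -> urllib.request.OpenerDirector:
    """Return an opener that responds to HTTP Digest challenges.

    The public key is used as the username and the private key as the
    password, as the Atlas API requires. (Hypothetical helper name.)
    """
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # Realm None means: use these credentials for any realm under BASE_URL.
    password_mgr.add_password(None, BASE_URL, public_key, private_key)
    handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)
```

Calling `make_digest_opener("{PUBLIC-KEY}", "{PRIVATE-KEY}").open(request)` would then send a request with the credentials applied when the server issues its Digest challenge.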

Base URL

https://cloud.mongodb.com/api/atlas/v1.0

Use this endpoint to create an Atlas Data Lake associated with an Atlas project.

Syntax

POST /groups/{GROUP-ID}/dataLakes

Request Path Parameters

Path Element | Required/Optional | Description
GROUP-ID | Required | The unique identifier for the project.

Request Query Parameters

The following query parameters are optional:

Query Parameter | Type | Description | Default
pretty | boolean | Displays response in a prettyprint format. | false
envelope | boolean | Specifies whether or not to wrap the response in an envelope. | false
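The path and query parameters above can be combined into a request URL as in the following sketch. The function name `data_lakes_url` is hypothetical; the parameters are only appended when they differ from their documented default of false.

```python
from urllib.parse import quote, urlencode

# Base URL as documented on this page.
BASE_URL = "https://cloud.mongodb.com/api/atlas/v1.0"


def data_lakes_url(group_id: str, pretty: bool = False, envelope: bool = False) -> str:
    """Build the URL for POST /groups/{GROUP-ID}/dataLakes (hypothetical helper)."""
    # Percent-encode the project ID so it is always a single path segment.
    url = f"{BASE_URL}/groups/{quote(group_id, safe='')}/dataLakes"
    params = {}
    if pretty:
        params["pretty"] = "true"
    if envelope:
        params["envelope"] = "true"
    # Only add a query string when at least one optional parameter is set.
    return f"{url}?{urlencode(params)}" if params else url
```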

Request Body Parameters

Field | Required/Optional | Description
name | Required | Name of the Atlas Data Lake.
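Since the request body carries a single required field, a request object can be assembled as in this sketch. The helper name `build_create_request` is hypothetical, and rejecting an empty name on the client side is an assumption of this example, not a documented server rule. The function only constructs the request; it does not send it.

```python
import json
import urllib.request


def build_create_request(url: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a Data Lake."""
    if not name:
        # Client-side guard only; assumed here for illustration.
        raise ValueError("name is required")
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
    )
```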

Response

Name | Type | Description
cloudProviderConfig | object | Configuration information related to the cloud service where Atlas Data Lake source data is stored.
cloudProviderConfig.<provider> | object

Name of the provider of the cloud service where Data Lake can access the S3 Bucket data stores.

Data Lake only supports aws.

cloudProviderConfig.aws.externalId | string

Unique identifier generated by Atlas and associated with the created Data Lake.

Atlas requires an IAM Role with read-only access to the S3 buckets that you will associate with the data store. You must specify this value as the sts:ExternalId when defining that role's trust policy.

important

Atlas displays this value only once as part of the response body. You cannot retrieve this value outside of the response body.

cloudProviderConfig.aws.iamAssumedRoleARN | string

Amazon Resource Name (ARN) of the IAM Role that Data Lake assumes when accessing the AWS S3 bucket associated with the data store.

The initial state of iamAssumedRoleARN is null. After creating the required IAM Role in AWS, set this field to the role's ARN when performing the update operation.

The IAM Role must support the following actions against each S3 bucket:

  • s3:GetObject
  • s3:ListBucket
  • s3:GetObjectVersion

For more information on S3 actions, see Actions, Resources, and Condition Keys for Amazon S3.
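The three required actions could be expressed as an IAM permissions policy document along the lines of the sketch below. The helper name `read_only_policy` and the bucket name in the usage note are hypothetical. The policy lists both the bucket ARN (which `s3:ListBucket` applies to) and the object ARN pattern (which `s3:GetObject` and `s3:GetObjectVersion` apply to).

```python
import json

# The actions the IAM Role must support, as listed above.
READ_ONLY_ACTIONS = ["s3:GetObject", "s3:ListBucket", "s3:GetObjectVersion"]


def read_only_policy(bucket_names):
    """Return an IAM policy document granting the required read-only actions."""
    resources = []
    for bucket in bucket_names:
        resources.append(f"arn:aws:s3:::{bucket}")    # the bucket itself
        resources.append(f"arn:aws:s3:::{bucket}/*")  # every object in the bucket
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": READ_ONLY_ACTIONS,
                "Resource": resources,
            }
        ],
    }
```

For example, `json.dumps(read_only_policy(["example-bucket"]), indent=2)` yields a JSON document that can be attached to the role as its permissions policy.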

cloudProviderConfig.aws.iamUserARN | string

The Amazon Resource Name (ARN) of an Atlas IAM user associated with the project. Data Lake uses this user to assume the IAM role specified in cloudProviderConfig.aws.iamAssumedRoleARN when accessing the data store's S3 bucket.

You must specify the iamUserARN as part of the Principal.AWS field when defining the iamAssumedRoleARN role's trust policy.

important

Atlas displays this value only once as part of the response body. You cannot retrieve this value outside of the response body.
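To show how externalId and iamUserARN fit together, the sketch below assembles a trust policy document for the IAM Role. The helper name `trust_policy` is hypothetical. In AWS policy JSON the principal key is written `AWS` and the condition key is written `sts:ExternalId`.

```python
import json


def trust_policy(iam_user_arn: str, external_id: str) -> dict:
    """Return a trust policy letting the Atlas IAM user assume the role.

    Both arguments come from the create response and are shown only once.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The Atlas IAM user that is allowed to assume this role.
                "Principal": {"AWS": iam_user_arn},
                "Action": "sts:AssumeRole",
                # Only allow the call when the Atlas-generated ID is presented.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }
```

Serializing the result with `json.dumps(..., indent=2)` gives a document suitable for the role's trust relationship.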

dataProcessRegion | Optional

The cloud provider region to which Atlas Data Lake routes client connections for data processing.

The default value null directs Atlas Data Lake client connections to the region nearest to the client based on DNS resolution.

Use the update endpoint to update the data store configuration with a specific dataProcessRegion.

groupId | string | The unique identifier for the project.
hostnames | array | The list of hostnames assigned to the Atlas Data Lake. Each string in the array is one hostname.
name | string | Name of the Atlas Data Lake.
state | string

Current state of the Atlas Data Lake. The initial state after creation is always UNVERIFIED.

Use the update endpoint to update the data store configuration with the required settings. For cloudProviderConfig.aws, this requires setting the cloudProviderConfig.aws.iamAssumedRoleARN to an IAM role that grants access to the S3 buckets associated with any data stores.

storage | object | Configuration details for each data store and its mapping to MongoDB database(s) and collection(s).
storage.databases | object

Mapping configuration for the data store.

The initial state of this field is an empty document {}.

storage.stores | array

Each object in the array represents a storage resource associated with the data store.

The initial state of this field is an empty array [].

Example

Request

curl -u "{PUBLIC-KEY}:{PRIVATE-KEY}" --digest \
 --header "Accept: application/json" \
 --header "Content-Type: application/json" \
 --request POST "https://cloud.mongodb.com/api/atlas/v1.0/groups/{GROUP-ID}/dataLakes?pretty=true" \
 --data '{ "name" : "UserMetricData" }'

Response

{
  "cloudProviderConfig": {
    "aws": {
      "externalId": "12a3bc45-de6f-7890-12gh-3i45jklm6n7o",
      "iamAssumedRoleARN": null,
      "iamUserARN": "arn:aws:iam::1234567890123:user/queryengine"
    }
  },
  "dataProcessRegion": null,
  "groupId": "1ab23c4567def890gh12ij34",
  "hostnames": [
    "hardwaremetricdata.mongodb.example.net"
  ],
  "name": "UserMetricData",
  "state": "UNVERIFIED",
  "storage": {
    "databases": {},
    "stores": []
  }
}
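Because externalId and iamUserARN appear only in this response, a client may want to capture them right away. The sketch below parses a response body shaped like the example above; the function name `extract_one_time_values` is hypothetical, and it assumes the response was requested without the envelope option.

```python
import json


def extract_one_time_values(response_text: str) -> dict:
    """Pull out the values needed for the IAM Role's trust policy."""
    doc = json.loads(response_text)
    aws = doc["cloudProviderConfig"]["aws"]
    return {
        "externalId": aws["externalId"],  # shown only in this response
        "iamUserARN": aws["iamUserARN"],  # shown only in this response
        "state": doc["state"],            # UNVERIFIED right after creation
    }
```

The returned values can then be stored securely and used when defining the role's trust policy before calling the update endpoint.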