
Partition Attribute Types

Note

When specifying the path:

  • Specify the data type for the partition attribute.
  • Ensure that the partition attribute type matches the data type to parse.
  • Use the delimiter specified in delimiter.

The following table lists the supported data types for partition attributes, with an example filename and path for each data type:

Key | Data Type | Example

string

Parses the filename as a string.

filename: /employees/949-555-0195.json

path: /employees/{phone string}

In the preceding example, Data Lake interprets phone as a string.

int

Parses the filename as an integer.

filename: /zipcodes/90210.json

path: /zipcodes/{zipcode int}

In the preceding example, Data Lake interprets zipcode as an integer.
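
The string and int examples above can be sketched in code. The following is a minimal illustration, not Data Lake's actual implementation: the function `parse_partition`, the `CONVERTERS` table, and the handling of the file extension are all assumptions made for this example.

```python
import re

# Hypothetical converters for two of the attribute types described above.
CONVERTERS = {"string": str, "int": int}

# Matches one "{name type}" placeholder in a path template.
PLACEHOLDER = re.compile(r"\{(\w+) (\w+)\}")


def parse_partition(template: str, filename: str) -> dict:
    """Extract typed partition attributes from a filename (illustrative only)."""
    pattern = ""
    pos = 0
    attributes = []
    for m in PLACEHOLDER.finditer(template):
        # Literal text between placeholders is matched verbatim.
        pattern += re.escape(template[pos:m.start()])
        # Each placeholder becomes a named group that stops at "/".
        pattern += f"(?P<{m.group(1)}>[^/]+?)"
        attributes.append((m.group(1), m.group(2)))
        pos = m.end()
    pattern += re.escape(template[pos:])
    # Assumption: a trailing file extension is not part of the attribute value.
    pattern += r"(?:\.\w+)?"

    match = re.fullmatch(pattern, filename)
    if match is None:
        raise ValueError(f"{filename!r} does not match {template!r}")
    return {name: CONVERTERS[kind](match.group(name)) for name, kind in attributes}


print(parse_partition("/employees/{phone string}", "/employees/949-555-0195.json"))
print(parse_partition("/zipcodes/{zipcode int}", "/zipcodes/90210.json"))
```

In this sketch, a filename that cannot be converted to the declared type (for example, a non-numeric value for an int attribute) raises `ValueError`, which mirrors the requirement that the attribute type match the data being parsed.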

isodate

Parses the filename in RFC 3339 format as an ISO-8601 format date.

filename: /metrics/2019-01-03T00:00:00Z.json

path: /metrics/{startTimestamp isodate}

In the preceding example, Data Lake interprets startTimestamp as an ISODate. The isodate attribute type also supports partitions with the following date formats:

"2020-01-02T15:04:05Z07:00"
"2020-01-02T15:04:05.000000Z07:00"
"2020-01-02"
"2020-01-02T15:04:05.000000-0700"
"2020-01-02T15:04:05-0700"
"2020-01-02T15:04Z07:00"
"2020-01-02T15:04-0700"
"2020-01-02Z07:00"
"2020-01-02-0700"
"20200102T15:04:05.000000Z07:00"
"20200102T15:04:05.000000-0700"
"20200102T15:04:05Z07:00"
"20200102T15:04:05-0700"
"20200102T15:04Z07:00"
"20200102T15:04-0700"
"20200102Z07:00"
"20200102-0700"
"20200102"
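
To illustrate how several of these formats might be recognized, here is a rough sketch using Python's standard library. It assumes that `Z07:00` in the list above stands for either a literal `Z` or a numeric offset such as `+07:00`; the source does not state this. The function name `parse_isodate`, the subset of formats chosen, and the treatment of values without an offset as UTC are likewise assumptions.

```python
from datetime import datetime, timezone

# strptime equivalents of a subset of the formats listed above.
# Assumption: "Z07:00" means "Z" or an offset; Python's %z (3.7+) accepts
# "Z", "+0700", and "+07:00".
FORMATS = [
    "%Y-%m-%dT%H:%M:%S%z",
    "%Y-%m-%dT%H:%M:%S.%f%z",
    "%Y-%m-%d",
    "%Y-%m-%dT%H:%M%z",
    "%Y-%m-%d%z",
    "%Y%m%dT%H:%M:%S%z",
    "%Y%m%dT%H:%M%z",
    "%Y%m%d%z",
    "%Y%m%d",
]


def parse_isodate(value: str) -> datetime:
    """Try each known format in turn and return a UTC datetime."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        if parsed.tzinfo is None:
            # Assumption: values without an offset are treated as UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    raise ValueError(f"unsupported date format: {value!r}")


print(parse_isodate("2019-01-03T00:00:00Z"))
print(parse_isodate("20200102T15:04-0700"))
```
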

epoch_secs

Parses the filename as a Unix timestamp in seconds.

filename: /metrics/1549046112.json

path: /metrics/{startTimestamp epoch_secs}

In the preceding example, Data Lake interprets startTimestamp as a Unix timestamp in seconds.

epoch_millis

Parses the filename as a Unix timestamp in milliseconds.

filename: /metrics/1549046112000.json

path: /metrics/{startTimestamp epoch_millis}

In the preceding example, Data Lake interprets startTimestamp as a Unix timestamp in milliseconds.
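
The two epoch examples above refer to the same instant at different resolutions. A small sketch of the conversion follows; the helper names `from_epoch_secs` and `from_epoch_millis` are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_secs(value: str) -> datetime:
    """Interpret a filename component as seconds since the Unix epoch."""
    return EPOCH + timedelta(seconds=int(value))


def from_epoch_millis(value: str) -> datetime:
    """Interpret a filename component as milliseconds since the Unix epoch."""
    # Integer milliseconds avoid the rounding that float seconds could introduce.
    return EPOCH + timedelta(milliseconds=int(value))


secs = from_epoch_secs("1549046112")
millis = from_epoch_millis("1549046112000")
print(secs.isoformat())  # 2019-02-01T18:35:12+00:00
print(secs == millis)    # True
```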

objectid

Parses the filename as an ObjectId.

filename: /metrics/507f1f77bcf86cd799439011.json

path: /metrics/{objid objectid}

In the preceding example, Data Lake interprets objid as an ObjectId.

uuid

Parses the filename as a UUID of binary subtype 4.

filename: /metrics/3b241101-e2bb-4255-8caf-4136c566a962.json

path: /metrics/{myUuid uuid}

In the preceding example, Data Lake interprets myUuid as a UUID of binary subtype 4.
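
For the objectid and uuid examples, a filename component can be checked for the expected shape before conversion. The sketch below uses only Python's standard library; `parse_objectid` and `parse_uuid` are hypothetical helpers, and the raw 12-byte value stands in for a real ObjectId type.

```python
import re
import uuid

# An ObjectId is written as 24 hexadecimal characters (12 bytes).
OBJECTID_RE = re.compile(r"[0-9a-fA-F]{24}")


def parse_objectid(value: str) -> bytes:
    """Return the 12 raw bytes of an ObjectId hex string (stand-in type)."""
    if OBJECTID_RE.fullmatch(value) is None:
        raise ValueError(f"not a 24-character hex string: {value!r}")
    return bytes.fromhex(value)


def parse_uuid(value: str) -> uuid.UUID:
    """Parse a UUID string; uuid.UUID raises ValueError on malformed input."""
    # Note: "binary subtype 4" describes how BSON stores the 16 bytes;
    # it is a separate concept from the UUID version number.
    return uuid.UUID(value)


oid = parse_objectid("507f1f77bcf86cd799439011")
uid = parse_uuid("3b241101-e2bb-4255-8caf-4136c566a962")
print(len(oid), len(uid.bytes))  # 12 16
```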

Note

Atlas Data Lake supports the Package Syntax for regular expressions in the path to the filename.

Atlas Data Lake converts the partition attributes to BSON types when parsing the path to the filename. When you later write data to S3, the partition values must be of the corresponding BSON types, which are converted to strings. The following table shows:

  • The partition attribute types and the BSON types to which Data Lake converts them.
  • The BSON data types that are converted to strings for later writes to S3.
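
Partition Attribute Type: string
  Parsed BSON Type:
    • UTF-8 string
    • null*
  Source BSON Type:
    • UTF-8 string
    • null

Partition Attribute Type: int
  Parsed BSON Type:
    • 64-bit integer
    • null
  Source BSON Type:
    • 32-bit integer
    • 64-bit integer
    • null (as strings with no padding)

Partition Attribute Type: isodate
  Parsed BSON Type:
    • UTC datetime
    • null
  Source BSON Type:
    • UTC datetime (as an ISO-8601 format string)
    • null

Partition Attribute Type: objectid
    • ObjectId (as a string with hex encoding)
    • null

Partition Attribute Type: uuid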
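
The string conversions listed for the source types can be sketched as follows. This is an illustration under stated assumptions, not Data Lake's behavior: `to_partition_string` is a hypothetical helper, Python types stand in for BSON types (12 raw bytes for an ObjectId), the exact ISO-8601 layout is assumed to match the earlier filename example, and null handling is left out because the text does not say how null is written.

```python
from datetime import datetime, timezone


def to_partition_string(value) -> str:
    """Render a value as the string form used in a path (illustrative only)."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("bool is not a supported partition value")
    if isinstance(value, str):
        return value
    if isinstance(value, int):
        # Integers are written as plain decimal strings with no padding.
        return str(value)
    if isinstance(value, datetime):
        # Assumption: UTC with a trailing "Z", as in the isodate filename example.
        # Naive datetimes would be interpreted as local time by astimezone().
        return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    if isinstance(value, bytes) and len(value) == 12:
        # Stand-in for an ObjectId: hex-encode the 12 raw bytes.
        return value.hex()
    raise TypeError(f"unsupported partition value: {value!r}")


print(to_partition_string(90210))
print(to_partition_string(datetime(2019, 1, 3, tzinfo=timezone.utc)))
print(to_partition_string(bytes.fromhex("507f1f77bcf86cd799439011")))
```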