Schema Configuration¶
On this page
Business Intelligence tools connect to a data source and, given a fixed tabular schema, allow the user to visually explore their data. However, because MongoDB uses a flexible schema, these tools currently cannot use MongoDB as a native data source.
To use MongoDB with a relational business intelligence tool such as Tableau, you must define a relational mapping for your MongoDB data.
The schema generation tool mongodrdl
will output such a schema in the
Document Relational Definition Language (DRDL) format, but you
should verify that the schema it proposes matches your data requirements.
Examine both the table structure and field types to ensure that you will be
able to generate your desired reports.
You may have to edit the schema definition files by hand to remove unneeded
fields; add fields that mongodrdl
did not discover within the subset
of documents that it sampled; and modify erroneous field types.
You may edit DRDL files in the following ways:
- Add aggregation pipeline stages,
- Add fields,
- Remove fields,
- Remove tables,
- Rename tables, so long as the
collection
field remains the same, - Rename fields using the
sqlname
option
New in version 1.1: Rename fields using the sqlname
option.
Document Relational Definition Language¶
The Document Relational Definition Language (DRDL
) defines a relational
view of a MongoDB schema.
mongodrdl
produces these files using one or more MongoDB collections
as a guideline, and mongobischema
imports them into the Business
Intelligence Connector.
Field Types¶
The MongoDB Connector for BI will correctly map fields that always contain the same data type into the relational model. Schema generation will deal specially with the following cases:
- Dates
- The connector will treat any field of type
data_date
as the SQLtimestamp
type. - Timestamps
- The connector will treat any field of type
data_timestamp
as avarchar
string type. - Geospatial
New in version 1.1.0.
If a collection contains a geospatial index, the connector will map the indexed field into an array of numeric longitude-latitude coordinates.
See Geospatial Data for an example.
- Heterogeneous Fields
If a field can contain multiple types,
mongodrdl
will pick the type that appears most frequently. If a value does not match the type specified in the DRDL schema, the MongoDB Connector for BI will report an error. However, in the case where schema specifies a string, the MongoDB Connector for BI can convert the value into a string.If a field can contain either a string or an array of strings, the generated schema will always specify that the field contains an array of strings.
Embedded Documents¶
The MongoDB Connector for BI will map embedded documents to simple fields
that have a .
separator character, making them appear similar to the way you
would reference them using dot notation in a
MongoDB query.
While Tableau properly quotes identifiers, within “ad-hoc”
SQL expressions you must double-quote every identifier that contains .
characters or mixed-case characters.
For example, consider the following document:
This will result in the following fields in the generated schema:
_id |
numeric |
familyName |
varchar |
hometown |
varchar |
address.street |
varchar |
address.city |
varchar |
address.state |
varchar |
address.zip |
varchar |
members_since |
timestamp |
Arrays¶
MongoDB Connector for BI exposes arrays to business intelligence tools using two collections: one without the array, and the other with the array “pre-joined” to its parent.
For example, consider the following document in the collection
families
:
This will result in the following three tables:
families
_id
numeric
familyName
varchar
hometown
varchar
families_familyMembers
_id
numeric
familyMembers.age
numeric
familyMembers.firstname
varchar
familyMembers_idx
numeric
familyName
varchar
hometown
varchar
families_familyMembers_attributes
_id
numeric
familyMembers.age
numeric
familyMembers.attributes.name
varchar
familyMembers.attributes.value
varchar
familyMembers.attributes_idx
numeric
familyMembers.firstname
varchar
familyMembers_idx
numeric
familyName
varchar
hometown
varchar
Aggregation Pipelines¶
The MongoDB Connector for BI can use aggregation pipelines as part of the schema to transform documents from the collection into the proper form for the relational tables.
For example, consider a simple document in the collection
simpleFamilies
:
mongodrdl
will generate a schema with the tables
simpleFamilies
and simpleFamilies_familyMembers
.
The table simpleFamilies_familyMembers
enumerates each family
members, and has the following pipeline:
This pipeline uses $unwind
to create a new record for each
member of familyMembers
. The schema will track the array index in
the field familyMembers_idx
.
Geospatial Data¶
New in version 1.1.0.
If a collection contains a geospatial index, the connector will map the indexed field into an array of numeric longitude-latitude coordinates.
For example, given the following collection:
mongodrdl
will generate the following schema: