Docs Menu

Docs HomeView & Analyze DataBI Connector

Sampling Type Conflicts

On this page

  • Scalar-Scalar Conflicts
  • Composite Conflicts

Changed in version 2.6.

The flexible schema model of MongoDB allows a given field to contain data of multiple types, while relational databases restrict columns to a single data type. The BI Connector samples data from MongoDB to generate a relational schema model, but type conversion conflicts may occur when the BI Connector samples different data types from a field in MongoDB documents.

There are two main categories of type conversion conflicts that may occur in the presence of multiple data types: conflicts between scalar types and conflicts involving composite types like documents and arrays.

Note

When a scalar-scalar confict occurs, the BI Connector uses the following lattice to determine the least upper bound of the two different data types:

Scalar-scalar type lattice
click to enlarge

Example

If a given field contains int, double, and string types, the least upper bound is string.

When a scalar-scalar conflict occurs, the BI Connector inserts a row into the information_schema.COLUMNS table and lists the different sampled types in the COLUMN_COMMENT column.

Composite conflicts are type conversion conflicts involving a document or an array. The following sections describe how the BI Connector resolves both types of composite conflicts.

When a conflict occurs involving a document, the BI Connector displays the fields of the document type as separate columns using dot notation. For example, the conflict collection contains the following documents:

{ _id: 0, a: "foo" } // "a" is scalar (string)
{ _id: 1, a: { a: "bar", b: "baz" } } // "a" is composite (document)

The BI Connector detects a scalar-composite conflict and generates the following relational schema:

Table: conflict

_id
a
a.a
a.b
0
"foo"
NULL
NULL
1
NULL
"bar"
"baz"

When a conflict occurs involving a document, the BI Connector inserts a row into the information_schema.COLUMNS table and lists the different sampled types in the COLUMN_COMMENT column.

When a conflict occurs involving an array, the BI Connector creates a new table for the conflict field with a column for the foreign key, array index, and value. The BI Connector unwinds the array into multiple rows in new table and fills in the columns accordingly. For example, the conflict collection contains the following documents:

{ _id: 0, a: "foo" } // "a" is scalar (string)
{ _id: 1, a: ["bar", "baz"] } // "a" is composite (array)

The BI Connector renders the above into the following two tables:

Table: conflict

_id
0
1

Table: conflict_a

_id
a_idx
a
0
NULL
"foo"
1
0
"bar"
1
1
"baz"

When a conflict occurs involving an array, the BI Connector:

  • Inserts a row into the information_schema.COLUMNS table and lists any different types sampled in the COLUMN_COMMENT column

  • Inserts a row into the information_schema.TABLES table containing information on how the array maps to the row in the information_schema.COLUMNS table

← Geospatial Data