Navigation

Use MongoDB Views

On this page

Overview

Another way to control the data visible to BI Connector without using a .drdl file is to read from a view. Views allow you to take a source collection and choose which fields to display, change the names of fields, sort data, and perform any kind of transformation that the aggregation pipeline can accomplish. For example, you can use views to redact sensitive information from a collection and direct BI Connector to read from the view, rather than the source collection.

Use the mongo shell method createView to define a view on a collection and then use that view as a specified namespace when starting mongosqld.

Examples

Consider a collection named salaries in the employees database. By creating a view on the salaries collection that excludes all personally identifiable information, BI Connector can safely facilitate data analysis on the view without compromising the private data in the source collection.

The following command starts mongosqld with data from a view named salaries_redacted in the employees database:

mongosqld --sampleNamespaces "employees.salaries_redacted"

Another use case for managing your schema with a MongoDB view is if your working data set includes a sparsely populated but important field which may be excluded in the BI Connector sampling process.

Consider a collection named sparseCollection which contains one field which occurs in every document and another field which occurs only occasionally. A portion of the collection might look like this:

{ "_id" : 0, "mainField": 247 }
{ "_id" : 1, "mainField": 71 }
{ "_id" : 2, "mainField": 981 }
{ "_id" : 3, "mainField": 712 }
{ "_id" : 4, "mainField": 781, "sparseField": 1000 }
{ "_id" : 5, "mainField": 256 }
{ "_id" : 6, "mainField": 910 }

To ensure that sparseField is included when BI Connector creates its schema, create a view in which sparseField is populated with zeroes in every document in which it does not occur.

db.runCommand (
  {
    create: "denseCollection",
    viewOn: "sparseCollection",
    pipeline: [
      {
        $addFields: {
           rareField: { $ifNull: [ "$rareField", 0 ] }
        }
      }
    ]
  }
)

To use the view as a data source when starting mongosqld, specify it with the --sampleNamespaces option.

mongosqld --sampleNamespaces "myDatabase.denseCollection"