- MongoDB Connector for BI Components >
mongosqld
mongosqld
¶
On this page
Description¶
-
mongosqld
¶
Note
The MongoDB Connector for BI and associated utilities are compatible only with MongoDB server version 3.2 or greater.
mongosqld
accepts incoming requests from a SQL client and
proxies those requests to a mongod
or
mongos
instance.
Usage¶
New in version 2.3:
You can start mongosqld
either with a schema file in
.drdl format using the
--schema
option or by sampling data
from a MongoDB instance to create the schema.
You can specify which namespace or namespaces to sample data from with
the --sampleNamespaces
option. If you don’t specify any
namespaces or a schema file, mongosqld
samples data from all
databases in the target MongoDB instance except the admin
and
local
databases.
You can specify a database in which to store schema information with
the --sampleSource
option. Otherwise, mongosqld
holds the schema in memory.
Starting mongosqld
with a Schema File¶
Use the --schema
option to
specify a schema file when starting mongosqld
.
Use mongodrdl
to create a schema file from a MongoDB
instance.
Starting mongosqld
with a Schema Database¶
Use the --sampleSource
option to specify a database to store schema information.
Starting mongosqld
with Specified Namespaces¶
Use the --sampleNamespaces
option to specify databases and collections for
mongosqld
to sample data from to
create the schema.
See additional usage examples below.
MongoDB User Permissions¶
If your MongoDB instance uses authentication, your BI Connector instance must also use
authentication. The user that connects to MongoDB
via the mongosqld
program must have permission to list
all of the databases in your MongoDB deployment, as well as
permission to read from all the namespaces to wish to read data from.
For more details about MongoDB user permissions in BI Connector, see User Permissions for Cached Sampling.
For more information on MongoDB users and roles, see Role-Based Access Control.
Command Line Options¶
Core Options¶
-
--help
¶
Returns information on the options and use of mongosqld.
-
--addr
¶
Default: 127.0.0.1:3307
Specifies the host address to listen on.
-
--version
¶
Returns the mongosqld release number.
-
--config
<path>
¶ Specifies the path to a configuration file.
-
--mongo-uri
<uri>
¶ Default: mongodb://localhost:27017
Specifies a MongoDB connection string to connect to.
The
--mongo-uri
option supports the following options within the connection string:For more information on these URI options see Read Preference Options and Replica Set Option.
For options set in the Mongo URI not included in the list above, use the equivalent mongosqld option. For the complete list of mongosqld options, see Command Line Options.
Note
Instead of specifying a
username
andpassword
in your connection string, run mongosqld with the--auth
option to direct mongosqld to pass the authentication credentials provided by the SQL client to the MongoDB server.Similarly, instead of enabling
ssl
in the connection string, run mongosqld with--mongo-ssl
.To disable automatic replica set server discovery logic and force a connection to the specified server, use the
connect=direct
option.URI options not in the list above nor in the list of supported mongosqld options are not supported.
-
--maxVarcharLength
<length>
¶ New in version 2.2.
Specifies the maximum length, in characters, for all varchar fields. If mongosqld encounters a string that is longer than the maximum length, mongosqld truncates it to the maximum length and logs a warning.
-
--mongo-username
<username>
,
-u
<username>
¶ New in version 2.3.
Specifies the authentication username to use for schema discovery. Only required if
--auth
is enabled. The user specified by--mongo-username
must be a valid MongoDB user with thelistDatabases
privilege. See mongosqld User Permissions.
-
--mongo-password
<password>
,
-p
<password>
¶ New in version 2.3.
Specifies the authentication password to use for schema discovery. Only required if
--auth
is enabled. Use in conjunction with--mongo-username
.
-
--mongo-authenticationSource
<auth-db-name>
¶ Default: admin
New in version 2.3.
Specifies the database that holds the credentials for the schema discovery user. Only available if
--auth
is enabled. Use in conjunction with credential options--mongo-username
and--mongo-password
.
-
--mongo-authenticationMechanism
<[SCRAM-SHA-1|PLAIN]>
¶ Default: SCRAM-SHA-1
New in version 2.3.
Specifies the authentication mechanism to use for schema discovery. Only available if
--auth
is enabled. Use in conjunction with credential options--mongo-username
and--mongo-password
.
Schema Options¶
-
--schema
<filename>
¶ Specifies the path to a schema file or the schema directory.
-
--sampleNamespaces
<db.collection>
¶ New in version 2.3.
--sampleNamespaces
specifies a database and collection to sample data from to create the schema. It is also possible to specify multiple collections from a single databases, or multiple collections from multiple databases. See examples below.If you do not use the
--sampleNamespaces
option or the--schema
option,mongosqld
samples data from all available MongoDB databases and collections except theadmin
andlocal
databases.
-
--sampleMode
<[read|write]>
¶ Default: read
New in version 2.3.
Specifies whether
mongosqld
writes schema information to the MongoDB database named in the--sampleSource
option or only reads previously sampled schema information from it.If you use the
--sampleSource
option to specify a database to store schema information, set--sampleMode
towrite
to allowmongosqld
to persist data in the specified database.Important
If
mongosqld
hasauthentication
enabled, the authenticated user must have thewrite
privilege on the specified--sampleSource
database. See Built-In Roles for more information about thereadWrite
role.See Sampling Mode Reference Chart for more information on sampling modes.
-
--sampleSource
<db-name>
¶ New in version 2.3.
Required when
--sampleMode
is set to write. Specifies a database to store schema information.Note
If you do not specify either a schema file with the
--schema
option or a database with the--sampleSource
option,mongosqld
holds its schema in memory.See Sampling Mode Reference Chart for more information on sampling modes.
-
--mongo-versionCompatibility
<version-number>
¶ Restricts mongosqld to using features that the specified version of MongoDB supports. Only necessary when used with replica sets in which members use different MongoDB versions or when performing a rolling upgrade of MongoDB. Only supports MongoDB version 3.2 or later.
For example, if your replica set contains members running MongoDB 3.2 and other members running MongoDB 3.4, set the following option to restrict mongosqld to only use features supported by MongoDB 3.2:
-
--sampleSize
<number>
¶ Default: 1000
New in version 2.3.
The number of documents per database to sample when gathering schema information.
-
--sampleRefreshIntervalSecs
<number>
¶ Default: 0
New in version 2.3.
The interval in seconds at which
mongosqld
re-samples data to create its schema. The default value is0
, which means that after the initial sampling no re-sampling occurs for the duration of the connection. The specified value must be a positive integer.If you specify a non-zero value for
--sampleRefreshIntervalSecs
and a schema database with the--sampleSource
option, you must also set the--sampleMode
option towrite
. Otherwise,mongosqld
halts with an error.To force a one-time update of the schema, use the FLUSH SAMPLE command from your SQL client.
-
--uuidSubtype3Encoding
<old|csharp|java>
,
-b
<old|csharp|java>
¶ Specify the encoding used to generate UUID binary subtype 3. Choose one of the following values:
old
: Old BSON binary subtype representationcsharp
: The C#/.NET legacy UUID representationjava
: The Java legacy UUID representation
Log Options¶
-
--logAppend
¶
Appends new logging output to an existing log file specified by
--logPath
.Requires
--logRotate
.
-
--logPath
<filename>
¶ Default: stderr
Specifies a path to a log file for storing logging output.
-
--logRotate
reopen | rename
¶ Default: rename
Specifies that you want to rotate logs and how they should be rotated.
When this option is set, the logs rotate when you issue a
FLUSH LOGS
command to the MongoDB Connector for BI or when you restartmongosqld
.- If you set
--logRotate
torename
: - The existing log file is closed. An RFC3339-formatted timestamp is appended to the closed log file. A new log file is created.
- If you set
--logRotate
toreopen
: - The existing log file is closed and reopened.
- If you set
-
--verbose
,
-v
¶
Specifies that mongosqld should provide more detailed log output.
The following table describes the information provided at each log level:
Letter Option Log Level Message Content --quiet
None Log nothing. No flag Default Log messages that notify user of basic mongosqld
events and state changes.-v
Administrator Log messages that provie information to mongosqld
administrators.-vv
Developer Log messages useful to MongoDB support and development.
-
--quiet
¶
Hides all log output.
MongoDB TLS/SSL Options¶
-
--mongo-ssl
¶
Default: False
Instructs mongosqld to use TLS/SSL when connecting to a MongoDB instance.
-
--mongo-sslPEMKeyFile
<filename>
¶ Specifies the
.pem
file containing both the TLS/SSL certificate and key formongosqld
to use when connecting to MongoDB. You can specify the file name of the.pem
file using either using a relative or absolute path.This option is required when using the
--mongo-ssl
option to connect to amongod
ormongos
that hasCAFile
enabled withoutnet.ssl.allowConnectionsWithoutCertificates
.
-
--mongo-sslPEMKeyPassword
<password>
¶ Specifies the path to a file containing the certificate and private key for connecting to MongoDB.
-
--mongo-sslAllowInvalidHostnames
¶
Permits
mongosqld
to connect to a MongoDB server whose hostname differs from the hostname on its TLS/SSL certificate.
-
--mongo-sslAllowInvalidCertificates
¶
Permits the MongoDB instance to present an invalid server SSL/TLS certificate. When using the
allowInvalidCertificates
setting, MongoDB logs the use of the invalid certificate as a warning.
-
--mongo-sslCAFile
<filename>
¶ Specifies the MongoDB instance’s
.pem
file containing the root certificate chain from the Certificate Authority. Specify the file name of the.pem
file using relative or absolute paths.Warning
For SSL connections (
--mongo-ssl
) tomongod
andmongos
, if the mongosqld runs without the--mongo-sslCAFile
, mongosqld will not attempt to validate the server certificates. This creates a vulnerability to expiredmongod
andmongos
certificates as well as to foreign processes posing as validmongod
ormongos
instances. Ensure that you always specify the CA file to validate the server certificates in cases where intrusion is a possibility.
-
--mongo-sslCRLFile
<filename>
¶ Specifies the MongoDB instance’s
.pem
file containing the certificate revocation list.
-
--mongo-sslFIPSMode
¶
Enables FIPS mode in the installed OpenSSL library.
Client TLS/SSL Options¶
-
--sslMode
<mode>
¶ Default: disabled
New in version 2.3.
Enable or disable TLS/SSL for connections to mongosqld. The argument to the
sslMode
option can be one of the following:Value Description disabled
mongosqld cannot accept connections secured using TLS/SSL. allowSSL
mongosqld can accept connections secured using TLS/SSL. requireSSL
mongosqld can only accept connections secured using TLS/SSL.
-
--sslPEMKeyFile
<filename>
¶ Specifies the
.pem
file containing both the TLS/SSL certificate and key for MySQL clients. Specify the file name of the.pem
file using relative or absolute paths.
-
--sslPEMKeyPassword
<password>
¶ Specifies the password used to decrypt the private key specified by
--sslPEMKeyFile
.
-
--sslAllowInvalidCertificates
¶
Permits MySQL clients to present invalid client TLS/SSL certificates.
-
--sslAllowInvalidHostnames
¶
Permits mongosqld to connect to a MySQL server whose hostname differs from the hostname on its TLS/SSL certificate.
-
--sslCAFile
<filename>
¶ Specifies the MySQL client’s
.pem
file containing the root certificate chain from the Certificate Authority. Specify the file name of the.pem
file using relative or absolute paths.
-
--sslCRLFile
<filename>
¶ Specifies the MySQL client’s
.pem
file containing the certificate revocation list.
-
--auth
¶
Requires authentication for incoming client requests.
-
--defaultAuthSource
<authSource>
¶ Default: admin
Specifies the default MongoDB authentication source. Set this value to specify a default source that mongosqld uses when authenticating with a MongoDB database. The authentication mechanisms
GSSAPI
andPLAIN
use the$external
source, whileSCRAM-SHA-1
uses a MongoDB database as its source.If no value is given for this option it defaults to the the MongoDB
admin
database.The
$external
authentication source stores a reference to system users in a MongoDB database called$external
, but the credentials are stored in an external, non-MongoDB system, such as an LDAP server.Any connection which uses the default value can omit the
source
parameter from its MySQL or Tableau username.
-
--defaultAuthMechanism
<authMechanism>
¶ Default: SCRAM-SHA-1
Specifies the default authentication mechanism. Set this value to specify a default mechanism for connecting to mongosqld. Any connection which uses this specified default value can omit the
mechanism
value from its MySQL or Tableau username.
Service Options¶
Socket Options¶
-
--filePermissions
<mode>
¶ Default: 448
Specify the permissions for the Unix domain socket file.
-
--noUnixSocket
¶
Disable listening on Unix domain sockets.
-
--unixSocketPrefix
<path>
¶ Default: /tmp
Specifies an alternative directory for the mongosqld Unix domain socket.
mongosqld will create a socket file called
mysql.sock
underneath this path. If you do not specify--unixSocketPrefix
, the socket will exist at/tmp/mysql.sock
.
Configuration File¶
You may configure mongosqld
using a YAML
configuration file. This file may contain the settings listed in the
following sections.
Logging Options¶
systemLog:
logAppend: <boolean>
logRotate: "rename"|"reopen"
path: <string>
quiet: <boolean>
verbosity: <integer>
Name | Type | Corresponds to |
---|---|---|
|
boolean | --logAppend |
|
string | --logRotate |
|
string | --logPath |
|
boolean | --quiet |
|
integer | --verbose |
Schema Options¶
schema:
path: <string>
maxVarcharLength: <integer>
Name | Type | Corresponds to |
---|---|---|
|
string | --schema |
|
integer | --maxVarcharLength |
Data Sampling Options¶
schema:
sample:
mode: <[read|write]>
source: <string>
size: <integer>
namespaces: <array of strings>
refreshIntervalSecs: <integer>
uuidSubtype3Encoding: <[old|csharp|java]>
Name | Type | Corresponds to |
---|---|---|
|
string | --sampleMode |
|
string | --sampleSource |
|
integer | --sampleSize |
|
string or array of strings | --sampleNamespaces |
|
integer | --sampleRefreshIntervalSecs |
|
string | --uuidSubtype3Encoding |
Runtime Options¶
runtime:
memory:
maxPerStage: <integer>
-
runtime.memory.
maxPerStage
¶ Type: integer
Default: unlimited
Specifies the maximum amount of memory in bytes that a query execution stage may use.
Network Options¶
net:
bindIp: <string>
port: <integer>
unixDomainSocket:
enabled: <boolean>
pathPrefix: <string>
filePermissions: <string>
ssl:
mode: <string>
allowInvalidCertificates: <boolean>
PEMKeyFile: <string>
PEMKeyPassword: <string>
CAFile: <string>
Name | Type | Corresponds to |
---|---|---|
|
string | The hostname component of Changed in version 2.2: To bind to multiple IP addresses, enter a list of comma separated values. Example “72.198.41.200,72.198.41.201,72.198.41.202” |
|
integer | The port component of --addr |
|
boolean | Inverse of --noUnixSocket |
|
string | --unixSocketPrefix |
|
string | --filePermissions |
|
boolean | --sslAllowInvalidCertificates |
|
string | --sslMode |
|
string | --sslPEMKeyFile |
|
string | --sslPEMKeyPassword |
|
string | --sslCAFile |
Security Options¶
security:
enabled: <boolean>
defaultMechanism: <string>
defaultSource: <string>
Name | Type | Corresponds to |
---|---|---|
|
boolean | --auth |
|
string | --defaultAuthMechanism |
|
string | --defaultAuthSource |
MongoDB Host Options¶
mongodb:
versionCompatibility: <string>
net:
uri: <string>
ssl:
enabled: <boolean>
allowInvalidCertificates: <boolean>
allowInvalidHostnames: <boolean>
PEMKeyFile: <string>
PEMKeyPassword: <string>
CAFile: <string>
CRLFile: <string>
FIPSMode: <boolean>
auth:
username: <username>
password: <password>
source: <auth-db-name>
mechanism: <auth-mechanism>
Name | Type | Corresponds to |
---|---|---|
|
string | --mongo-versionCompatibility |
|
string | --mongo-uri |
|
boolean | --mongo-ssl |
|
boolean | --mongo-sslAllowInvalidCertificates |
|
boolean | --mongo-sslAllowInvalidHostnames |
|
string | --mongo-sslPEMKeyFile |
|
string | --mongo-sslPEMKeyPassword |
|
string | --mongo-sslCAFile |
|
string | --mongo-sslCRLFile |
|
boolean | --mongo-sslFIPSMode |
|
boolean | --mongo-username |
|
boolean | --mongo-password |
|
boolean | --mongo-authenticationSource |
|
boolean | --mongo-authenticationMechanism |
Process Management Options¶
processManagement:
service:
name: <string>
displayName: <string>
description: <string>
Name | Type | Corresponds to |
---|---|---|
|
string | --serviceName |
|
string | --serviceDisplayName |
|
string | --serviceDescription |
Example Configuration File¶
Note
The paths used in this example configuration file are Linux-specific. Check your local system documentation to determine the correct paths for your system.
Usage Examples¶
mongosqld
with a Schema File¶
Use the --schema
option to specify a
schema file for mongosqld
to use.
Specify a Collection to Sample¶
Use the --sampleNamespaces
option to specify a database
and collection for mongosqld
to sample data from. If you
start mongosqld
without the
--sampleNamespaces
option, it samples data from all available databases except
the admin
and local
databases.
The following example samples data from a collection named
addresses
in a database named contacts
:
Specify Multiple Collections to Sample¶
To sample data from all the collections in a database, use the
--sampleNamespaces
option with a wildcard (*
).
The following example samples data from all the collections in a
database named inventory
:
You can repeat the --sampleNamespaces
at the command line to
sample data from multiple collections within a database, or from
multiple collections across several databases.
The following example samples data from all the collections in
databases named cars
and boats
:
You can mix and match wildcards and specific collections with multiple
--sampleNamespaces
options.
The following example samples data from all collections in a database
named orders
, as well as the FY2015
and
FY2016
collections from the expenses
database:
Note
It is only necessary to enclose the namespace in single quotes when you use a wildcard.
Sample All Documents in a Collection¶
If you have a collection in which some documents have different
fields from others and you want to ensure that the schema takes them
all into consideration, set the --sampleSize
option to 0
. This causes mongosqld
to examine all documents in the specified namespace
when it builds its schema.
The following example instructs mongosqld
to build the
schema based on all documents in the inventory.electronics
namespace.
Specify a Database to Persist a Schema¶
Use the --sampleSource
option to
specify a database in which to store schema information. Use the
--sampleMode
option to specify
whether mongosqld
can write to the schema database or
only read from it.
The following example command uses a database named sampleDb
to
store schema information and sets --sampleMode
to write
.
Important
If mongosqld
has authentication
enabled, the authenticated user must have the write
privilege on the specified --sampleSource
database. See Built-In Roles for more information about the
readWrite
role.
Specify a Data Resampling Interval¶
Use the --sampleRefreshIntervalSecs
option to specify
an interval in seconds for mongosqld
to resample data
and regenerate the schema. The default value for this option is 0
,
which means that mongosqld
never resamples data.
If you set a resampling interval with
--sampleRefreshIntervalSecs
and you specify a schema
database with --sampleSource
, you
cannot set --sampleMode
to read
.
The following example does not specify a schema database or a schema
file, so it holds its schema in memory. It uses
--sampleRefreshIntervalSecs
to specify a data resampling
interval of 3600 seconds.
mongosqld
with Authentication Enabled¶
To connect mongosqld
to a MongoDB instance running
with authentication enabled,
use the --auth
option and provide MongoDB user
credentials.
The following example starts mongosqld
with
credentials for a user named reportsUser
with password
reportsPass
, who has access to the reports.inventory
namespace.
This example does not specify an authenticating database with the
--mongo-authenticationSource
option, so it defaults to
admin
. It does not specify an authentication mechanism with the
--mongo-authenticationMechanism
option, so it defaults to
SCRAM-SHA-1
.
mongosqld
with a Configuration File¶
If you wish to specify a configuration file which saves logs to
/var/log/mongosqld.log
and loads a schema from /var/schema.drdl
,
you may save a file such as the following to /etc/mongosqld.conf
:
Warning
All paths specified in the configuration file must be absolute,
e.g. they must begin with /
.
You may then start mongosqld
with the
--config
option:
For more information on starting mongosqld
as a system service, see
the Installation Guide for your operating
system.
Usage with Atlas¶
Atlas is a cloud service for running, monitoring, and maintaining MongoDB deployments, including the provisioning of dedicated servers for the MongoDB instances.
Atlas uses TLS/SSL to encrypt connections and enforces authentication by default.
Note
With the MongoDB Atlas free tier, BI Connector cannot run aggregation pipelines
using the allowDiskUse
option. This option allows aggregation stages to
write data as temporary files to disk. To review the MongoDB Atlas free
tier constraints, see the MongoDB Atlas manual.
To connect the
MongoDB Connector for BI
to an Atlas cluster, you must provide mongosqld
with a
.pem
key file.
mongosqld
can use any valid TLS certificate issued by a
certificate authority, or a self-signed certificate. If you use a
self-signed certificate, although the communications channel will be
encrypted, there will be no validation of server identity. Although
such a situation will prevent eavesdropping on the connection, it
leaves you vulnerable to a man-in-the-middle attack. Using a
certificate signed by a trusted certificate authority will permit your
MySQL client to verify the server’s identity.
For testing purposes, you can create a .pem
key file named test.pem
using the openssl
tool:
Start mongosqld
with the --auth
and --sslPEMKeyFile
options.
If you named your key file test.pem
as in the previous example, you
can run the following, substituting your schema file and
cluster URI:
Note
Do not specify a username and password in --mongo-uri
. The
connection string should only contain the list of servers.
Pass your username, password, and authentication database to your SQL
client. For example, using mysql
without verifying your mongosqld
server certificate:
Refer to Connect from the MySQL Client for more details on using the
mysql
client with the BI Connector.
Sampling Mode Reference Chart¶
The chart below describes types of mongosqld
startup configurations and the options used
with each.
Mode | mongosqld Options |
Behavior |
---|---|---|
Standalone Reader |
|
Standalone
mongod instance. Example |
Clustered Reader |
|
In a replica set or sharded cluster environment, sample data only at startup. Read schema data from specified schema database on the primary. Example |
Clustered Writer |
|
Read and persist schema data in a user-specified schema database. Example |
Invalid Configurations¶
The following mongosqld
option configurations are invalid
and cause an error at startup.
Mode | mongosqld Options |
Behavior |
---|---|---|
Standalone Writer (invalid) |
|
This configuration is invalid for a standalone
mongod instance because in write mode there
must be a writeable database specified. |
Clustered Sampling Reader (invalid) |
|
When used in a MongoDB replica set or sharded cluster,
this configuration is invalid. When a
database is specified for storing schema data with
--sampleSource , --sampleMode must be set to
write so that the schema may be updated from
the primary to
maintain cluster-wide consistency. |