Navigation

Deploy Sharded Cluster using Hashed Sharding

Overview

Hashed shard keys use a hashed index of a single field as the shard key to partition data across your sharded cluster.

Hashed sharding provides a more even data distribution across the sharded cluster at the cost of reducing Targeted Operations vs. Broadcast Operations. With hashed sharding, documents with “close” shard key values are unlikely to be on the same chunk or shard, and the mongos is more likely to perform Broadcast Operations to fulfill a given query.

If you already have a sharded cluster deployed, skip to Shard a Collection using Hashed Sharding.

Atlas, CloudManager and OpsManager

If you are currently using or are planning to use Atlas, Cloud Manager or Ops Manager, refer to their respective manual for instructions on deploying a sharded cluster:

Considerations

Hostnames and Configuration

Tip

When possible, use a logical DNS hostname instead of an ip address, particularly when configuring replica set members or sharded cluster members. The use of logical DNS hostnames avoids configuration changes due to ip address changes.

Operating System

This tutorial uses the mongod and mongos programs. Windows users should use the mongod.exe and mongos.exe programs instead.

IP Binding

Use the bind_ip option to ensure that MongoDB listens for connections from applications on configured addresses.

Changed in version 3.6: Starting in MongoDB 3.6, MongoDB binaries, mongod and mongos, bind to localhost by default. If the net.ipv6 configuration file setting or the --ipv6 command line option is set for the binary, the binary additionally binds to the localhost IPv6 address.

Previously, starting from MongoDB 2.6, only the binaries from the official MongoDB RPM (Red Hat, CentOS, Fedora Linux, and derivatives) and DEB (Debian, Ubuntu, and derivatives) packages bind to localhost by default.

When bound only to the localhost, these MongoDB 3.6 binaries can only accept connections from clients (including the mongo shell, other members in your deployment for replica sets and sharded clusters) that are running on the same machine. Remote clients cannot connect to the binaries bound only to localhost.

To override and bind to other ip addresses, you can use the net.bindIp configuration file setting or the --bind_ip command-line option to specify a list of hostnames or ip addresses.

Warning

Before you bind to other ip addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

For example, the following mongod instance binds to both the localhost and the hostname My-Example-Associated-Hostname, which is associated with the ip address 198.51.100.1:

mongod --bind_ip localhost,My-Example-Associated-Hostname

In order to connect to this instance, remote clients must specify the hostname or its associated ip address 198.51.100.1:

mongo --host My-Example-Associated-Hostname

mongo --host 198.51.100.1

Security

This tutorial does not include the required steps for configuring Internal Authentication or Role-Based Access Control. See Deploy Sharded Cluster with Keyfile Access Control for a tutorial on deploying a sharded cluster with a keyfile.

In production environments, sharded clusters should employ at minimum x.509 security for internal authentication and client access:

Note

Enabling internal authentication also enables Role-Based Access Control.

Deploy Sharded Cluster with Hashed Sharding

Tip

When possible, use a logical DNS hostname instead of an ip address, particularly when configuring replica set members or sharded cluster members. The use of logical DNS hostnames avoids configuration changes due to ip address changes.

Create the Config Server Replica Set

The following steps deploys a config server replica set.

For a production deployment, deploy a config server replica set with at least three members. For testing purposes, you can create a single-member replica set.

For this tutorial, the config server replica set members are associated with the following hosts:

Config Server Replica Set Member Hostname
Member 0 cfg1.example.net
Member 1 cfg2.example.net
Member 2 cfg3.example.net
1

Start each member of the config server replica set.

When starting each mongod, specify the mongod settings either via a configuration file or the command line.

Configuration File

If using a configuration file, set:

sharding:
  clusterRole: configsvr
replication:
  replSetName: <replica set name>
net:
  bindIp: localhost,<hostname(s)|ip address(es)>
  • sharding.clusterRole to configsvr,

  • replication.replSetName to the desired name of the config server replica set,

  • net.bindIp option to the hostname/ip address or comma-delimited list of hostnames or ip addresses that remote clients (including the other members of the config server replica set as well as other members of the sharded cluster) can use to connect to the instance.

    Warning

    Before you bind to other ip addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

  • Additional settings as appropriate to your deployment, such as storage.dbPath and net.port. For more information on the configuration file, see configuration options.

Start the mongod with the --config option set to the configuration file path.

mongod --config <path-to-config-file>

Command Line

If using the command line options, start the mongod with the --configsvr, --replSet, --bind_ip, and other options as appropriate to your deployment. For example:

Warning

Before you bind to other ip addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

mongod --configsvr --replSet <replica set name> --dbpath <path> --bind_ip localhost,<hostname(s)|ip address(es)>

For more information on startup parameters, see the mongod reference page.

2

Connect to one of the config servers.

Connect a mongo shell to one of the config server members.

mongo --host <hostname> --port <port>
3

Initiate the replica set.

From the mongo shell, run the rs.initiate() method.

rs.initiate() can take an optional replica set configuration document. In the replica set configuration document, include:

  • The _id set to the replica set name specified in either the replication.replSetName or the --replSet option.
  • The configsvr field set to true for the config server replica set.
  • The members array with a document per each member of the replica set.

Important

Run rs.initiate() on just one and only one mongod instance for the replica set.

rs.initiate(
  {
    _id: "<replSetName>",
    configsvr: true,
    members: [
      { _id : 0, host : "cfg1.example.net:27019" },
      { _id : 1, host : "cfg2.example.net:27019" },
      { _id : 2, host : "cfg3.example.net:27019" }
    ]
  }
)

See Replica Set Configuration for more information on replica set configuration documents.

Once the config server replica set (CSRS) is initiated and up, proceed to creating the shard replica sets.

Create the Shard Replica Sets

For a production deployment, use a replica set with at least three members for each shard. For testing purposes, you can create a single-member replica set.

For each shard, use the following steps to create the shard replica set.

1

Start each member of the shard replica set.

When starting each mongod, specify the mongod settings either via a configuration file or the command line.

Configuration File

If using a configuration file, set:

sharding:
   clusterRole: shardsvr
replication:
   replSetName: <replSetName>
net:
   bindIp: localhost,<ip address>

Start the mongod with the --config option set to the configuration file path.

mongod --config <path-to-config-file>

Command Line

If using the command line option, start the mongod with the --replSet, and --shardsvr, --bind_ip options, and other options as appropriate to your deployment. For example:

mongod --shardsvr --replSet <replSetname>  --dbpath <path> --bind_ip localhost,<hostname(s)|ip address(es)>

For more information on startup parameters, see the mongod reference page.

2

Connect to one member of the shard replica set.

Connect a mongo shell to one of the replica set members.

mongo --host <hostname> --port <port>
3

Initiate the replica set.

From the mongo shell, run the rs.initiate() method.

rs.initiate() can take an optional replica set configuration document. In the replica set configuration document, include:

  • The _id field set to the replica set name specified in either the replication.replSetName or the --replSet option.
  • The members array with a document per each member of the replica set.

The following example initiates a three member replica set.

Important

Run rs.initiate() on just one and only one mongod instance for the replica set.

rs.initiate(
  {
    _id : <replicaSetName>,
    members: [
      { _id : 0, host : "s1-mongo1.example.net:27018" },
      { _id : 1, host : "s1-mongo2.example.net:27018" },
      { _id : 2, host : "s1-mongo3.example.net:27018" }
    ]
  }
)

Connect a mongos to the Sharded Cluster

1

Connect a mongos to the cluster

Start a mongos using either a configuration file or a command line parameter to specify the config servers.

Configuration File

If using a configuration file, set the sharding.configDB to the config server replica set name and at least one member of the replica set in <replSetName>/<host:port> format.

Warning

Before you bind to other ip addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

sharding:
  configDB: <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019
net:
  bindIp: localhost,<hostname(s)|ip address(es)>

Start the mongos specifying the --config option and the path to the configuration file.

mongos --config <path-to-config>

For more information on the configuration file, see configuration options.

Command Line

If using command line parameters start the mongos and specify the --configdb, --bind_ip, and other options as appropriate to your deployment. For example:

Warning

Before you bind to other ip addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

mongos --configdb <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019 --bind_ip localhost,<hostname(s)|ip address(es)>

Include any other options as appropriate for your deployment.

2

Connect to the mongos.

Connect a mongo shell to the mongos.

mongo --host <hostname> --port <port>

Once you have connected the mongo shell to the mongos, continue to the next procedure to add shards to the cluster.

Add Shards to the Cluster

In the mongo shell connected to the mongos, use the sh.addShard() method to add each shard to the cluster.

The following operation adds a single shard replica set to the cluster:

sh.addShard( "<replSetName>/s1-mongo1.example.net:27018")

Repeat to add all shards.

Enable Sharding for a Database

From the mongo shell connected to the mongos, use the sh.enableSharding() method to enable sharding on the target database. Enabling sharding on a database makes it possible to shard collections within a database.

sh.enableSharding("<database>")

Shard a Collection using Hashed Sharding

From the mongo shell connected to the mongos, use the sh.shardCollection() method to shard a collection.

Note

You must have enabled sharding for the database where the collection resides. See Enable Sharding for a Database.

If the collection already contains data, you must create a Hashed Indexes on the shard key using the db.collection.createIndex() method before using shardCollection(). [1]

If the collection is empty, MongoDB creates the index as part of sh.shardCollection().

The following operation shards the target collection using the hashed sharding strategy.

sh.shardCollection("<database>.<collection>", { <shard key> : "hashed" } )
  • You must specify the full namespace of the collection and the shard key.
  • Your selection of shard key affects the efficiency of sharding, as well as your ability to take advantage of certain sharding features such as zones. See the selection considerations listed in the Hashed Sharding Shard Key.
[1]Starting in version 4.0, the mongo shell provides the method convertShardKeyToHashed(). This method uses the same hashing function as the hashed index and can be used to see what the hashed value would be for a key.