Deploy Sharded Cluster using Hashed Sharding

Overview

Hashed shard keys use a hashed index of a single field as the shard key to partition data across your sharded cluster.

Hashed sharding provides a more even data distribution across the sharded cluster at the cost of reducing Query Isolation. With hashed sharding, documents with “close” shard key values are unlikely to be on the same chunk or shard, and the mongos is more likely to perform Broadcast Operations to fulfill a given query.
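The trade-off described above can be sketched outside MongoDB. The example below is an illustration only: it uses a generic 32-bit integer mixer as a stand-in for a hash function (it is not MongoDB's hashed index function), and the shard count and key range are hypothetical. It compares how monotonically increasing keys are placed when each shard owns a slice of the key space versus a slice of the hash space.

```javascript
// Illustrative stand-in hash (a common 32-bit integer mixer).
// This is NOT MongoDB's hashed-index function; it only mimics the scattering effect.
function mix32(n) {
  let h = n | 0;
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // unsigned 32-bit result
}

const SHARDS = 4;  // hypothetical shard count
const KEYS = 1000; // hypothetical number of monotonically increasing keys

// Ranged placement: each shard owns one contiguous slice of the key space.
function rangedShard(key) {
  return Math.min(SHARDS - 1, Math.floor(((key - 1) / KEYS) * SHARDS));
}

// Hashed placement: each shard owns one contiguous slice of the hash space.
function hashedShard(key) {
  return Math.floor((mix32(key) / 2 ** 32) * SHARDS);
}

// Count how many keys in [from, to] land on each shard.
function countPerShard(placeFn, from, to) {
  const counts = new Array(SHARDS).fill(0);
  for (let key = from; key <= to; key++) counts[placeFn(key)]++;
  return counts;
}

console.log("ranged, all keys:  ", countPerShard(rangedShard, 1, KEYS));
console.log("hashed, all keys:  ", countPerShard(hashedShard, 1, KEYS));
// A narrow range of "close" keys stays on one shard when ranged,
// but is typically spread over several shards when hashed.
console.log("ranged, keys 1..20:", countPerShard(rangedShard, 1, 20));
console.log("hashed, keys 1..20:", countPerShard(hashedShard, 1, 20));
```

The last two lines show why a range query on a hashed shard key tends to become a broadcast operation: the matching documents are no longer co-located.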

If you already have a sharded cluster deployed, skip to Shard a Collection using Hashed Sharding.

Atlas, Cloud Manager, and Ops Manager

If you are currently using or are planning to use Atlas, Cloud Manager, or Ops Manager, refer to their respective manuals for instructions on deploying a sharded cluster.

Considerations

Operating System

This tutorial uses the mongod and mongos programs. Windows users should use the mongod.exe and mongos.exe programs instead.

IP Binding

Use the bind_ip option to ensure that MongoDB listens for connections from applications on configured addresses.

Changed in version 3.6: Starting in MongoDB 3.6, MongoDB binaries, mongod and mongos, bind to localhost (127.0.0.1) by default. If the net.ipv6 configuration file setting or the --ipv6 command line option is set for the binary, the binary additionally binds to the IPv6 address ::1.

Previously, starting from MongoDB 2.6, only the binaries from the official MongoDB RPM (Red Hat, CentOS, Fedora Linux, and derivatives) and DEB (Debian, Ubuntu, and derivatives) packages bound to localhost by default.

When bound only to localhost, these MongoDB 3.6 binaries accept connections only from clients running on the same machine, including the mongo shell and other members of your replica set or sharded cluster. Remote clients cannot connect to binaries bound only to localhost.

To override the default and bind to other IP addresses, use the net.bindIp configuration file setting or the --bind_ip command-line option to specify a list of IP addresses.

Warning

Before you bind to other IP addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

For example, the following mongod instance binds to both localhost and the sample IP address 198.51.100.1:

mongod --bind_ip localhost,198.51.100.1

To connect to this instance, remote clients must specify the IP address 198.51.100.1 or the hostname associated with the IP address:

mongo --host 198.51.100.1

mongo --host My-Example-Associated-Hostname
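As a rough way to reason about a bind list before starting a process, the sketch below checks whether a --bind_ip value includes anything other than the loopback names mentioned above. The function names are hypothetical helpers, not part of MongoDB, and the check is simplified: it recognizes only localhost, 127.0.0.1, and ::1 as loopback.

```javascript
// Loopback names and addresses named in the text above.
// Simplification: other 127.0.0.0/8 addresses are not treated as loopback here.
const LOOPBACK = new Set(["localhost", "127.0.0.1", "::1"]);

// Split a comma-separated bind list such as "localhost,198.51.100.1".
function parseBindIp(value) {
  return value
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// True when at least one bound address is not a loopback address,
// meaning clients on other machines may be able to connect.
function acceptsRemoteClients(value) {
  return parseBindIp(value).some((addr) => !LOOPBACK.has(addr.toLowerCase()));
}

console.log(acceptsRemoteClients("localhost"));
console.log(acceptsRemoteClients("localhost,198.51.100.1"));
```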

Security

This tutorial does not include the required steps for configuring Internal Authentication or Role-Based Access Control. See Deploy Sharded Cluster with Keyfile Access Control for a tutorial on deploying a sharded cluster with a keyfile.

In production environments, sharded clusters should employ at minimum x.509 security for internal authentication and client access.

Note

Enabling internal authentication also enables Role-Based Access Control.

Deploy Sharded Cluster with Hashed Sharding

Create the Config Server Replica Set

The following steps deploy a config server replica set.

For a production deployment, deploy a config server replica set with at least three members. For testing purposes, you can create a single-member replica set.

For this tutorial, the config server replica set members are associated with the following hosts:

Config Server Replica Set Member    Hostname
Member 0                            cfg1.example.net
Member 1                            cfg2.example.net
Member 2                            cfg3.example.net

1. Start each member of the config server replica set.

When starting each mongod, specify the mongod settings either via a configuration file or the command line.

Configuration File

If using a configuration file, set:

sharding:
  clusterRole: configsvr
replication:
  replSetName: <replica set name>
net:
  bindIp: localhost,<ip address>

Start the mongod with the --config option set to the configuration file path.

mongod --config <path-to-config-file>

Command Line

If using the command line options, start the mongod with the --configsvr, --replSet, --bind_ip, and other options as appropriate to your deployment. For example:

Warning

Before you bind to other IP addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

mongod --configsvr --replSet <replica set name> --dbpath <path> --bind_ip localhost,<ip address of the mongod host>

For more information on startup parameters, see the mongod reference page.
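Because the same command is repeated on every member with only the bind address changing, it can help to generate the command lines from the host list. The following is a minimal sketch: the replica set name configRS and the data path /data/configdb are hypothetical example values, and it assumes each hostname from the table resolves on its own host (substitute the host's IP address otherwise).

```javascript
// Hostnames from the table in this tutorial.
const configHosts = ["cfg1.example.net", "cfg2.example.net", "cfg3.example.net"];

// Build the argument list for one config server member.
// replSetName and dbPath are supplied by the caller.
function configServerArgs(host, replSetName, dbPath) {
  return [
    "mongod",
    "--configsvr",
    "--replSet", replSetName,
    "--dbpath", dbPath,
    "--bind_ip", `localhost,${host}`,
  ];
}

for (const host of configHosts) {
  // "configRS" and "/data/configdb" are hypothetical example values.
  console.log(`# run on ${host}`);
  console.log(configServerArgs(host, "configRS", "/data/configdb").join(" "));
}
```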

2. Connect to one of the config servers.

Connect a mongo shell to one of the config server members.

mongo --host <hostname> --port <port>

3. Initiate the replica set.

From the mongo shell, run the rs.initiate() method.

rs.initiate() can take an optional replica set configuration document. In the replica set configuration document, include:

  • The _id field set to the replica set name specified in either the replication.replSetName setting or the --replSet option.
  • The configsvr field set to true for the config server replica set.
  • The members array with a document for each member of the replica set.

Important

Run rs.initiate() on only one mongod instance for the replica set.

rs.initiate(
  {
    _id: "<replSetName>",
    configsvr: true,
    members: [
      { _id : 0, host : "cfg1.example.net:27019" },
      { _id : 1, host : "cfg2.example.net:27019" },
      { _id : 2, host : "cfg3.example.net:27019" }
    ]
  }
)

See Replica Set Configuration for more information on replica set configuration documents.
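The configuration document above follows a regular pattern, so it can be assembled from a list of hosts. The sketch below shows one way to do that; buildReplSetConfig is a hypothetical helper and configRS is a hypothetical replica set name. The code runs as plain JavaScript and only prints the document; in the mongo shell, the resulting object would be passed to rs.initiate().

```javascript
// Build a replica set configuration document for rs.initiate().
// hosts: array of "hostname:port" strings.
// options.configsvr: set to true for a config server replica set.
function buildReplSetConfig(replSetName, hosts, options = {}) {
  const config = { _id: replSetName };
  if (options.configsvr) config.configsvr = true;
  // Member _id values are assigned in list order: 0, 1, 2, ...
  config.members = hosts.map((host, index) => ({ _id: index, host }));
  return config;
}

const csrsConfig = buildReplSetConfig(
  "configRS", // hypothetical replica set name
  ["cfg1.example.net:27019", "cfg2.example.net:27019", "cfg3.example.net:27019"],
  { configsvr: true }
);

console.log(JSON.stringify(csrsConfig, null, 2));
// In the mongo shell, connected to one member only: rs.initiate(csrsConfig)
```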

Once the config server replica set (CSRS) is initiated and up, proceed to creating the shard replica sets.

Create the Shard Replica Sets

For a production deployment, use a replica set with at least three members for each shard. For testing purposes, you can create a single-member replica set.

For each shard, use the following steps to create the shard replica set.

1. Start each member of the shard replica set.

When starting each mongod, specify the mongod settings either via a configuration file or the command line.

Configuration File

If using a configuration file, set:

sharding:
   clusterRole: shardsvr
replication:
   replSetName: <replSetName>
net:
   bindIp: localhost,<ip address>

Start the mongod with the --config option set to the configuration file path.

mongod --config <path-to-config-file>

Command Line

If using command-line options, start the mongod with the --shardsvr, --replSet, --bind_ip, and other options as appropriate to your deployment. For example:

mongod --shardsvr --replSet <replSetName> --dbpath <path> --bind_ip localhost,<ip address of the mongod host>

For more information on startup parameters, see the mongod reference page.

2. Connect to one member of the shard replica set.

Connect a mongo shell to one of the replica set members.

mongo --host <hostname> --port <port>

3. Initiate the replica set.

From the mongo shell, run the rs.initiate() method.

rs.initiate() can take an optional replica set configuration document. In the replica set configuration document, include:

  • The _id field set to the replica set name specified in either the replication.replSetName setting or the --replSet option.
  • The members array with a document for each member of the replica set.

The following example initiates a three-member replica set.

Important

Run rs.initiate() on only one mongod instance for the replica set.

rs.initiate(
  {
    _id : "<replSetName>",
    members: [
      { _id : 0, host : "s1-mongo1.example.net:27018" },
      { _id : 1, host : "s1-mongo2.example.net:27018" },
      { _id : 2, host : "s1-mongo3.example.net:27018" }
    ]
  }
)
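Small mistakes in the configuration document, such as a duplicated member _id or a host without a port, are easy to make when writing one document per shard. The following sketch checks a document before it is passed to rs.initiate(). It is a hypothetical pre-flight helper, not a MongoDB API, and it is deliberately stricter than the server: it requires the explicit hostname:port form used in this tutorial and does not handle IPv6 literals. The name shard1 is a hypothetical replica set name.

```javascript
// Return a list of human-readable problems; an empty list means the
// document passed these (simplified) checks.
function validateReplSetConfig(config) {
  const problems = [];
  if (typeof config._id !== "string" || config._id.length === 0) {
    problems.push("_id must be a non-empty string (the replica set name)");
  }
  const members = Array.isArray(config.members) ? config.members : [];
  if (members.length === 0) problems.push("members must be a non-empty array");

  const seenIds = new Set();
  const seenHosts = new Set();
  for (const m of members) {
    if (!Number.isInteger(m._id) || seenIds.has(m._id)) {
      problems.push(`member _id ${m._id} is missing, non-integer, or duplicated`);
    }
    seenIds.add(m._id);
    if (typeof m.host !== "string" || !/^[^\s:]+:\d+$/.test(m.host)) {
      problems.push(`member host ${m.host} is not in "hostname:port" form`);
    } else if (seenHosts.has(m.host)) {
      problems.push(`member host ${m.host} is listed twice`);
    }
    seenHosts.add(m.host);
  }
  return problems;
}

const shardConfig = {
  _id: "shard1", // hypothetical replica set name
  members: [
    { _id: 0, host: "s1-mongo1.example.net:27018" },
    { _id: 1, host: "s1-mongo2.example.net:27018" },
    { _id: 2, host: "s1-mongo3.example.net:27018" },
  ],
};

console.log(validateReplSetConfig(shardConfig));
```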

Connect a mongos to the Sharded Cluster

1. Connect a mongos to the cluster.

Start a mongos using either a configuration file or a command line parameter to specify the config servers.

Configuration File

If using a configuration file, set sharding.configDB to the config server replica set name and at least one member of the replica set, in <replSetName>/<host:port> format.

Warning

Before you bind to other IP addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

sharding:
  configDB: <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019
net:
  bindIp: localhost,<ip address>

Start the mongos specifying the --config option and the path to the configuration file.

mongos --config <path-to-config>

For more information on the configuration file, see configuration options.

Command Line

If using command-line parameters, start the mongos and specify the --configdb, --bind_ip, and other options as appropriate to your deployment. For example:

Warning

Before you bind to other IP addresses, consider enabling access control and other security measures listed in Security Checklist to prevent unauthorized access.

mongos --configdb <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019 --bind_ip localhost,<ip address of the mongos host>

Include any other options as appropriate for your deployment.
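The <replSetName>/<host:port> value is the part most often mistyped, so the sketch below assembles it from its pieces and builds the matching mongos command line. The helper names are hypothetical, as are the replica set name configRS and the bind address, which reuses the sample address from earlier in this tutorial.

```javascript
// Join a config server replica set name and its members into the
// <replSetName>/<host:port>,<host:port> form used by configDB and --configdb.
function configDbString(replSetName, members) {
  if (members.length === 0) {
    throw new Error("list at least one config server member");
  }
  return `${replSetName}/${members.join(",")}`;
}

// Build the mongos argument list from the config server string and bind list.
function mongosArgs(configDb, bindIp) {
  return ["mongos", "--configdb", configDb, "--bind_ip", bindIp];
}

const configDb = configDbString("configRS", [ // hypothetical replica set name
  "cfg1.example.net:27019",
  "cfg2.example.net:27019",
]);

// 198.51.100.1 is the sample address used earlier, standing in for the mongos host.
console.log(mongosArgs(configDb, "localhost,198.51.100.1").join(" "));
```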

2. Connect to the mongos.

Connect a mongo shell to the mongos.

mongo --host <hostname> --port <port>

Once you have connected the mongo shell to the mongos, continue to the next procedure to add shards to the cluster.

Add Shards to the Cluster

In the mongo shell connected to the mongos, use the sh.addShard() method to add each shard to the cluster. If the shard is a replica set, specify the name of the replica set and specify a member of the set.

Tip

In production deployments, all shards should be replica sets.

The following operation adds a single shard replica set to the cluster:

sh.addShard("<replSetName>/s1-mongo1.example.net:27018")

Repeat to add all shards.

If, in a development environment, the shard is a standalone mongod instance, specify the instance’s hostname and port. The following operation is an example of adding a standalone mongod shard to the cluster:

sh.addShard("s1-mongo1.example.net:27018")
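When a cluster has several shards, the repeated sh.addShard() calls can be driven from a list. The sketch below assumes a hypothetical layout of two shard replica sets (shard1, shard2, and the host s2-mongo1.example.net are illustrative names). The sh object is passed in as a parameter so that the same function works with the mongo shell's sh helper and, as shown here, with a recording stub outside the shell.

```javascript
// Hypothetical shard layout: replica set name plus one reachable member.
const shards = [
  { replSetName: "shard1", seed: "s1-mongo1.example.net:27018" },
  { replSetName: "shard2", seed: "s2-mongo1.example.net:27018" },
];

// Replica set shards use "<replSetName>/<host:port>";
// a standalone shard (no replSetName) uses "<host:port>" alone.
function shardConnectionString({ replSetName, seed }) {
  return replSetName ? `${replSetName}/${seed}` : seed;
}

// Call sh.addShard() once per shard and collect the results.
function addAllShards(sh, shardList) {
  return shardList.map((s) => sh.addShard(shardConnectionString(s)));
}

// Outside the mongo shell, a stub that records calls keeps the sketch runnable.
const recorded = [];
const stubSh = {
  addShard: (uri) => {
    recorded.push(uri);
    return { ok: 1 };
  },
};

addAllShards(stubSh, shards);
console.log(recorded);
```

In the mongo shell connected to the mongos, the call would be addAllShards(sh, shards).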

Enable Sharding for a Database

From the mongo shell connected to the mongos, use the sh.enableSharding() method to enable sharding on the target database. Enabling sharding on a database makes it possible to shard collections within that database.

sh.enableSharding("<database>")

Shard a Collection using Hashed Sharding

From the mongo shell connected to the mongos, use the sh.shardCollection() method to shard a collection.

Note

You must have enabled sharding for the database where the collection resides. See Enable Sharding for a Database.

If the collection already contains data, you must create a hashed index on the shard key using the db.collection.createIndex() method before using sh.shardCollection().

If the collection is empty, MongoDB creates the index as part of sh.shardCollection().

The following operation shards the target collection using the hashed sharding strategy.

sh.shardCollection("<database>.<collection>", { <shard key> : "hashed" } )

  • You must specify the full namespace of the collection and the shard key.
  • Your selection of shard key affects the efficiency of sharding, as well as your ability to take advantage of certain sharding features such as zones. See the selection considerations listed in Hashed Sharding Shard Key.
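The ordering rule above (create the hashed index first when the collection already holds data, then shard) can be captured in one function. The following is a sketch under stated assumptions: shardWithHashedKey is a hypothetical helper, the namespace records.people and the field userId are illustrative, and sharding is assumed to be already enabled on the database. The sh and db objects are passed in so the function can use the mongo shell's globals or, as here, recording stubs.

```javascript
// Shard a collection on a hashed key, creating the index first if needed.
function shardWithHashedKey(sh, db, namespace, field, collectionHasData) {
  const dot = namespace.indexOf(".");
  if (dot <= 0 || dot === namespace.length - 1) {
    throw new Error("namespace must be <database>.<collection>");
  }
  const dbName = namespace.slice(0, dot);
  const collName = namespace.slice(dot + 1);
  const key = { [field]: "hashed" };

  if (collectionHasData) {
    // A populated collection needs the hashed index before sharding.
    db.getSiblingDB(dbName).getCollection(collName).createIndex(key);
  }
  // For an empty collection, sh.shardCollection() creates the index itself.
  return sh.shardCollection(namespace, key);
}

// Recording stubs so the sketch runs outside the mongo shell.
const calls = [];
const fakeSh = {
  shardCollection: (ns, key) => {
    calls.push(["shardCollection", ns, key]);
    return { ok: 1 };
  },
};
const fakeDb = {
  getSiblingDB: (name) => ({
    getCollection: (coll) => ({
      createIndex: (key) => {
        calls.push(["createIndex", `${name}.${coll}`, key]);
        return { ok: 1 };
      },
    }),
  }),
};

// "records.people" and "userId" are hypothetical example values.
shardWithHashedKey(fakeSh, fakeDb, "records.people", "userId", true);
console.log(JSON.stringify(calls));
```

In the mongo shell connected to the mongos, the equivalent call would be shardWithHashedKey(sh, db, "<database>.<collection>", "<shard key>", true).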