Sharding

Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.

Database systems with large data sets or high throughput applications can challenge the capacity of a single server. For example, high query rates can exhaust the CPU capacity of the server. Working set sizes larger than the system’s RAM stress the I/O capacity of disk drives.

There are two methods for addressing system growth: vertical and horizontal scaling.

Vertical Scaling involves increasing the capacity of a single server, such as using a more powerful CPU, adding more RAM, or increasing the amount of storage space. Limitations in available technology may restrict a single machine from being sufficiently powerful for a given workload. Additionally, Cloud-based providers have hard ceilings based on available hardware configurations. As a result, there is a practical maximum for vertical scaling.

Horizontal Scaling involves dividing the system dataset and load over multiple servers, adding additional servers to increase capacity as required. While the overall speed or capacity of a single machine may not be high, each machine handles a subset of the overall workload, potentially providing better efficiency than a single high-speed high-capacity server. Expanding the capacity of the deployment only requires adding additional servers as needed, which can be a lower overall cost than high-end hardware for a single machine. The trade off is increased complexity in infrastructure and maintenance for the deployment.

MongoDB supports horizontal scaling through sharding.

Sharded Cluster

A MongoDB sharded cluster consists of the following components:

  • shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
  • mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
  • config servers: Config servers store metadata and configuration settings for the cluster. As of MongoDB 3.2, config servers can be deployed as a replica set.

The following graphic describes the interaction of components within a sharded cluster:

Diagram of a sample sharded cluster for production purposes.  Contains exactly 3 config servers, 2 or more ``mongos`` query routers, and at least 2 shards. The shards are replica sets.

MongoDB shards data at the collection level, distributing the collection data across the shards in the cluster.

Shard Keys

To distribute the documents in a collection, MongoDB partitions the collection using the shard key. The shard key consists of an immutable field or fields that exist in every document in the target collection.

You choose the shard key when sharding a collection. The choice of shard key cannot be changed after sharding. A sharded collection can have only one shard key. See Shard Key Specification.

To shard a non-empty collection, the collection must have an index that starts with the shard key. For empty collections, MongoDB creates the index if the collection does not already have an appropriate index for the specified shard key. See Shard Key Indexes.

The choice of shard key affects the performance, efficiency, and scalability of a sharded cluster. A cluster with the best possible hardware and infrastructure can be bottlenecked by the choice of shard key. The choice of shard key and its backing index can also affect the sharding strategy that your cluster can use.

See the shard key documentation for more information.

Chunks

MongoDB partitions sharded data into chunks. Each chunk has an inclusive lower and exclusive upper range based on the shard key.

MongoDB migrates chunks across the shards in the sharded cluster using the sharded cluster balancer. The balancer attempts to achieve an even balance of chunks across all shards in the cluster.

See Data Partitioning with Chunks for more information.

Advantages of Sharding

Reads / Writes

MongoDB distributes the read and write workload across the shards in the sharded cluster, allowing each shard to process a subset of cluster operations. Both read and write workloads can be scaled horizontally across the cluster by adding more shards.

For queries that include the shard key or the prefix of a compound shard key, mongos can target the query at a specific shard or set of shards. These targeted operations are generally more efficient than broadcasting to every shard in the cluster.

Storage Capacity

Sharding distributes data across the shards in the cluster, allowing each shard to contain a subset of the total cluster data. As the data set grows, additional shards increase the storage capacity of the cluster.

High Availability

A sharded cluster can continue to perform partial read / write operations even if one or more shards are unavailable. While the subset of data on the unavailable shards cannot be accessed during the downtime, reads or writes directed at the available shards can still succeed.

MongoDB 3.2 allows you to deploy config servers as replica sets. A sharded cluster with a Config Server Replica Set (CSRS) can continue to process reads and writes as long as a majority of the replica set is available.

In production environments, individual shards should be deployed as replica sets, providing increased redundancy and availability.

Considerations Before Sharding

Sharded cluster infrastructure requirements and complexity requires care in planning, execution, and maintenance. Sharding also has certain operational requirements.See Operational Restrictions in Sharded Clusters for more information.

For queries that do not include the shard key or the prefix of a compound shard key, mongos performs a broadcast operation, querying all shards in the sharded cluster. These scatter/gather queries can be long running operations.

Careful consideration of shard key is necessary for ensuring cluster performance and efficiency. You cannot change the shard key after sharding, nor can you unshard a sharded collection. See Choosing a Shard Key.

Note

If you have an active support contract with MongoDB, consider contacting your account representative for assistance with sharded cluster planning and deployment.

Sharded and Non-Sharded Collections

A database can have a mixture of sharded and unsharded collections. Sharded collections are partitioned and distributed across the shards in the cluster. Unsharded collections are stored on a primary shard. Each database has its own primary shard.

Diagram of a primary shard. A primary shard contains non-sharded collections as well as chunks of documents from sharded collections. Shard A is the primary shard.

Connecting to a Sharded Cluster

You must connect to a mongos router to interact with any collection in the sharded cluster. This includes sharded and unsharded collections. Clients should never connect to a single shard in order to perform read or write operations.

Diagram of applications/drivers issuing queries to mongos for unsharded collection as well as sharded collection. Config servers not shown.

You can connect to a mongos the same way you connect to a mongod, such as using the mongo shell or a MongoDB driver.

Sharding Strategy

MongoDB supports two sharding strategies for distributing data across sharded clusters.

Hashed Sharding

Hashed Sharding involves computing a hash of the shard key field’s value. Each chunk is then assigned a range based on the hashed shard key values.

Tip

MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need to compute hashes.

Diagram of the hashed based segmentation.

While a range of shard keys may be “close”, their hashed values are unlikely to be on the same chunk. Data distribution based on hashed values facilitates more even data distribution, especially in data sets where the shard key changes monotonically.

However, hashed distribution means that ranged-based queries on the shard key are less likely to target a single shard, resulting in more cluster wide broadcast operations

See Hashed Sharding for more information.

Ranged Sharding

Ranged sharding involves dividing data into ranges based on the shard key values. Each chunk is then assigned a range based on the shard key values.

Diagram of the shard key value space segmented into smaller ranges or chunks.

A range of shard keys whose values are “close” are more likely to reside on the same chunk. This allows for targeted operations, as a mongos can route the query to only the shards that contain the required data.

The efficiency of ranged sharding depends on the shard key chosen. Poorly considered shard keys can result in uneven distribution of data, which can negate some benefits of sharding or can cause performance bottlenecks. See shard key selection for ranged sharding.

See Ranged Sharding for more information.

Tag Aware Sharding

In sharded clusters, you can tag specific ranges of the shard key and associate those tags with a shard or subset of shards. MongoDB routes reads and writes that fall into a tagged range only to those shards assigned that tag. Additionally, the balancer respects tags during balancing rounds by ensuring that each shard only contains data that does not violate its configured tag ranges.

Each tag has a range that consists of an inclusive lower bound and an exclusive upper bound. Administrators can assign one or more tags to each shard in the sharded cluster.

Diagram of tag-aware data distribution

A tag range must use fields from the shard key, respecting the order of the fields for compound shard keys. See shard keys in tag aware sharding.

When choosing a shard key, carefully consider the possibility of using tag aware sharding in the future, as you cannot change the shard key after sharding the collection.

Most commonly, tag aware sharding serves to improve the locality of data for sharded clusters that span multiple data centers.

See Tag Aware Sharding for more information.