This version will reach end of life in February 2018. Learn more about upgrading your version of MongoDB.

Split Chunks in a Sharded Cluster

Normally, MongoDB splits a chunk after an insert if the chunk exceeds the maximum chunk size. However, you may want to split chunks manually if:

  • you have a large amount of data in your cluster and very few chunks, as is the case after deploying a cluster using existing data.
  • you expect to add a large amount of data that would initially reside in a single chunk or shard. For example, you plan to insert a large amount of data with shard key values between 300 and 400, but all shard key values between 250 and 500 are in a single chunk.
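
As a rough sketch of the second case, the following snippet computes evenly spaced split points for the 300 to 400 range and splits at each one. The namespace records.people, the shard key field x, and the helper evenSplitPoints are hypothetical and chosen only for illustration; the sh.splitAt() calls run only inside the mongo shell.

```javascript
// Hypothetical helper: returns pieces + 1 evenly spaced boundaries
// from lower to upper, inclusive.
function evenSplitPoints(lower, upper, pieces) {
  var points = [];
  for (var i = 0; i <= pieces; i++) {
    points.push(lower + (i * (upper - lower)) / pieces);
  }
  return points;
}

// Boundaries for the 300 to 400 range from the example above.
var points = evenSplitPoints(300, 400, 4);

// Only attempt the splits inside the mongo shell, where the sh
// helper object is defined.
if (typeof sh !== "undefined") {
  points.forEach(function (p) {
    // "records.people" and the shard key field "x" are assumptions.
    sh.splitAt("records.people", { x: p });
  });
}
```

Splitting at both 300 and 400 isolates the incoming range from the rest of the 250 to 500 chunk; the interior points then divide that range into smaller chunks before the data arrives.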

Note

New in version 2.6: MongoDB provides the mergeChunks command to combine contiguous chunk ranges into a single chunk. See Merge Chunks in a Sharded Cluster for more information.

The balancer may migrate recently split chunks to a new shard immediately if mongos predicts future insertions will benefit from the move. The balancer does not distinguish between chunks split manually and those split automatically by the system.

Warning

Be careful when splitting data in a sharded collection to create new chunks. When you shard a collection that has existing data, MongoDB automatically creates chunks to evenly distribute the collection. To split data effectively in a sharded cluster you must consider the number of documents in a chunk and the average document size to create a uniform chunk size. When chunks have irregular sizes, shards may have an equal number of chunks but have very different data sizes. Avoid creating splits that lead to a collection with differently sized chunks.

Use sh.status() to determine the current chunk ranges across the cluster.
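
To make the sizing consideration in the warning concrete, here is a minimal sketch that estimates how many documents fit in one chunk. The 64 MB chunk size and the 1 KB average document size are illustrative assumptions, and docsPerChunk is a hypothetical helper; in the mongo shell, the average document size can be read from the avgObjSize field of db.collection.stats().

```javascript
// Hypothetical helper: estimated number of documents that fit in
// one chunk of the given size.
function docsPerChunk(chunkSizeBytes, avgDocBytes) {
  return Math.floor(chunkSizeBytes / avgDocBytes);
}

// Assumed values for illustration only.
var chunkSizeBytes = 64 * 1024 * 1024; // 64 MB chunk size
var avgDocBytes = 1024;                // 1 KB average document

var estimate = docsPerChunk(chunkSizeBytes, avgDocBytes);
```

Placing split points roughly this many documents apart should keep the resulting chunks close to a uniform size.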

To split chunks manually, use the split command with either the middle field or the find field. The mongo shell provides the helper methods sh.splitFind() and sh.splitAt().
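
For illustration, the split command documents behind the two helpers might be built as follows: sh.splitFind() corresponds to the find field and sh.splitAt() to the middle field. buildSplitCommand is a hypothetical helper, and the namespace and query reuse the records.people example from this page; the db.adminCommand() call runs only in a mongo shell connected to a mongos.

```javascript
// Hypothetical helper: builds a split command document that uses
// either the "find" or the "middle" field.
function buildSplitCommand(namespace, field, doc) {
  if (field !== "find" && field !== "middle") {
    throw new Error('field must be "find" or "middle"');
  }
  var cmd = { split: namespace };
  cmd[field] = doc;
  return cmd;
}

// Equivalent of sh.splitFind(): split the matching chunk at its median.
var byFind = buildSplitCommand("records.people", "find", { zipcode: "63109" });

// Equivalent of sh.splitAt(): split exactly at the given shard key value.
var byMiddle = buildSplitCommand("records.people", "middle", { zipcode: "63109" });

// Run the command only inside the mongo shell, where db is defined.
if (typeof db !== "undefined" && typeof db.adminCommand === "function") {
  db.adminCommand(byFind);
}
```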

splitFind() splits the chunk that contains the first document matching the query into two equally sized chunks. You must specify the full namespace (i.e. “<database>.<collection>”) of the sharded collection to splitFind(). The query in splitFind() does not need to use the shard key, though it nearly always makes sense to do so.

Example

The following command splits the chunk that contains the value of 63109 for the zipcode field in the people collection of the records database:

sh.splitFind( "records.people", { "zipcode": "63109" } )

Use splitAt() to split a chunk in two, using the queried document as the lower bound in the new chunk:

Example

The following command splits the chunk that contains the value of 63109 for the zipcode field in the people collection of the records database, making that value the lower bound of the new chunk:

sh.splitAt( "records.people", { "zipcode": "63109" } )

Note

splitAt() does not necessarily split the chunk into two equally sized chunks. The split occurs at the location of the document matching the query, regardless of where that document is in the chunk.
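
The difference between the two helpers can be illustrated with a toy model in which a chunk is just a sorted array of shard key values. This is a sketch for intuition only, not MongoDB's implementation; splitAtMedian and splitAtValue are hypothetical names.

```javascript
// Toy model only: a chunk is a sorted array of shard key values.

// Models splitFind(): split the chunk at its median.
function splitAtMedian(keys) {
  var mid = Math.floor(keys.length / 2);
  return [keys.slice(0, mid), keys.slice(mid)];
}

// Models splitAt(): the given value becomes the lower bound
// of the new (right-hand) chunk.
function splitAtValue(keys, value) {
  var idx = 0;
  while (idx < keys.length && keys[idx] < value) {
    idx++;
  }
  return [keys.slice(0, idx), keys.slice(idx)];
}

var chunk = [10, 20, 30, 40, 50, 60];
var even = splitAtMedian(chunk);      // two chunks of 3 keys each
var uneven = splitAtValue(chunk, 20); // chunks of 1 key and 5 keys
```

In this model, a split point near the edge of a chunk leaves one small chunk and one large one, which is the uneven result the note describes.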