Troubleshoot Sharded Clusters¶
On this page
- Application Servers or mongos Instances Become Unavailable
- A Single mongod Becomes Unavailable in a Shard
- All Members of a Shard Become Unavailable
- A Config Server Replica Set Member Becomes Unavailable
- Cursor Fails Because of Stale Config Data
- Shard Keys and Cluster Availability
- Config Database String Error
- Avoid Downtime when Moving Config Servers
- moveChunk commit failed Error
This page describes common strategies for troubleshooting sharded cluster deployments.
Cursor Fails Because of Stale Config Data¶
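A query may return the following warning when one or more mongos instances have not yet refreshed their cached copy of the cluster metadata from the config database:
could not initialize cursor across all shards because : stale config detected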
This warning should not propagate back to your application. The warning will repeat until all the mongos instances refresh their caches. To force an instance to refresh its cache, run the flushRouterConfig command.
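As a rough illustration of forcing the refresh on every router, the sketch below runs flushRouterConfig against each mongos in turn. It assumes client objects that expose `admin.command(...)` the way PyMongo's `MongoClient` does; the function name `flush_router_configs` and the labels are hypothetical and not part of MongoDB.

```python
from typing import Any, Dict, Mapping


def flush_router_configs(clients: Mapping[str, Any]) -> Dict[str, bool]:
    """Run flushRouterConfig on each mongos and report which calls succeeded.

    `clients` maps a label (for example "host:port") to a connected client
    object that exposes `admin.command(...)`, as pymongo.MongoClient does.
    """
    outcome: Dict[str, bool] = {}
    for label, client in clients.items():
        try:
            reply = client.admin.command("flushRouterConfig")
            # Command replies carry an "ok" field; a value of 1 means success.
            outcome[label] = reply.get("ok") == 1
        except Exception:
            # An unreachable or failing mongos is recorded, not raised, so
            # the remaining routers are still flushed.
            outcome[label] = False
    return outcome
```

With PyMongo installed, a call might look like `flush_router_configs({"mongos1": MongoClient("mongodb://mongos1.example.net:27017")})`, where the hostname is a placeholder. Any router reported as `False` still holds its old cache and needs to be retried.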
Config Database String Error¶
Changed in version 3.2.
Starting in MongoDB 3.2, config servers can be deployed as replica sets. The mongos instances for the sharded cluster must specify the same config server replica set name, but they can specify the hostnames and ports of different members of the replica set.
Starting in 3.4, the use of the deprecated mirrored mongod instances as config servers (SCCC) is no longer supported. Before you can upgrade your sharded clusters to 3.4, you must convert your config servers from SCCC to CSRS.
To convert your config servers from SCCC to CSRS, see Upgrade Config Servers to Replica Set.
In earlier versions of MongoDB, sharded clusters that use three mirrored mongod instances as config servers require every mongos instance in the cluster to specify an identical configDB string.
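The two rules in this section can be checked mechanically. The following is a minimal sketch, assuming the replica set form of the string is `<replSetName>/<host:port>,<host:port>,...` and the mirrored form is a plain comma-separated host list. The helper names `parse_config_db` and `config_db_consistent` are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def parse_config_db(value: str) -> Tuple[Optional[str], List[str]]:
    """Split a configDB string into (replica set name or None, host list)."""
    set_name: Optional[str] = None
    hosts_part = value.strip()
    if "/" in hosts_part:
        # Replica set (CSRS) form: <replSetName>/<host:port>,<host:port>,...
        set_name, hosts_part = hosts_part.split("/", 1)
    hosts = [h.strip() for h in hosts_part.split(",") if h.strip()]
    return set_name, hosts


def config_db_consistent(values: Iterable[str]) -> bool:
    """Return True if the configDB strings of all mongos instances agree."""
    values = list(values)
    names = {parse_config_db(v)[0] for v in values}
    if None in names:
        # Mirrored (SCCC) form: every mongos must use the identical string.
        return len(set(values)) == 1
    # CSRS form: only the replica set name must match; hosts may differ.
    # An empty input has no names and is reported as inconsistent.
    return len(names) == 1
```

For example, `config_db_consistent(["cfg/a:27019", "cfg/b:27019,c:27019"])` passes because both strings name the same replica set, while two mirrored strings that list the same hosts in a different order do not.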
Avoid Downtime when Moving Config Servers¶
Use CNAMEs to identify your config servers to the cluster so that you can rename and renumber your config servers without downtime.
moveChunk commit failed Error¶
At the end of a chunk migration, the shard must connect to the config database to update the chunk’s record in the cluster metadata. If the shard fails to connect to the config database, MongoDB reports the following errors:
ERROR: moveChunk commit failed: version is at <n>|<nn> instead of <N>|<NN>
ERROR: TERMINATING
When this happens, the primary member of the shard’s replica set terminates to protect data consistency. If a secondary member can access the config database, data on the shard becomes accessible again after an election.
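When scanning logs for this failure, a small parser can pull out the two version pairs. This sketch assumes the `<n>|<nn>` placeholders in the message shown above stand for two integers separated by a pipe. Real log lines may carry timestamps or other prefixes, so the pattern searches anywhere in the line. The function name `parse_commit_failure` is hypothetical.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int]

# Pattern modelled on the error text quoted in this section (an assumption
# about the concrete format, not taken from the server source).
_COMMIT_FAILED = re.compile(
    r"moveChunk commit failed: version is at (\d+)\|(\d+) instead of (\d+)\|(\d+)"
)


def parse_commit_failure(line: str) -> Optional[Tuple[Version, Version]]:
    """Return ((found pair), (expected pair)) or None if the line does not match."""
    match = _COMMIT_FAILED.search(line)
    if match is None:
        return None
    found_a, found_b, want_a, want_b = (int(g) for g in match.groups())
    return (found_a, found_b), (want_a, want_b)
```

A non-`None` result identifies the shard whose primary terminated, which is the one to check for a completed election.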