Troubleshoot Replica Sets¶
This section describes common strategies for troubleshooting replica set deployments.
Check Replica Set Status¶
To display the current state of the replica set and current state of
each member, run the
rs.status() method in a
mongosh session that is connected to the replica set's
primary. For descriptions of the information displayed by
rs.status(), see replSetGetStatus.
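As a rough illustration of reading that output, the sketch below condenses the `members` array of an `rs.status()` result into one line per member. The helper name `summarizeMembers` and the sample object are hypothetical; only the `name`, `stateStr`, and `health` fields are taken from the `replSetGetStatus` output.

```javascript
// Condense the members array of an rs.status() result.
// summarizeMembers is a hypothetical helper, not part of mongosh.
function summarizeMembers(status) {
  return status.members.map((m) => ({
    name: m.name,
    state: m.stateStr,
    healthy: m.health === 1, // health is 1 when the member is up, 0 otherwise
  }));
}

// Hypothetical sample shaped like rs.status() output, so the sketch
// runs without a deployment.
const sampleStatus = {
  set: "rs0",
  members: [
    { name: "m1.example.net:27017", stateStr: "PRIMARY", health: 1 },
    { name: "m2.example.net:27017", stateStr: "SECONDARY", health: 1 },
    { name: "m3.example.net:27017", stateStr: "(not reachable/healthy)", health: 0 },
  ],
};

const memberSummary = summarizeMembers(sampleStatus);
console.log(memberSummary);
```

In a mongosh session you would pass the live result instead, for example `summarizeMembers(rs.status())`.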
Check the Replication Lag¶
Replication lag is a delay between an operation on the primary and the application of that operation from the oplog to the secondary. Replication lag can be a significant issue and can seriously affect MongoDB replica set deployments. Excessive replication lag makes "lagged" members ineligible to quickly become primary and increases the possibility that distributed read operations will be inconsistent.
To check the current length of replication lag:
- In a mongosh session that is connected to the primary, call the rs.printSecondaryReplicationInfo() method. It returns the syncedTo value for each member, which shows the time when the last oplog entry was written to the secondary, as shown in the following example:
  source: m1.example.net:27017
      syncedTo: Thu Apr 10 2014 10:27:47 GMT-0400 (EDT)
      0 secs (0 hrs) behind the primary
  source: m2.example.net:27017
      syncedTo: Thu Apr 10 2014 10:27:47 GMT-0400 (EDT)
      0 secs (0 hrs) behind the primary
- Monitor the rate of replication by checking for non-zero or increasing oplog time values in the Replication Lag graph available in Cloud Manager and in Ops Manager.
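The lag figure can also be approximated by hand from an `rs.status()` result by comparing each secondary's `optimeDate` with the primary's. The following is a minimal sketch under that assumption; `replicationLagSeconds` and the sample data are hypothetical and are not a substitute for the built-in methods.

```javascript
// Approximate per-secondary lag as the difference between the primary's
// and the secondary's last applied operation time (optimeDate).
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no primary found in status");
  }
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      lagSeconds:
        (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000,
    }));
}

// Hypothetical sample shaped like rs.status() output.
const lagSample = {
  members: [
    { name: "m1.example.net:27017", stateStr: "PRIMARY", optimeDate: new Date("2014-04-10T14:27:47Z") },
    { name: "m2.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2014-04-10T14:27:47Z") },
    { name: "m3.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2014-04-10T14:27:17Z") },
  ],
};

const lags = replicationLagSeconds(lagSample);
console.log(lags);
```

A value that stays near zero is healthy. A value that keeps growing between checks suggests the secondary cannot keep up.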
Replication Lag Causes¶
Possible causes of replication lag include:
Network Latency
Check the network routes between the members of your set to ensure that there is no packet loss or network routing issue.
Use tools including ping to test latency between set members and traceroute to expose the routing of packets between network endpoints.
Disk Throughput
If the file system and disk device on the secondary are unable to flush data to disk as quickly as the primary, then the secondary will have difficulty keeping up with the primary's state. Disk-related issues are especially common on multi-tenant systems, including virtualized instances, and can be transient if the system accesses disk devices over an IP network (as is the case with Amazon's EBS system).
Use system-level tools to assess disk status, including iostat or vmstat.
Concurrency
In some cases, long-running operations on the primary can block replication on secondaries. For best results, configure write concern to require confirmation of replication to secondaries. This prevents write operations from returning if replication cannot keep up with the write load.
You can also use the database profiler to see if there are slow queries or long-running operations that correspond to the periods of lag.
Appropriate Write Concern
If you are performing a large data ingestion or bulk load operation that requires a large number of writes to the primary, particularly with
unacknowledged write concern, the secondaries will not be able to read the oplog fast enough to keep up with changes.
To prevent this, request write acknowledgement (an acknowledged write concern) after every 100, 1,000, or some other number of writes, to give the secondaries an opportunity to catch up with the primary.
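One way to apply this during a bulk load is to split the documents into fixed-size batches and wait for an acknowledged write at the end of each batch. The sketch below shows only the batching logic. `loadInBatches` is a hypothetical helper, and the actual insert is passed in as a callback so that the example runs without a deployment.

```javascript
// Split docs into batches of batchSize and hand each batch to insertBatch.
// insertBatch is expected to return only once the write is acknowledged.
function loadInBatches(docs, batchSize, insertBatch) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  let batches = 0;
  for (let i = 0; i < docs.length; i += batchSize) {
    insertBatch(docs.slice(i, i + batchSize));
    batches += 1;
  }
  return batches;
}

// Stand-in callback that records batch sizes instead of writing anything.
const receivedBatchSizes = [];
const batchCount = loadInBatches(
  Array.from({ length: 2500 }, (_, i) => ({ _id: i })),
  1000,
  (batch) => receivedBatchSizes.push(batch.length)
);
console.log(batchCount, receivedBatchSizes);
```

In mongosh, the callback could be something like `(batch) => db.events.insertMany(batch, { writeConcern: { w: "majority" } })`, where the `events` collection name is an assumption for illustration. The batch size is a tuning choice: smaller batches give secondaries more frequent chances to catch up at the cost of more round trips.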
Flow Control
Starting in MongoDB 4.2, administrators can limit the rate at which
the primary applies its writes with the goal of keeping the
majority committed lag under
a configurable maximum value, flowControlTargetLagSeconds.
By default, flow control is enabled.
With flow control enabled, as the lag grows close to the
flowControlTargetLagSeconds, writes on the primary must obtain
tickets before taking locks to apply writes. By limiting the number of
tickets issued per second, the flow control mechanism attempts to keep
the lag under the target.
For information on flow control statistics, see the flowControl section of the serverStatus command output.
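As a hedged illustration, the sketch below turns a `flowControl` document, as returned in the `serverStatus` output on MongoDB 4.2 or later, into a one-line description. `describeFlowControl` and the sample values are hypothetical; the sketch reads only the `enabled`, `isLagged`, `isLaggedCount`, and `targetRateLimit` fields.

```javascript
// Describe the state of flow control from a serverStatus().flowControl document.
// describeFlowControl is a hypothetical helper, not part of mongosh.
function describeFlowControl(fc) {
  if (!fc.enabled) {
    return "flow control disabled";
  }
  const state = fc.isLagged ? "engaged" : "idle";
  return (
    `flow control ${state}; ` +
    `target rate limit ${fc.targetRateLimit} tickets/s; ` +
    `engaged ${fc.isLaggedCount} time(s) since startup`
  );
}

// Hypothetical sample values, so the sketch runs without a deployment.
const flowSample = {
  enabled: true,
  isLagged: true,
  isLaggedCount: 4,
  targetRateLimit: 250,
};

const flowSummary = describeFlowControl(flowSample);
console.log(flowSummary);
```

In mongosh you would pass `db.serverStatus().flowControl` instead of the sample object.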
Slow Application of Oplog Entries¶
Starting in version 4.2 (also available starting in 4.0.6), secondary members of a replica set now
log oplog entries that take longer than the slow
operation threshold to apply. These slow oplog messages are logged
for the secondaries in the
diagnostic log under the
REPL component with the text
op: <oplog entry> took <num>ms. These slow oplog entries depend
only on the slow operation threshold. They do not depend on the log
levels (either at the system or component level), or the profiling
level, or the slow operation sample rate. The profiler does not
capture slow oplog entries.
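To get a quick picture of how slow these entries are, you could scan log lines for the text pattern described above. This is a sketch only: `slowOplogDurations` is a hypothetical helper and the sample lines are invented for illustration. Real log lines differ by version (MongoDB 4.4 and later write structured JSON logs), so the pattern would need adjusting.

```javascript
// Extract the durations (in ms) of slow oplog applications from
// plain-text log lines that mention the REPL component.
function slowOplogDurations(lines) {
  const pattern = /op: .* took (\d+)ms/;
  return lines
    .filter((line) => line.includes("REPL"))
    .map((line) => pattern.exec(line))
    .filter((match) => match !== null)
    .map((match) => Number(match[1]));
}

// Invented sample lines; not copied from a real deployment.
const sampleLogLines = [
  'I  REPL     [repl-writer-worker-1] op: { op: "u", ns: "test.items" } took 152ms',
  "I  COMMAND  [conn12] command test.items took 300ms",
  "I  REPL     [replexec-0] Member m2.example.net:27017 is now in state SECONDARY",
  'I  REPL     [repl-writer-worker-2] op: { op: "i", ns: "test.items" } took 1040ms',
];

const slowDurations = slowOplogDurations(sampleLogLines);
console.log(slowDurations);
```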
Test Connections Between all Members¶
All members of a replica set must be able to connect to every other member of the set to support replication. Always verify connections in both "directions." Networking topologies and firewall configurations can prevent normal and required connectivity, which can block replication.
Changed in version 3.6: Starting in MongoDB 3.6, the MongoDB binaries, mongod and mongos, bind to localhost by default. If the net.ipv6 configuration file setting or the --ipv6 command-line option is set for the binary, the binary additionally binds to the localhost IPv6 address.
Previously, starting from MongoDB 2.6, only the binaries from the official MongoDB RPM (Red Hat, CentOS, Fedora Linux, and derivatives) and DEB (Debian, Ubuntu, and derivatives) packages bound to localhost by default.
When bound only to localhost, these MongoDB 3.6 binaries can only accept connections from clients (including other members of your deployment in replica sets and sharded clusters) that are running on the same machine. Remote clients cannot connect to binaries bound only to localhost.
To override and bind to other IP addresses, use the net.bindIp configuration file setting or the --bind_ip command-line option to specify a list of hostnames or IP addresses.
For example, the following mongod instance binds to both localhost and the hostname My-Example-Associated-Hostname, which is associated with the IP address 198.51.100.1:
mongod --bind_ip localhost,My-Example-Associated-Hostname
To connect to this instance, remote clients must specify the hostname or its associated IP address 198.51.100.1:
mongosh --host My-Example-Associated-Hostname
mongosh --host 198.51.100.1
Consider the following example of a bidirectional test of networking:
Given a replica set with three members running on three separate hosts:
- m1.example.net
- m2.example.net
- m3.example.net
All three use the default port 27017.
Test the connection from m1.example.net to the other hosts with the following operation set from m1.example.net:
mongosh --host m2.example.net --port 27017
mongosh --host m3.example.net --port 27017
Test the connection from m2.example.net to the other two hosts with the following operation set from m2.example.net, as in:
mongosh --host m1.example.net --port 27017
mongosh --host m3.example.net --port 27017
You have now tested the connection between m2.example.net and m1.example.net in both directions.
Test the connection from m3.example.net to the other two hosts with the following operation set from the m3.example.net host, as in:
mongosh --host m1.example.net --port 27017
mongosh --host m2.example.net --port 27017
If any connection, in any direction, fails, check your networking and firewall configuration and reconfigure your environment to allow these connections.
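For larger sets, writing out every pair by hand is error-prone. The sketch below generates the full list of checks, one per ordered pair of hosts, so that each connection is covered in both directions. `connectionTestCommands` is a hypothetical helper; it only builds the command strings and does not run them.

```javascript
// Build one mongosh connection check per ordered (from, to) pair of hosts.
function connectionTestCommands(hosts, port) {
  const plan = [];
  for (const from of hosts) {
    for (const to of hosts) {
      if (from !== to) {
        plan.push({
          runOn: from,
          command: `mongosh --host ${to} --port ${port}`,
        });
      }
    }
  }
  return plan;
}

const connectionPlan = connectionTestCommands(
  ["m1.example.net", "m2.example.net", "m3.example.net"],
  27017
);
for (const step of connectionPlan) {
  console.log(`on ${step.runOn}: ${step.command}`);
}
```

A set of n members yields n × (n − 1) checks, so the three-member example produces six.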
Socket Exceptions when Rebooting More than One Secondary¶
When you reboot members of a replica set, ensure that the set is able
to elect a primary during the maintenance. This means ensuring that a majority of the set's votes remain available.
When a set's active members can no longer form a majority, the set's primary steps down and becomes a secondary. Starting in MongoDB 4.2, when the primary steps down, it no longer closes all client connections. In MongoDB 4.0 and earlier, when the primary steps down, it closes all client connections.
Clients cannot write to the replica set until the members elect a new primary.
Given a three-member replica set where every member has one vote, the set can elect a primary if at least two members can connect to each other. If you reboot the two secondaries at once, the primary steps down and becomes a secondary. Until at least one of the rebooted secondaries becomes available again, the set has no primary and cannot elect a new one.
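The majority rule in this example can be checked with simple arithmetic before you plan maintenance. The following sketch assumes you already know how many votes the set has and how many will remain reachable; `canElectPrimary` is a hypothetical helper.

```javascript
// A primary can be elected only if strictly more than half of all
// votes in the set are reachable.
function canElectPrimary(totalVotes, reachableVotes) {
  return reachableVotes * 2 > totalVotes;
}

// Three-member set, one vote each:
console.log(canElectPrimary(3, 2)); // one secondary rebooting
console.log(canElectPrimary(3, 1)); // both secondaries rebooting
```

With three votes, two reachable members form a majority and one does not, which is why rebooting both secondaries at once leaves the set without a primary.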
For more information on votes, see Replica Set Elections. For
related information on connection errors, see Does TCP
keepalive time affect MongoDB Deployments?.
Check the Size of the Oplog¶
A larger oplog can give a replica set a greater tolerance for lag, and make the set more resilient.
To check the size of the oplog for a given replica set member, connect to the member in mongosh and run the rs.printReplicationInfo() method.
The output displays the size of the oplog and the date ranges of the operations contained in the oplog. In the following example, the oplog is about 10 MB and is able to fit about 26 hours (94400 seconds) of operations:
configured oplog size:   10.10546875MB
log length start to end: 94400 (26.22hrs)
oplog first event time:  Mon Mar 19 2012 13:50:38 GMT-0400 (EDT)
oplog last event time:   Wed Oct 03 2012 14:59:10 GMT-0400 (EDT)
now:                     Wed Oct 03 2012 15:00:21 GMT-0400 (EDT)
The oplog should be long enough to hold all transactions for the longest downtime you expect on a secondary. At a minimum, an oplog should be able to hold 24 hours of operations; however, many users prefer to have 72 hours or even a week's worth of operations.
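To compare the oplog window against the downtime you expect, you can work from the time span that the oplog currently covers. The sketch below assumes an object with a `timeDiff` field in seconds, as returned by `db.getReplicationInfo()` in mongosh. `oplogCoversDowntime` is a hypothetical helper, and the sample values are those used in this section's example.

```javascript
// Compare the oplog window (in hours) with the longest expected downtime.
function oplogCoversDowntime(info, expectedDowntimeHours) {
  const windowHours = info.timeDiff / 3600; // timeDiff is in seconds
  return {
    windowHours,
    covers: windowHours >= expectedDowntimeHours,
  };
}

// Values from the example in this section: about 10 MB, 94400 seconds.
const oplogInfo = { logSizeMB: 10.10546875, timeDiff: 94400 };

const check24 = oplogCoversDowntime(oplogInfo, 24);
const check72 = oplogCoversDowntime(oplogInfo, 72);
console.log(check24.windowHours.toFixed(2), check24.covers, check72.covers);
```

Here the window of roughly 26 hours meets the 24-hour minimum but falls well short of a 72-hour target. Keep in mind that the window reflects the recent write rate, so it shrinks when write volume rises.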
You normally want the oplog to be the same size on all members. If you resize the oplog, resize it on all members.
To change oplog size, see the Change the Size of the Oplog tutorial.
Starting in MongoDB 4.0, the oplog can grow past its configured size limit to avoid deleting the majority commit point.