Replica Set Data Synchronization
In order to maintain up-to-date copies of the shared data set, members of a replica set sync or replicate data from other members. MongoDB uses two forms of data synchronization: initial sync to populate new members with the full data set, and replication to apply ongoing changes to the entire data set.
Initial sync copies all the data from one member of the replica set to another member. A member uses initial sync when the member has no data, such as when the member is new, or when the member has data but is missing a history of the set’s replication.
When you perform an initial sync, MongoDB:
1. Clones all databases. To clone, the mongod queries every collection in each source database and inserts all data into its own copies of these collections. At this time, _id indexes are also built. The clone process only copies valid data, omitting invalid documents.
2. Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect the current state of the replica set.
3. Builds all indexes on all collections (except _id indexes, which were already completed).
Changed in version 3.0: When the clone process omits an invalid document from the sync, MongoDB writes a message to the logs that begins with "found corrupt document in <collection>".
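The three phases of an initial sync (clone, apply oplog changes, build remaining indexes) can be illustrated with a small in-memory model. This is only a sketch, not how mongod is implemented: the function names, the simplified oplog entry shape (`op`, `ns`, `_id`, `doc`, `set`), and the validity check are all assumptions made for illustration.

```python
# Toy model of the initial-sync phases; NOT MongoDB's actual implementation.
# The entry shapes and helper names below are hypothetical.

def is_valid(doc):
    # Assumption: a document is "valid" if it is a dict carrying an _id.
    return isinstance(doc, dict) and "_id" in doc


def initial_sync(source_dbs, source_oplog, index_specs):
    log = []

    # Phase 1: clone every collection. Keying by _id stands in for the
    # _id index that is built during the clone.
    local = {}
    for ns, docs in source_dbs.items():
        local[ns] = {}
        for doc in docs:
            if is_valid(doc):
                local[ns][doc["_id"]] = dict(doc)
            else:
                # Invalid documents are omitted and a message is logged.
                log.append(f"found corrupt document in {ns}")

    # Phase 2: replay changes recorded in the source's oplog.
    for entry in source_oplog:
        coll = local.setdefault(entry["ns"], {})
        if entry["op"] == "i":
            coll[entry["doc"]["_id"]] = dict(entry["doc"])
        elif entry["op"] == "u":
            target = coll.get(entry["_id"])
            if target is not None:
                target.update(entry["set"])
        elif entry["op"] == "d":
            coll.pop(entry["_id"], None)

    # Phase 3: build the remaining (non-_id) indexes.
    indexes = {}
    for ns, fields in index_specs.items():
        for field in fields:
            idx = {}
            for doc_id, doc in local.get(ns, {}).items():
                idx.setdefault(doc.get(field), []).append(doc_id)
            indexes[(ns, field)] = idx

    return local, indexes, log
```

Calling `initial_sync` with a source collection that contains one malformed entry returns the cloned data without that entry, plus a single log line naming the collection.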
To perform an initial sync, see Resync a Member of a Replica Set.
Replica set members replicate data continuously after the initial sync. This process keeps the members up to date with all changes to the replica set’s data. In most cases, secondaries synchronize from the primary. Secondaries may automatically change their sync targets if needed based on changes in the ping time and state of other members’ replication.
If a secondary member has buildIndexes set to true, it can only sync from other members where buildIndexes is true. Members where buildIndexes is false can sync from any other member, barring other sync restrictions. buildIndexes is true by default.
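As a rough illustration of how the buildIndexes restriction might combine with ping time when choosing a sync target, consider the sketch below. It assumes the rule that a member that builds indexes may only sync from another member that builds indexes, while a member that does not build indexes may sync from any member. The member dictionaries, the `ping_ms` field, and both function names are hypothetical, and the real selection logic considers more than these two factors.

```python
# Simplified sync-target selection; the field and function names are hypothetical.

def can_sync_from(member, candidate):
    # buildIndexes is treated as true when it is not set.
    member_builds = bool(member.get("buildIndexes", True))
    candidate_builds = bool(candidate.get("buildIndexes", True))
    # A member that builds indexes needs a source that also builds them.
    return (not member_builds) or candidate_builds


def pick_sync_source(member, candidates):
    # Among eligible candidates, prefer the lowest ping time (an assumption
    # that ignores replication state and other restrictions).
    eligible = [c for c in candidates if can_sync_from(member, c)]
    return min(eligible, key=lambda c: c["ping_ms"], default=None)
```

With this model, a member that builds indexes skips a closer candidate that has buildIndexes set to false and picks the nearest eligible one instead.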
Validity and Durability
A replica set can have at most one primary, and only the primary can accept write operations. Secondaries apply operations from the primary asynchronously to provide eventual consistency.
Journaling provides single-instance write durability. Without journaling, if a MongoDB instance terminates ungracefully, you must assume that the database is in an invalid state.
In MongoDB, clients can see the results of writes before they are made durable.
MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups batches by namespace (MMAPv1) or by document id (WiredTiger) and simultaneously applies each group of operations using a different thread. MongoDB always applies write operations to a given document in their original write order.
While applying a batch, MongoDB blocks all read operations. As a result, secondary read queries can never return data that reflect a state that never existed on the primary.
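The grouping idea can be sketched as follows: operations in a batch are partitioned by document id, each partition is applied on its own worker thread, and operations within a partition keep their original order. This is a toy model under stated assumptions, not the server's implementation; the operation shape (`op`, `_id`, `fields`) and the `apply_batch` name are hypothetical, and the sketch does not model the blocking of reads.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def apply_batch(store, batch, workers=4):
    # Partition the batch by document id. Appending in iteration order
    # preserves the original write order within each document's group.
    groups = defaultdict(list)
    for op in batch:
        groups[op["_id"]].append(op)

    def apply_group(ops):
        # Each group touches exactly one document, so groups do not conflict.
        for op in ops:
            if op["op"] == "set":
                store.setdefault(op["_id"], {}).update(op["fields"])
            elif op["op"] == "delete":
                store.pop(op["_id"], None)

    # Apply the groups concurrently; leaving the with-block waits for all
    # of them, so the whole batch is applied before the function returns.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(apply_group, groups.values()))
    return store
```

Because every operation on a given document lands in the same group, the final state of each document matches what sequential application would produce, regardless of how the threads interleave.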
Pre-Fetching Indexes to Improve Replication Throughput
Applies to MMAPv1 only.
With the MMAPv1 storage engine, MongoDB fetches memory pages that hold affected data and indexes to help improve the performance of applying oplog entries. This pre-fetch stage minimizes the amount of time MongoDB holds write locks while applying oplog entries. By default, secondaries pre-fetch all indexes.
Optionally, you can disable all pre-fetching or only pre-fetch the index on the _id field. See the index pre-fetch setting for more information.
In some circumstances, two nodes in a replica set may transiently believe that they are the primary, but at most, only one of them will be able to complete writes with