Replica Set Data Synchronization
In order to maintain up-to-date copies of the shared data set, secondary members of a replica set sync or replicate data from other members. MongoDB uses two forms of data synchronization: initial sync to populate new members with the full data set, and replication to apply ongoing changes to the entire data set.
Initial sync copies all the data from one member of the replica set to another member.
When you perform an initial sync, MongoDB:
1. Clones all databases except the local database.
   Changed in version 3.4: Initial sync builds all collection indexes as the documents are copied for each collection. In earlier versions of MongoDB, only the _id indexes are built during this stage.
   Changed in version 3.4: Initial sync pulls newly added oplog records during the data copy. Ensure that the target member has enough disk space in the local database to temporarily store these oplog records for the duration of this data copy stage.
2. Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect the current state of the replica set.
To perform an initial sync, see Resync a Member of a Replica Set.
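The two stages of initial sync can be illustrated with a toy model in plain Python. This is only a sketch under simplifying assumptions, not MongoDB's implementation: the function name initial_sync, the dict-based "collections", and the entry fields op, ns, _id, and doc are all hypothetical stand-ins chosen for illustration.

```python
from copy import deepcopy


def initial_sync(source_data, source_oplog, writes_during_copy):
    """Toy model of initial sync (hypothetical, not MongoDB's code).

    source_data: dict mapping collection name -> {_id: document}
    source_oplog: list of oplog-like entries already on the source
    writes_during_copy: entries the source receives while we copy
    """
    start = len(source_oplog)        # remember where the copy began
    target = deepcopy(source_data)   # stage 1: copy every collection

    # The source keeps taking writes during the copy. The target buffers
    # the new oplog entries, which is why it needs temporary disk space.
    source_oplog.extend(writes_during_copy)
    buffered = source_oplog[start:]

    # Stage 2: apply the buffered entries to catch up with the source.
    for entry in buffered:
        coll = target.setdefault(entry["ns"], {})
        if entry["op"] == "d":
            coll.pop(entry["_id"], None)
        else:  # "i" (insert) and "u" (update) both store the new document
            coll[entry["_id"]] = entry["doc"]
    return target
```

In this model the copy is a single snapshot, whereas a real data copy proceeds collection by collection while writes arrive, so the sketch only conveys the ordering of the stages.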
To recover from transient network or operation failures, initial sync has built-in retry logic.
Changed in version 3.4: MongoDB 3.4 improves the retry logic to be more resilient to intermittent failures on the network.
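The general idea of retrying after a transient failure can be sketched as follows. The policy shown here (attempt count, linear backoff, which exceptions count as transient) is an assumption made for illustration. The text above does not describe MongoDB's actual retry parameters, and with_retries is a hypothetical helper.

```python
import time


def with_retries(operation, max_attempts=3, delay_seconds=0.0,
                 transient=(ConnectionError, TimeoutError)):
    """Run operation(), retrying on transient errors (hypothetical policy)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except transient:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            time.sleep(delay_seconds * attempt)  # simple linear backoff
```

Errors outside the transient tuple are not caught, so a non-recoverable failure stops the operation immediately instead of being retried.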
Secondary members replicate data continuously after the initial sync. Secondary members copy the oplog from their sync from source and apply these operations in an asynchronous process.
Secondaries may automatically change their sync from source as needed based on changes in the ping time and state of other members’ replication.
If a secondary member has members[n].buildIndexes set to true, it can only sync from other members where buildIndexes is true. Members where buildIndexes is false can sync from any other member, barring other sync restrictions. buildIndexes is true by default.
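The buildIndexes restriction, combined with a preference for low ping time, can be modeled as a small filter. This is a sketch only: the Member class, eligible_sources, and pick_source are hypothetical names, and choosing the lowest ping is an assumed simplification of how a secondary actually selects its sync source.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    ping_ms: float
    build_indexes: bool = True  # mirrors the documented default


def eligible_sources(syncing, candidates):
    """Candidates that satisfy the buildIndexes rule for `syncing`."""
    if syncing.build_indexes:
        # An index-building member needs a source that also builds indexes.
        return [m for m in candidates if m.build_indexes]
    return list(candidates)


def pick_source(syncing, candidates):
    """Pick the eligible candidate with the lowest ping (assumed policy)."""
    options = [m for m in eligible_sources(syncing, candidates)
               if m is not syncing]
    return min(options, key=lambda m: m.ping_ms, default=None)
```

Other restrictions mentioned in the text, such as the state of each member's replication, are deliberately left out of this model.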
MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups batches by namespace (MMAPv1) or by document id (WiredTiger) and simultaneously applies each group of operations using a different thread. MongoDB always applies write operations to a given document in their original write order.
While applying a batch, MongoDB blocks all read operations. As a result, secondary read queries can never return data that reflect a state that never existed on the primary.
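Grouping by document id while preserving per-document order can be sketched with a thread pool. This is a simplified illustration, not MongoDB's applier: apply_batch is a hypothetical function, the store is an ordinary dict, and the sketch does not block readers while the batch is applied.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def apply_batch(store, batch, workers=4):
    """Apply a batch of (doc_id, new_value) writes to `store` (a dict)."""
    # Group by document id; each group keeps the original write order.
    groups = defaultdict(list)
    for doc_id, value in batch:
        groups[doc_id].append(value)

    def apply_group(item):
        doc_id, values = item
        for value in values:  # applied in order within one document
            store[doc_id] = value

    # Each group touches a different key, so groups can run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(apply_group, groups.items()))
    return store
```

Because all writes to one document stay in a single group, the final value of each document matches what sequential application would produce, regardless of how the threads are scheduled.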
Pre-Fetching Indexes to Improve Replication Throughput
Applies to MMAPv1 only.
With the MMAPv1 storage engine, MongoDB fetches memory pages that hold affected data and indexes to help improve the performance of applying oplog entries. This pre-fetch stage minimizes the amount of time MongoDB holds write locks while applying oplog entries. By default, secondaries pre-fetch all indexes.
Optionally, you can disable all pre-fetching or only pre-fetch the index on the _id field. See the corresponding pre-fetch setting for more information.