Navigation

Replica Set Deployment Architectures

The architecture of a replica set affects the set’s capacity and capability. This document provides strategies for replica set deployments and describes common architectures.

The standard replica set deployment for production system is a three-member replica set. These sets provide redundancy and fault tolerance. Avoid complexity when possible, but let your application requirements dictate the architecture.

Strategies

Determine the Number of Members

Add members in a replica set according to these strategies.

Maximum Number of Voting Members

A replica set can have up to 50 members, but only 7 voting members. If the replica set already has 7 voting members, additional members must be non-voting members.

Deploy an Odd Number of Members

Ensure that the replica set has an odd number of voting members. If you have an even number of voting members, deploy an arbiter so that the set has an odd number of voting members.

An arbiter does not store a copy of the data and requires fewer resources. As a result, you may run an arbiter on an application server or other shared process. With no copy of the data, it may be possible to place an arbiter into environments that you would not place other members of the replica set. Consult your security policies.

For the following MongoDB versions, pv1 increases the likelihood of w:1 rollbacks compared to pv0 (no longer supported in MongoDB 4.0+) for replica sets with arbiters:

  • MongoDB 3.4.1
  • MongoDB 3.4.0
  • MongoDB 3.2.11 or earlier

See Replica Set Protocol Version.

Warning

In general, avoid deploying more than one arbiter per replica set.

Consider Fault Tolerance

Fault tolerance for a replica set is the number of members that can become unavailable and still leave enough members in the set to elect a primary. In other words, it is the difference between the number of members in the set and the majority of voting members needed to elect a primary. Without a primary, a replica set cannot accept write operations. Fault tolerance is an effect of replica set size, but the relationship is not direct. See the following table:

Number of Members Majority Required to Elect a New Primary Fault Tolerance
3 2 1
4 3 1
5 3 2
6 4 2

Adding a member to the replica set does not always increase the fault tolerance. However, in these cases, additional members can provide support for dedicated functions, such as backups or reporting.

Use Hidden and Delayed Members for Dedicated Functions

Add hidden or delayed members to support dedicated functions, such as backup or reporting.

Load Balance on Read-Heavy Deployments

In a deployment with very high read traffic, you can improve read throughput by distributing reads to secondary members. As your deployment grows, add or move members to alternate data centers to improve redundancy and availability.

Note

Distributing replica set members across two data centers provides benefit over a single data center. In a two data center distribution,

  • If one of the data centers goes down, the data is still available for reads unlike a single data center distribution.
  • If the data center with a minority of the members goes down, the replica set can still serve write operations as well as read operations.
  • However, if the data center with the majority of the members goes down, the replica set becomes read-only.

If possible, distribute members across at least three data centers. For config server replica sets (CSRS), the best practice is to distribute across three (or more depending on the number of members) centers. If the cost of the third data center is prohibitive, one distribution possibility is to evenly distribute the data bearing members across the two data centers and store the remaining member (either a data bearing member or an arbiter to ensure odd number of members) in the cloud if your company policy allows.

Always ensure that the main facility is able to elect a primary.

Add Capacity Ahead of Demand

The existing members of a replica set must have spare capacity to support adding a new member. Always add new members before the current demand saturates the capacity of the set.

Distribute Members Geographically

To protect your data in case of a data center failure, keep at least one member in an alternate data center. If possible, use an odd number of data centers, and choose a distribution of members that maximizes the likelihood that even with a loss of a data center, the remaining replica set members can form a majority or at minimum, provide a copy of your data.

Note

Distributing replica set members across two data centers provides benefit over a single data center. In a two data center distribution,

  • If one of the data centers goes down, the data is still available for reads unlike a single data center distribution.
  • If the data center with a minority of the members goes down, the replica set can still serve write operations as well as read operations.
  • However, if the data center with the majority of the members goes down, the replica set becomes read-only.

If possible, distribute members across at least three data centers. For config server replica sets (CSRS), the best practice is to distribute across three (or more depending on the number of members) centers. If the cost of the third data center is prohibitive, one distribution possibility is to evenly distribute the data bearing members across the two data centers and store the remaining member (either a data bearing member or an arbiter to ensure odd number of members) in the cloud if your company policy allows.

To ensure that the members in your main data center be elected primary before the members in the alternate data center, set the members[n].priority of the members in the alternate data center to be lower than that of the members in the primary data center.

For more information, see Replica Sets Distributed Across Two or More Data Centers

Target Operations with Tag Sets

Use replica set tag sets to target read operations to specific members or to customize write concern to request acknowledgement from specific members.

Use Journaling to Protect Against Power Failures

MongoDB enables journaling by default. Journaling protects against data loss in the event of service interruptions, such as power failures and unexpected reboots.

Replica Set Naming

If your application connects to more than one replica set, each set should have a distinct name. Some drivers group replica set connections by replica set name.

Deployment Patterns

The following documents describe common replica set deployment patterns. Other patterns are possible and effective depending on the application’s requirements. If needed, combine features of each architecture in your own deployment:

Three Member Replica Sets
Three-member replica sets provide the minimum recommended architecture for a replica set.
Replica Sets Distributed Across Two or More Data Centers
Geographically distributed sets include members in multiple locations to protect against facility-specific failures, such as power outages.