Navigation

Production Considerations

The following page lists some production considerations for running transactions. These apply whether you run transactions on replica sets or sharded clusters. For running transactions on sharded clusters, see also the Production Considerations (Sharded Clusters) for additional considerations that are specific to sharded clusters.

Availability

  • In version 4.0, MongoDB supports multi-document transactions on replica sets.

  • In version 4.2, MongoDB introduces distributed transactions, which adds support for multi-document transactions on sharded clusters and incorporates the existing support for multi-document transactions on replica sets.

    To use transactions on MongoDB 4.2 deployments(replica sets and sharded clusters), clients must use MongoDB drivers updated for MongoDB 4.2.

Distributed Transactions and Multi-Document Transactions

Starting in MongoDB 4.2, the two terms are synonymous. Distributed transactions refer to multi-document transactions on sharded clusters and replica sets. Multi-document transactions (whether on sharded clusters or replica sets) are also known as distributed transactions starting in MongoDB 4.2.

Feature Compatibility

To use transactions, the featureCompatibilityVersion for all members of the deployment must be at least:

Deployment Minimum featureCompatibilityVersion
Replica Set 4.0
Sharded Cluster 4.2

To check the fCV for a member, connect to the member and run the following command:

db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } )

For more information, see the setFeatureCompatibilityVersion reference page.

Runtime Limit

By default, a transaction must have a runtime of less than one minute. You can modify this limit using transactionLifetimeLimitSeconds for the mongod instances. For sharded clusters, the parameter must be modified for all shard replica set members. Transactions that exceeds this limit are considered expired and will be aborted by a periodic cleanup process.

For sharded clusters, you can also specify a maxTimeMS limit on commitTransaction. For more information, see Sharded Clusters Transactions Time Limit.

Oplog Size Limit

Starting in version 4.2,
MongoDB creates as many oplog entries as necessary to the encapsulate all write operations in a transaction, instead of a single entry for all write operations in the transaction. This removes the 16MB total size limit for a transaction imposed by the single oplog entry for all its write operations. Although the total size limit is removed, each oplog entry still must be within the BSON document size limit of 16MB.
In version 4.0,
MongoDB creates a single oplog (operations log) entry at the time of commit if the transaction contains any write operations. That is, the individual operations in the transactions do not have a corresponding oplog entry. Instead, a single oplog entry contains all of the write operations within a transaction. The oplog entry for the transaction must be within the BSON document size limit of 16MB.

WiredTiger Cache

To prevent storage cache pressure from negatively impacting the performance:

  • When you abandon a transaction, abort the transaction.
  • When you encounter an error during individual operation in the transaction, abort and retry the transaction.

The transactionLifetimeLimitSeconds also ensures that expired transactions are aborted periodically to relieve storage cache pressure.

Transactions and Security

Shard Configuration Restriction

You cannot run transactions on a sharded cluster that has a shard with writeConcernMajorityJournalDefault set to false (such as a shard with a voting member that uses the in-memory storage engine).

Sharded Clusters and Arbiters

Transactions whose write operations span multiple shards will error and abort if any transaction operation reads from or writes to a shard that contains an arbiter.

See also 3-Member Primary-Secondary-Arbiter Architecture for transaction restrictions on shards that have disabled read concern majority.

3-Member Primary-Secondary-Arbiter Architecture

For a three-member replica set with a primary-secondary-arbiter (PSA) architecture or a sharded cluster with a three-member PSA shards, you may have disabled read concern “majority” to avoid cache pressure.

On sharded clusters,
  • If a transaction involves a shard that has disabled read concern “majority”, you cannot use read concern "snapshot" for the transaction. You can only use read concern "local" or "majority" for the transaction. If you use read concern "snapshot", the transaction errors and aborts.

    readConcern level 'snapshot' is not supported in sharded clusters when enableMajorityReadConcern=false.
    
  • Transactions whose write operations span multiple shards will error and abort if any of the transaction’s read or write operations involves a shard that has disabled read concern "majority".

On replica set,

You can specify read concern "local" or "majority" or "snapshot" even in the replica set has disabled read concern “majority”.

However, if you are planning to transition to a sharded cluster with disabled read concern majority shards, you may wish to avoid using read concern "snapshot".

Tip

To check if read concern “majority” is disabled, You can run db.serverStatus() on the mongod instances and check the storageEngine.supportsCommittedReads field. If false, read concern “majority” is disabled.

Acquiring Locks

By default, transactions wait up to 5 milliseconds to acquire locks required by the operations in the transaction. If the transaction cannot acquire its required locks within the 5 milliseconds, the transaction aborts.

Transactions release all locks upon abort or commit.

Tip

When creating or dropping a collection immediately before starting a transaction, if the collection is accessed within the transaction, issue the create or drop operation with write concern "majority" to ensure that the transaction can acquire the required locks.

Lock Request Timeout

You can use the maxTransactionLockRequestTimeoutMillis parameter to adjust how long transactions wait to acquire locks. Increasing maxTransactionLockRequestTimeoutMillis allows operations in the transactions to wait the specified time to acquire the required locks. This can help obviate transaction aborts on momentary concurrent lock acquisitions, like fast-running metadata operations. However, this could possibly delay the abort of deadlocked transaction operations.

You can also use operation-specific timeout by setting maxTransactionLockRequestTimeoutMillis to -1.

Pending DDL Operations and Transactions

Multi-document transactions acquire exclusive locks on the collections which they access. New DDL operations that require locks on those collections or their parent databases must wait until the transaction releases its locks. While these pending DDL operations exist, new transactions that access the same collection(s) as the pending DDL operations cannot obtain the required locks and and will abort after waiting maxTransactionLockRequestTimeoutMillis. In addition, new non-transaction operations that access the same collection(s) will block until they reach their maxTimeMS limit.

Consider the following scenarios:

DDL Operation That Requires a Collection Lock

While an in-progress transaction is performing various CRUD operations on the employees collection in the hr database, an administrator issues the db.collection.createIndex() DDL operation against the employees collection. createIndex() requires an exclusive collection lock on the collection.

Until the in-progress transaction completes, the createIndex() operation must wait to obtain the lock. Any new transaction that affects the employees collection and starts while the createIndex() is pending must wait until after createIndex() completes.

The pending createIndex() DDL operation does not affect transactions on other collections in the hr database. For example, a new transaction on the contractors collection in the hr database can start and complete as normal.

DDL Operation That Requires a Database Lock

While an in-progress transaction is performing various CRUD operations on the employees collection in the hr database, an administrator issues the collMod DDL operation against the contractors collection in the same database. collMod requires a database lock on the parent hr database.

Until the in-progress transaction completes, the collMod operation must wait to obtain the lock. Any new transaction that affects the hr database or any of its collections and starts while the collMod is pending must wait until after collMod completes.

In either scenario, if the DDL operation remains pending for more than maxTransactionLockRequestTimeoutMillis, pending transactions waiting behind that operation abort. That is, the value of maxTransactionLockRequestTimeoutMillis must at least cover the time required for the in-progress transaction and the pending DDL operation to complete.

In-progress Transactions and Write Conflicts

If a transaction is in progress and a write outside the transaction modifies a document that an operation in the transaction later tries to modify, the transaction aborts because of a write conflict.

If a transaction is in progress and has taken a lock to modify a document, when a write outside the transaction tries to modify the same document, the write waits until the transaction ends.

In-progress Transactions and Stale Reads

Read operations inside a transaction can return stale data. That is, read operations inside a transaction are not guaranteed to see writes performed by other committed transactions or non-transactional writes. For example, consider the following sequence: 1) a transaction is in-progress 2) a write outside the transaction deletes a document 3) a read operation inside the transaction is able to read the now-deleted document since the operation is using a snapshot from before the write.

To avoid stale reads inside transactions for a single document, you can use the db.collection.findOneAndUpdate() method. For example:

session.startTransaction( { readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } } );

employeesCollection = session.getDatabase("hr").employees;

employeeDoc = employeesCollection.findOneAndUpdate(
   { _id: 1, employee: 1, status: "Active" },
   { $set: { employee: 1 } },
   { returnNewDocument: true }
);
  • If the employee document has changed outside the transaction, then the transaction aborts.
  • If the employee document has not changed, the transaction returns the document and locks the document.

In-progress Transactions and Chunk Migration

Chunk migration acquires exclusive collection locks during certain stages.

If an ongoing transaction has a lock on a collection and a chunk migration that involves that collection starts, these migration stages must wait for the transaction to release the locks on the collection, thereby impacting the performance of chunk migrations.

If a chunk migration interleaves with a transaction (for instance, if a transaction starts while a chunk migration is already in progress and the migration completes before the transaction takes a lock on the collection), the transaction errors during the commit and aborts.

Depending on how the two operations interleave, some sample errors include (the error messages have been abbreviated):

  • an error from cluster data placement change ... migration commit in progress for <namespace>
  • Cannot find shardId the chunk belonged to at cluster time ...

Outside Reads During Commit

During the commit for a transaction, outside read operations may try to read the same documents that will be modified by the transaction. If the transaction writes to multiple shards, then during the commit attempt across the shards

  • Outside reads that use read concern snapshot or "linearizable", or are part of causally consistent sessions (i.e. include afterClusterTime) wait for all writes of a transaction to be visible.
  • Outside reads using other read concerns do not wait for all writes of a transaction to be visible but instead read the before-transaction version of the documents available.

Errors

Use of MongoDB 4.0 Drivers

To use transactions on MongoDB 4.2 deployments(replica sets and sharded clusters), clients must use MongoDB drivers updated for MongoDB 4.2.

On sharded clusters with multiple mongos instances, performing transactions with drivers updated for MongoDB 4.0 (instead of MongoDB 4.2) will fail and can result in errors, including:

Note

Your driver may return a different error. Refer to your driver’s documentation for details.

Error Code Error Message
251 cannot continue txnId -1 for session ... with txnId 1
50940 cannot commit with no participants

Additional Information