Journaling

To provide durability in the event of a failure, MongoDB uses write-ahead logging to on-disk journal files.

Journaling and the WiredTiger Storage Engine

Important

The log mentioned in this section refers to the WiredTiger write-ahead log (i.e. the journal) and not the MongoDB log file.

WiredTiger uses checkpoints to provide a consistent view of data on disk and allow MongoDB to recover from the last checkpoint. However, if MongoDB exits unexpectedly in between checkpoints, journaling is required to recover the writes that occurred after the last checkpoint.

Note

Starting in MongoDB 4.0, you cannot specify the --nojournal option or set storage.journal.enabled: false for replica set members that use the WiredTiger storage engine.

With journaling, the recovery process:

  1. Looks in the data files to find the identifier of the last checkpoint.
  2. Searches in the journal files for the record that matches the identifier of the last checkpoint.
  3. Applies the operations in the journal files since the last checkpoint.
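
The following sketch illustrates this recovery sequence. It is illustrative pseudocode only; the helper functions (readLastCheckpointId, findRecord, recordsAfter, applyOperation) are hypothetical names, not WiredTiger's actual API:

   // Illustrative pseudocode; all helper functions are hypothetical.
   function recover(dataFiles, journalFiles) {
     // 1. Find the identifier of the last checkpoint in the data files.
     const checkpointId = readLastCheckpointId(dataFiles);
     // 2. Locate the journal record that matches that checkpoint identifier.
     const checkpointRecord = findRecord(journalFiles, checkpointId);
     // 3. Replay the operations recorded after that checkpoint.
     for (const record of recordsAfter(journalFiles, checkpointRecord)) {
       applyOperation(record);
     }
   }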

Journaling Process

Changed in version 3.2.

With journaling, WiredTiger creates one journal record for each client-initiated write operation. The journal record includes any internal write operations caused by the initial write. For example, an update to a document in a collection may result in modifications to the indexes; WiredTiger creates a single journal record that includes both the update operation and its associated index modifications.
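
For example, assuming a products collection with an index on the sku field (both names are illustrative), the following mongo shell update modifies an indexed field, so WiredTiger folds the resulting index change into the same journal record as the document update:

   // Illustrative names; the index on sku makes this update touch both
   // the document and the index, yielding a single journal record.
   db.products.createIndex( { sku: 1 } )
   db.products.updateOne( { _id: 1 }, { $set: { sku: "ABC-123" } } )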

MongoDB configures WiredTiger to use in-memory buffering for storing the journal records. Threads coordinate to allocate and copy into their portion of the buffer. All journal records up to 128 kB are buffered.

WiredTiger syncs the buffered journal records to disk according to the following intervals or conditions:

  • New in version 3.2: Every 50 milliseconds.

  • Changed in version 3.6: MongoDB configures WiredTiger to create checkpoints of user data at an interval of 60 seconds.

  • If the write operation includes a write concern of j: true, WiredTiger forces a sync of the WiredTiger journal files (see the example after this list).

  • Because MongoDB uses a journal file size limit of 100 MB, WiredTiger creates a new journal file approximately every 100 MB of data. When WiredTiger creates a new journal file, WiredTiger syncs the previous journal file.
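
For example, the following mongo shell insert requests journal acknowledgment, which forces WiredTiger to sync the journal before acknowledging the write (the orders collection name is illustrative):

   // j: true forces a sync of the WiredTiger journal files before the
   // write is acknowledged; the collection name is illustrative.
   db.orders.insertOne(
      { item: "journal-test", qty: 1 },
      { writeConcern: { w: 1, j: true } }
   )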

Important

In between write operations, while the journal records remain in the WiredTiger buffers, updates can be lost following a hard shutdown of mongod.

See also

The serverStatus command returns information on the WiredTiger journal statistics in the wiredTiger.log field.
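
For example, the following mongo shell invocation returns only the journal statistics:

   // Return the WiredTiger journal statistics from serverStatus.
   db.serverStatus().wiredTiger.log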

Journal Files

For the journal files, MongoDB creates a subdirectory named journal under the dbPath directory. WiredTiger journal files have names of the following format: WiredTigerLog.<sequence>, where <sequence> is a zero-padded number starting from 0000000001.
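
For example, with a dbPath of /data/db (a hypothetical path), the journal subdirectory might contain files such as:

   /data/db/journal/WiredTigerLog.0000000001
   /data/db/journal/WiredTigerLog.0000000002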

Journal files contain a record for each write operation. Each record has a unique identifier.

MongoDB configures WiredTiger to use snappy compression for the journaling data.

The minimum log record size for WiredTiger is 128 bytes. If a log record is 128 bytes or smaller, WiredTiger does not compress that record.

WiredTiger journal files for MongoDB have a maximum size limit of approximately 100 MB. Once the file exceeds that limit, WiredTiger creates a new journal file.

WiredTiger automatically removes old journal files to maintain only the files needed to recover from the last checkpoint.

WiredTiger pre-allocates journal files.

Journaling and the In-Memory Storage Engine

Starting in MongoDB Enterprise version 3.2.6, the In-Memory Storage Engine is part of general availability (GA). Because its data is kept in memory, there is no separate journal. Write operations with a write concern of j: true are immediately acknowledged.

If any voting member of a replica set uses the in-memory storage engine, you must set writeConcernMajorityJournalDefault to false.

With writeConcernMajorityJournalDefault set to false, MongoDB does not wait for w: "majority" writes to be written to the on-disk journal before acknowledging the writes. As such, majority write operations could possibly roll back in the event of a transient loss (e.g. crash and restart) of a majority of nodes in a given replica set.
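
For example, the following mongo shell session, run against the primary, disables the setting for an existing replica set:

   // Fetch the current replica set configuration, disable the
   // majority-journal requirement, and apply the new configuration.
   cfg = rs.conf()
   cfg.writeConcernMajorityJournalDefault = false
   rs.reconfig(cfg)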