Navigation
This version of the documentation is archived and no longer supported.

FAQ: MongoDB Fundamentals

This document addresses basic high level questions about MongoDB and its use.

If you don’t find the answer you’re looking for, check the complete list of FAQs or post your question to the MongoDB User Mailing List.

What kind of database is MongoDB?

MongoDB is a document-oriented DBMS. Think of MySQL but with JSON-like objects comprising the data model, rather than RDBMS tables. Significantly, MongoDB supports neither joins nor transactions. However, it features secondary indexes, an expressive query language, and atomic writes on a per-document level.

Operationally, MongoDB features master-slave replication with automated failover and built-in horizontal scaling via automated range-based partitioning.

Note

MongoDB uses BSON, a binary object format similar to, but more expressive than JSON.

Do MongoDB databases have tables?

Instead of tables, a MongoDB database stores its data in collections, which are the rough equivalent of RDBMS tables. A collection holds one or more documents. Documents are analogous to records or rows in a relational database table. Each document has one or more fields; fields are similar to the columns in a relational database table.

Do MongoDB databases have schemas?

MongoDB uses dynamic schemas. You can create collections without defining the structure, i.e. the fields or the types of their values, of the documents in the collection. You can change the structure of documents simply by adding new fields or deleting existing ones. Documents in a collection need not have an identical set of fields.

In practice, it is common for the documents in a collection to have a largely homogeneous structure; however, this is not a requirement. MongoDB’s flexible schemas mean that schema migration and augmentation are very easy in practice, and you will rarely, if ever, need to write scripts that perform “alter table” type operations, which simplifies and facilitates iterative software development with MongoDB.

What languages can I use to work with MongoDB?

MongoDB client drivers exist for all of the most popular programming languages, and many other ones. See the latest list of drivers for details.

See also

Drivers.

Does MongoDB support SQL?

No.

However, MongoDB does support a rich, ad-hoc query language of its own.

See also

Operators

What are typical uses for MongoDB?

MongoDB has a general-purpose design, making it appropriate for a large number of use cases. Examples include content management systems, mobile applications, gaming, e-commerce, analytics, archiving, and logging.

Do not use MongoDB for systems that require SQL, joins, and multi-object transactions.

Does MongoDB support transactions?

MongoDB does not support multi-document transactions. However, MongoDB does provide atomic operations on a single document.

For more details on MongoDB’s isolation guarantees and behavior under concurrency, see FAQ: Concurrency.

Does MongoDB require a lot of RAM?

MMAPv1

Not necessarily. It’s certainly possible to run MongoDB on a machine with a small amount of free RAM.

MongoDB automatically uses all free memory on the machine as its cache. System resource monitors show that MongoDB uses a lot of memory, but its usage is dynamic. If another process suddenly needs half the server’s RAM, MongoDB will yield cached memory to the other process.

Technically, the operating system’s virtual memory subsystem manages MongoDB’s memory. This means that MongoDB will use as much free memory as it can, swapping to disk as needed. Deployments with enough memory to fit the application’s working data set in RAM will achieve the best performance.

WiredTiger

With WiredTiger, MongoDB utilizes both filesystem cache and WiredTiger cache. By default, the WiredTiger cache will use either 1GB or half of the installed physical RAM, whichever is larger.

MongoDB also automatically uses all free memory on the machine via the filesystem cache (data in the filesystem cache is compressed).

How do I configure the cache size?

MMAPv1
MongoDB has no configurable cache for the MMAPv1 storage engine. MongoDB uses all free memory on the system automatically by way of memory-mapped files. Operating systems use the same approach with their file system caches.
WiredTiger

For the WiredTiger storage engine storage engine, you can specify the maximum size of the cache that WiredTiger will use for all data. See storage.wiredTiger.engineConfig.cacheSizeGB and --wiredTigerCacheSizeGB.

The size of the cache is tunable through the storage.wiredTiger.engineConfig.cacheSizeGB setting. If the cache does not have enough space to load additional data, WiredTiger evicts pages from the cache to free up space.

Note

The storage.wiredTiger.engineConfig.cacheSizeGB only limits the size of the WiredTiger cache, not the total amount of memory used by mongod. The WiredTiger cache is only one component of the RAM used by MongoDB. MongoDB also automatically uses all free memory on the machine via the filesystem cache (data in the filesystem cache is compressed).

In addition, the operating system will use any free RAM to buffer filesystem blocks.

To accommodate the additional consumers of RAM, you may have to decrease WiredTiger cache size. Avoid increasing the WiredTiger cache size above its default value.

The default WiredTiger cache size value assumes that there is a single mongod instance per node. If a single node contains multiple instances, then you should decrease the setting to accommodate the other mongod instances.

If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container.

Does MongoDB require a separate caching layer for application-level caching?

No. In MongoDB, a document’s representation in the database is similar to its representation in application memory. This means the database already stores the usable form of data, making the data usable in both the persistent store and in the application cache. This eliminates the need for a separate caching layer in the application.

This differs from relational databases, where caching data is more expensive. Relational databases must transform data into object representations that applications can read and must store the transformed data in a separate cache: if these transformation from data to application objects require joins, this process increases the overhead related to using the database which increases the importance of the caching layer.

Does MongoDB handle caching?

Yes. MongoDB keeps all of the most recently used data in RAM. If you have created indexes for your queries and your working data set fits in RAM, MongoDB serves all queries from memory.

MongoDB does not implement a query cache; i.e. MongoDB does not cache the query results in order to return the cached results for identical queries.

Are writes written to disk immediately or lazily?

MMAPv1

In the default configuration for the MMAPv1 storage engine, MongoDB writes to the data files on disk every 60 seconds and writes to the journal files roughly every 100 milliseconds.

To change the interval for writing to the data files, use the storage.syncPeriodSecs setting. For the journal files, see storage.journal.commitIntervalMs setting.

These values represent the maximum amount of time between the completion of a write operation and when MongoDB writes to the data files or to the journal files. In many cases MongoDB and the operating system flush data to disk more frequently, so that the above values represents a theoretical maximum.

While MongoDB writes to journal files promptly, MongoDB writes to the data files lazily. MongoDB may wait to write data to the data files for as much as one minute by default. This does not affect durability, as the journal has enough information to ensure crash recovery.

This “lazy” strategy provides advantages in situations where the database receives a thousand increments to an object within one second. With this “lazy” strategy, MongoDB only needs to flush this data to disk once. To modify this strategy, you can use fsync and Write Concern Reference as well as modify the aforementioned configuration options.

WiredTiger

For the data files, MongoDB creates checkpoints (i.e. write the snapshot data to disk) at intervals of 60 seconds or 2 gigabytes of data to write, depending on which occurs first. For the journal data,

  • WiredTiger sets checkpoints for journal data at intervals of 60 seconds or 2 GB of data, depending on which occurs first.
  • Because MongoDB uses a log file size limit of 100 MB, WiredTiger creates a new journal file approximately every 100MB of data. When WiredTiger creates a new journal file, WiredTiger syncs the previous journal file.
  • If the write operation includes a write concern of j:true, WiredTiger forces a sync on commit of that operation as well as anything that has happened before.

What language is MongoDB written in?

MongoDB is implemented in C++. Drivers and client libraries are typically written in their respective languages, although some drivers use C extensions for better performance.

What are the limitations of 32-bit versions of MongoDB?

Changed in version 3.0: Commercial support is no longer provided for MongoDB on 32-bit platforms (Linux and Windows). See Platform Support.

32-bit versions of MongoDB do not support the WiredTiger storage engine.

When running a 32-bit build of MongoDB, the total storage size for the server, including data and indexes, is 2 gigabytes. For this reason, do not deploy MongoDB to production on 32-bit machines.

If you’re running a 64-bit build of MongoDB, there’s virtually no limit to storage size. For production deployments, 64-bit builds and operating systems are strongly recommended.

Note

32-bit builds disable journaling by default because journaling further limits the maximum amount of data that the database can store.