Monitoring for MongoDB¶
Monitoring is a critical component of all database administration. A firm grasp of MongoDB’s reporting will allow you to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDB’s normal operational parameters will allow you to diagnose problems before they escalate to failures.
This document presents an overview of the available monitoring utilities and the reporting statistics available in MongoDB. It also introduces diagnostic strategies and suggestions for monitoring replica sets and sharded clusters.
Note
MongoDB Cloud Manager is a hosted monitoring service which collects and aggregates diagnostic data to provide insight into the performance and operation of MongoDB deployments. See MongoDB Cloud Manager and the MongoDB Cloud Manager documentation for more information.
Monitoring Strategies¶
There are three methods for collecting data about the state of a running MongoDB instance:
- First, there is a set of utilities distributed with MongoDB that provides real-time reporting of database activities.
- Second, database commands return statistics regarding the current database state with greater fidelity.
- Third, MongoDB Cloud Manager collects data from running MongoDB deployments and provides visualization and alerts based on that data.
Each strategy can help answer different questions and is useful in different contexts. These methods are complementary.
MongoDB Reporting Tools¶
This section provides an overview of the reporting methods distributed with MongoDB. It also offers examples of the kinds of questions that each method is best suited to help you address.
Utilities¶
The MongoDB distribution includes a number of utilities that quickly return statistics about instances’ performance and activity. Typically, these are most useful for diagnosing issues and assessing normal operation.
mongostat¶
mongostat captures and returns the counts of database operations by type (e.g. insert, query, update, delete, etc.). These counts report on the load distribution on the server.
Use mongostat to understand the distribution of operation types and to inform capacity planning. See the mongostat manual for details.
mongotop¶
mongotop tracks and reports the current read and write activity of a MongoDB instance, and reports these statistics on a per-collection basis.
Use mongotop to check if your database activity and use match your expectations. See the mongotop manual for details.
REST Interface¶
MongoDB provides a simple REST interface that can be useful for configuring monitoring and alert scripts, and for other administrative tasks.
To enable, configure mongod to use REST, either by starting mongod with the --rest option, or by setting the rest setting to true in a configuration file.
For more information on using the REST Interface, see the Simple REST Interface documentation.
HTTP Console¶
MongoDB provides a web interface that exposes diagnostic and monitoring information in a simple web page. The web interface is accessible at localhost:<port>, where the <port> number is 1000 more than the mongod port.
For example, if a locally running mongod is using the default port 27017, access the HTTP console at http://localhost:28017.
Commands¶
MongoDB includes a number of commands that report on the state of the database.
These data may provide a finer level of granularity than the utilities
discussed above. Consider using their output in scripts and programs to
develop custom alerts, or to modify the behavior of your application in
response to the activity of your instance. The db.currentOp
method is another useful tool for identifying the database instance’s
in-progress operations.
serverStatus¶
The serverStatus command, or db.serverStatus() from the shell, returns a general overview of the status of the database, detailing disk usage, memory use, connections, journaling, and index access. The command returns quickly and does not impact MongoDB performance.
serverStatus outputs an account of the state of a MongoDB instance. This command is rarely run directly. In most cases, the data is more meaningful when aggregated, as one would see with monitoring tools including MongoDB Cloud Manager. Nevertheless, all administrators should be familiar with the data provided by serverStatus.
dbStats¶
The dbStats command, or db.stats() from the shell, returns a document that addresses storage use and data volumes. The dbStats output reflects the amount of storage used, the quantity of data contained in the database, and object, collection, and index counters.
Use this data to monitor the state and storage capacity of a specific database. This output also allows you to compare use between databases and to determine the average document size in a database.
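As a minimal sketch of the average document size calculation mentioned above, the following JavaScript works on a plain object shaped like dbStats output. The sample values are invented, averageDocumentSize is a hypothetical helper rather than part of MongoDB, and the sketch assumes dataSize is reported in bytes.

```javascript
// Hypothetical sample shaped like dbStats output; all values are invented.
const sampleDbStats = {
  db: "inventory",
  collections: 3,
  objects: 1200,        // number of documents in the database
  dataSize: 614400,     // assumed to be in bytes
  storageSize: 1048576,
  indexes: 4,
  indexSize: 81920
};

// Average document size is the total data size divided by the document count.
function averageDocumentSize(stats) {
  if (!stats.objects) {
    return 0; // avoid dividing by zero for an empty database
  }
  return stats.dataSize / stats.objects;
}

console.log(averageDocumentSize(sampleDbStats)); // prints 512
```

In the mongo shell, the same function could be applied to the document returned by db.stats() for each database you want to compare.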
collStats¶
The collStats command provides statistics that resemble dbStats on the collection level, including a count of the objects in the collection, the size of the collection, the amount of disk space used by the collection, and information about its indexes.
replSetGetStatus¶
The replSetGetStatus command (rs.status() from the shell) returns an overview of your replica set’s status. The replSetGetStatus document details the state and configuration of the replica set and statistics about its members.
Use this data to ensure that replication is properly configured, and to check the connections between the current host and the other members of the replica set.
Third Party Tools¶
A number of third party monitoring tools have support for MongoDB, either directly, or through their own plugins.
Self Hosted Monitoring Tools¶
These are monitoring tools that you must install, configure and maintain on your own servers. Most are open source.
Tool | Plugin | Description |
---|---|---|
Ganglia | mongodb-ganglia | Python script to report operations per second, memory usage, btree statistics, master/slave status and current connections. |
Ganglia | gmond_python_modules | Parses output from the serverStatus and replSetGetStatus commands. |
Motop | None | Realtime monitoring tool for MongoDB servers. Shows current operations ordered by durations every second. |
mtop | None | A top like tool. |
Munin | mongo-munin | Retrieves server statistics. |
Munin | mongomon | Retrieves collection statistics (sizes, index sizes, and each (configured) collection count for one DB). |
Munin | munin-plugins Ubuntu PPA | Some additional munin plugins not in the main distribution. |
Nagios | nagios-plugin-mongodb | A simple Nagios check script, written in Python. |
Also consider dex, an index and query analyzing tool for MongoDB that compares MongoDB log files and indexes to make indexing recommendations.
Hosted (SaaS) Monitoring Tools¶
These are monitoring tools provided as a hosted service, usually through a paid subscription.
Name | Notes |
---|---|
MongoDB Cloud Manager | MongoDB Cloud Manager is a cloud-based suite of services for managing MongoDB deployments. MongoDB Cloud Manager provides monitoring and backup functionality. |
Scout | Several plugins, including MongoDB Monitoring, MongoDB Slow Queries, and MongoDB Replica Set Monitoring. |
Server Density | Dashboard for MongoDB, MongoDB specific alerts, replication failover timeline and iPhone, iPad and Android mobile apps. |
Process Logging¶
During normal operation, mongod and mongos instances report a live account of all server activity and operations to either standard output or a log file. The following runtime settings control these options.
- quiet. Limits the amount of information written to the log or output.
- verbose. Increases the amount of information written to the log or output. You can also specify this as v (as in -v). For higher levels of verbosity, set multiple v, as in vvvv = True. You can also change the verbosity of a running mongod or mongos instance with the setParameter command.
- logpath. Enables logging to a file, rather than the standard output. You must specify the full path to the log file when adjusting this setting.
- logappend. Adds information to a log file instead of overwriting the file.
Note
You can specify these configuration options as command-line arguments to mongod or mongos.
For example:
mongod -v --logpath /var/log/mongodb/server1.log --logappend
Starts a mongod instance in verbose mode, appending data to the log file at /var/log/mongodb/server1.log.
The following database commands also affect logging:
- getLog. Displays recent messages from the mongod process log.
- logRotate. Rotates the log files for mongod processes only. See Rotate Log Files.
Diagnosing Performance Issues¶
Degraded performance in MongoDB is typically a function of the relationship between the quantity of data stored in the database, the amount of system RAM, the number of connections to the database, and the amount of time the database spends in a locked state.
In some cases, performance issues may be transient and related to traffic load, data access patterns, or the availability of hardware on the host system for virtualized environments. Some users also experience performance limitations as a result of inadequate or inappropriate indexing strategies, or as a consequence of poor schema design patterns. In other situations, performance issues may indicate that the database is operating at capacity and that it is time to add capacity.
The following are some causes of degraded performance in MongoDB.
Locks¶
MongoDB uses a locking system to ensure data set validity. However, if certain operations are long-running, or a queue forms, performance will slow as requests and operations wait for the lock. Lock-related slowdowns can be intermittent. To see if the lock has been affecting your performance, look to the data in the globalLock section of the serverStatus output. If globalLock.currentQueue.total is consistently high, then there is a chance that a large number of requests are waiting for a lock. This indicates a possible concurrency issue that may be affecting performance.
If globalLock.totalTime is high relative to uptime, the database has existed in a lock state for a significant amount of time.
Long queries are often the result of a number of factors: ineffective use of indexes, non-optimal schema design, poor query structure, system architecture issues, or insufficient RAM resulting in page faults and disk reads.
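The two lock checks described in this section can be sketched as follows. This is a minimal illustration on plain objects shaped like serverStatus output. The sample values and both helper names are hypothetical, and the sketch assumes globalLock.totalTime is reported in microseconds and uptime in seconds. Because the text asks whether the queue is consistently high, the first helper looks at several samples rather than one.

```javascript
// Hypothetical serverStatus samples taken a few seconds apart; values are invented.
const lockSamples = [
  { uptime: 1000, globalLock: { totalTime: 250000000, currentQueue: { total: 12, readers: 8, writers: 4 } } },
  { uptime: 1005, globalLock: { totalTime: 251000000, currentQueue: { total: 15, readers: 9, writers: 6 } } },
  { uptime: 1010, globalLock: { totalTime: 252000000, currentQueue: { total: 11, readers: 7, writers: 4 } } }
];

// True only when every sample shows more queued operations than the threshold.
function queueConsistentlyHigh(statusSamples, threshold) {
  return statusSamples.length > 0 &&
    statusSamples.every((s) => s.globalLock.currentQueue.total > threshold);
}

// Ratio of globalLock.totalTime (assumed microseconds) to uptime (assumed seconds).
function lockTimeToUptimeRatio(status) {
  const totalTimeSeconds = status.globalLock.totalTime / 1e6;
  return totalTimeSeconds / status.uptime;
}

console.log(queueConsistentlyHigh(lockSamples, 10)); // prints true
console.log(lockTimeToUptimeRatio(lockSamples[0]));  // prints 0.25
```

The threshold of 10 is arbitrary; a suitable value depends on what is normal for your deployment.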
Memory Usage¶
MongoDB uses memory mapped files to store data. Given a data set of sufficient size, the MongoDB process will allocate all available memory on the system for its use. While this is part of the design, and affords MongoDB superior performance, the memory mapped files make it difficult to determine if the amount of RAM is sufficient for the data set.
The memory usage metrics in the serverStatus output can provide insight into MongoDB’s memory use. Check the resident memory use (i.e. mem.resident): if this exceeds the amount of system memory and there is a significant amount of data on disk that isn’t in RAM, you may have exceeded the capacity of your system.
You should also check the amount of mapped memory (i.e. mem.mapped). If this value is greater than the amount of system memory, some operations will require page faults to read data from disk, which negatively affects performance.
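A minimal sketch of these two comparisons, using a plain object shaped like the mem section of serverStatus. The sample values and the memoryPressure helper are hypothetical, and the sketch assumes mem.resident and mem.mapped are reported in megabytes, so the system memory figure must use the same unit.

```javascript
// Hypothetical sample shaped like the mem section of serverStatus; values are invented.
const memSample = {
  mem: { bits: 64, resident: 3500, virtual: 18000, mapped: 8200 } // assumed megabytes
};

// Compare resident and mapped memory against the system's physical memory.
function memoryPressure(status, systemMemoryMB) {
  return {
    residentExceedsRam: status.mem.resident > systemMemoryMB,
    mappedExceedsRam: status.mem.mapped > systemMemoryMB
  };
}

const memReport = memoryPressure(memSample, 4096);
console.log(memReport.residentExceedsRam); // prints false
console.log(memReport.mappedExceedsRam);   // prints true
```

Here the mapped data is larger than RAM, so some reads may need to go to disk even though resident memory still fits.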
Page Faults¶
A page fault occurs when MongoDB requires data not located in physical memory, and must read from virtual memory. To check for page faults, see the extra_info.page_faults value in the serverStatus output. This data is only available on Linux systems.
A single page fault completes quickly and is not problematic. However, in aggregate, large volumes of page faults typically indicate that MongoDB is reading too much data from disk. In many situations, MongoDB’s read locks will “yield” after a page fault to allow other processes to read and avoid blocking while waiting for the next page to read into memory. This approach improves concurrency, and also improves overall throughput in high volume systems.
Increasing the amount of RAM accessible to MongoDB may help reduce the number of page faults. If this is not possible, you may want to consider deploying a sharded cluster and/or adding shards to your deployment to distribute load among mongod instances.
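Because the page fault counter only matters in aggregate, one way to watch it is to compare two serverStatus samples and compute a rate. The following is a minimal sketch on invented sample objects. The pageFaultsPerSecond helper is hypothetical, and it assumes extra_info.page_faults is a cumulative counter and uptime is in seconds.

```javascript
// Two hypothetical serverStatus samples taken 60 seconds apart; values are invented.
const earlierSample = { uptime: 100, extra_info: { page_faults: 5000 } };
const laterSample = { uptime: 160, extra_info: { page_faults: 5300 } };

// Page faults per second between two samples of the cumulative counter.
function pageFaultsPerSecond(first, second) {
  const elapsedSeconds = second.uptime - first.uptime;
  if (elapsedSeconds <= 0) {
    return 0; // samples out of order, or the server restarted in between
  }
  return (second.extra_info.page_faults - first.extra_info.page_faults) / elapsedSeconds;
}

console.log(pageFaultsPerSecond(earlierSample, laterSample)); // prints 5
```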
Number of Connections¶
In some cases, the number of connections between the application layer (i.e. clients) and the database can overwhelm the ability of the server to handle requests. This can produce performance irregularities. The following fields in the serverStatus document can provide insight:
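- globalLock.activeClients contains a counter of the total number of clients with active operations in progress or queued.
- connections is a container for the following two fields:
  - current, the total number of current clients that connect to the database instance.
  - available, the total number of unused connections available for new clients.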
Note
Unless constrained by system-wide limits, MongoDB has a hard connection limit of 20,000 connections. You can modify system limits using the ulimit command, or by editing your system’s /etc/sysctl.conf file.
If requests are high because there are numerous concurrent application requests, the database may have trouble keeping up with demand. If this is the case, then you will need to increase the capacity of your deployment. For read-heavy applications, increase the size of your replica set and distribute read operations to secondary members. For write-heavy applications, deploy sharding and add one or more shards to a sharded cluster to distribute load among mongod instances.
Spikes in the number of connections can also be the result of application or driver errors. All of the officially supported MongoDB drivers implement connection pooling, which allows clients to use and reuse connections more efficiently. An extremely high number of connections, particularly without a corresponding workload, is often indicative of a driver or other configuration error.
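The connection fields discussed in this section can be combined into a simple report. This is a minimal sketch on a plain object shaped like serverStatus output. The sample values, the connectionReport helper, and the idle ratio threshold are all hypothetical, and the sketch assumes connections.current and connections.available together approximate the connection capacity.

```javascript
// Hypothetical sample shaped like serverStatus output; values are invented.
const connSample = {
  connections: { current: 250, available: 750 },
  globalLock: { activeClients: { total: 5, readers: 3, writers: 2 } }
};

// Summarize connection use and flag many connections with little active work.
function connectionReport(status, idleRatioThreshold) {
  const { current, available } = status.connections;
  const capacity = current + available;
  const utilization = capacity > 0 ? current / capacity : 0;
  const activeRatio = current > 0 ? status.globalLock.activeClients.total / current : 0;
  return {
    utilization,  // share of the connection capacity in use
    activeRatio,  // share of connections with operations in progress or queued
    mostlyIdle: current > 0 && activeRatio < idleRatioThreshold
  };
}

const connReport = connectionReport(connSample, 0.1);
console.log(connReport.utilization); // prints 0.25
console.log(connReport.mostlyIdle);  // prints true
```

A mostlyIdle result on a server with many connections is the pattern the text associates with driver or configuration errors, though it is only a hint and not a diagnosis.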
Database Profiling¶
MongoDB’s “Profiler” is a database profiling system that can help identify inefficient queries and operations.
The following profiling levels are available:
Level | Setting |
---|---|
0 | Off. No profiling |
1 | On. Only includes “slow” operations |
2 | On. Includes all operations |
Enable the profiler by setting the profile value using the following command in the mongo shell:
db.setProfilingLevel(1)
The slowms setting defines what constitutes a “slow” operation. To set the threshold above which the profiler considers operations “slow” (and thus, included in the level 1 profiling data), you can configure slowms at runtime as an argument to the db.setProfilingLevel() operation.
See the documentation of db.setProfilingLevel() for more information about this command.
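The interaction between the profiling level and slowms can be sketched as a small decision function. This illustrates the table above and is not MongoDB’s implementation. The isProfiled helper is hypothetical, and it assumes an operation counts as “slow” when its duration in milliseconds is strictly greater than slowms.

```javascript
// Decide whether an operation would appear in the profiler output.
// level: 0 (off), 1 (slow operations only), 2 (all operations)
// slowms: threshold in milliseconds; millis: duration of the operation
function isProfiled(level, slowms, millis) {
  if (level === 2) {
    return true;            // level 2 includes every operation
  }
  if (level === 1) {
    return millis > slowms; // assumption: strictly slower than the threshold
  }
  return false;             // level 0, or any unknown level, records nothing
}

console.log(isProfiled(0, 100, 500)); // prints false
console.log(isProfiled(1, 100, 500)); // prints true
console.log(isProfiled(1, 100, 20));  // prints false
console.log(isProfiled(2, 100, 20));  // prints true
```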
By default, mongod records all “slow” queries to its log, as defined by slowms.
Note
Because the database profiler can negatively impact performance, only enable profiling for strategic intervals and as minimally as possible on production systems.
You may enable profiling on a per-mongod basis. This setting will not propagate across a replica set or sharded cluster.
You can view the output of the profiler in the system.profile collection of your database by issuing the show profile command in the mongo shell, or with the following operation:
db.system.profile.find( { millis : { $gt : 100 } } )
This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (100, in this example) is above the slowms threshold.
See also
Optimization Strategies for MongoDB addresses strategies that may improve the performance of your database queries and operations.
Replication and Monitoring¶
Beyond the basic monitoring requirements for any MongoDB instance, for replica sets, administrators must monitor replication lag. “Replication lag” refers to the amount of time that it takes to copy (i.e. replicate) a write operation on the primary to a secondary. Some small delay period may be acceptable, but two significant problems emerge as replication lag grows:
First, operations that occurred during the period of lag are not replicated to one or more secondaries. If you’re using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set.
Second, if the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. This is uncommon under normal circumstances, but if you configure the oplog to be smaller than the default, the issue can arise.
Note
The size of the oplog is only configurable during the first run using the --oplogSize argument to the mongod command, or preferably, the oplogSize in the MongoDB configuration file. If you do not specify this on the command line before running with the --replSet option, mongod will create a default sized oplog.
By default, the oplog is 5 percent of total available disk space on 64-bit systems. For more information about changing the oplog size, see Change the Size of the Oplog.
For causes of replication lag, see Replication Lag.
Replication issues are most often the result of network connectivity issues between members, or the result of a primary that does not have the resources to support application and replication traffic. To check the status of a replica, use the replSetGetStatus command or the following helper in the shell:
rs.status()
The replSetGetStatus document provides a more in-depth overview of this output. In general, watch the value of optimeDate, and pay particular attention to the time difference between the primary and the secondary members.
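The comparison of optimeDate values can be sketched as follows. This is a minimal illustration on a plain object shaped like replSetGetStatus output. The host names, dates, and the replicationLagSeconds helper are hypothetical, and the sketch assumes each member document carries name, stateStr, and optimeDate fields.

```javascript
// Hypothetical sample shaped like replSetGetStatus output; hosts and dates are invented.
const replSample = {
  set: "rs0",
  members: [
    { name: "db0.example.net:27017", stateStr: "PRIMARY", optimeDate: new Date("2015-01-01T00:00:10Z") },
    { name: "db1.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2015-01-01T00:00:08Z") },
    { name: "db2.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2015-01-01T00:00:10Z") }
  ]
};

// Lag of each secondary behind the primary, in seconds, keyed by member name.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    return null; // no primary visible, so lag cannot be measured this way
  }
  const lag = {};
  for (const member of status.members) {
    if (member.stateStr === "SECONDARY") {
      // Subtracting Date objects yields milliseconds.
      lag[member.name] = (primary.optimeDate - member.optimeDate) / 1000;
    }
  }
  return lag;
}

console.log(replicationLagSeconds(replSample));
// prints { 'db0...' is omitted; db1 lags by 2 seconds, db2 by 0 }
```

A monitoring script could alert when any value stays above a threshold that is comfortably smaller than the time span covered by the oplog.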
Sharding and Monitoring¶
In most cases, the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB instances. In addition, clusters require further monitoring to ensure that data is effectively distributed among nodes and that sharding operations are functioning appropriately.
See also
See the Sharding Concepts documentation for more information.
Config Servers¶
The config database maintains a map identifying which documents are on which shards. The cluster updates this map as chunks move between shards. When a configuration server becomes inaccessible, certain sharding operations become unavailable, such as moving chunks and starting mongos instances. However, clusters remain accessible from already-running mongos instances.
Because inaccessible configuration servers can seriously impact the availability of a sharded cluster, you should monitor your configuration servers to ensure that the cluster remains well balanced and that mongos instances can restart.
MongoDB Cloud Manager monitors config servers and can create notifications if a config server becomes inaccessible. See the MongoDB Cloud Manager documentation for more information.
Balancing and Chunk Distribution¶
The most effective sharded cluster deployments evenly balance chunks among the shards. To facilitate this, MongoDB has a background balancer process that distributes data to ensure that chunks are always optimally distributed among the shards.
Issue the db.printShardingStatus() or sh.status() command to the mongos by way of the mongo shell. This returns an overview of the entire cluster including the database name, and a list of the chunks.
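To judge how evenly chunks are distributed, a script can count chunks per shard. The following is a minimal sketch on an in-memory array of documents shaped like chunk metadata. The namespace, shard names, and both helper functions are hypothetical, and the sketch assumes each chunk document records its owning shard in a shard field.

```javascript
// Hypothetical chunk metadata documents; namespace and shard names are invented.
const chunkSample = [
  { ns: "records.people", shard: "shard0000" },
  { ns: "records.people", shard: "shard0000" },
  { ns: "records.people", shard: "shard0000" },
  { ns: "records.people", shard: "shard0001" }
];

// Count chunks per shard. A shard that owns no chunks will not appear in the result.
function chunksPerShard(chunks) {
  const counts = {};
  for (const chunk of chunks) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  }
  return counts;
}

// Difference between the most and least loaded shards that own chunks.
function chunkSpread(counts) {
  const values = Object.values(counts);
  if (values.length === 0) {
    return 0;
  }
  return Math.max(...values) - Math.min(...values);
}

const shardCounts = chunksPerShard(chunkSample);
console.log(shardCounts.shard0000);    // prints 3
console.log(chunkSpread(shardCounts)); // prints 2
```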
Stale Locks¶
In nearly every case, all locks used by the balancer are automatically released when they become stale. However, because any long lasting lock can block future balancing, it’s important to ensure that all locks are legitimate. To check the lock status of the database, connect to a mongos instance using the mongo shell. Issue the following command sequence to switch to the config database and display all outstanding locks on the shard database:
use config
db.locks.find()
For active deployments, the above query can provide insights.
The balancing process, which originates on a randomly selected mongos, takes a special “balancer” lock that prevents other balancing activity from transpiring. Use the following command, also to the config database, to check the status of the “balancer” lock.
db.locks.find( { _id : "balancer" } )
If this lock exists, make sure that the balancer process is actively using this lock.
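One way to look for locks that have been held for a suspiciously long time is to filter lock documents by age. The following is a minimal sketch on an in-memory array. The sample documents, the longHeldLocks helper, and the 30 minute threshold are hypothetical, and the sketch assumes each lock document has a state field where 0 means unlocked and a when field holding the time the lock was taken. Check the lock documents in your own deployment before relying on these assumptions.

```javascript
// Hypothetical lock documents; field meanings are assumptions (state 0 = unlocked).
const lockDocs = [
  { _id: "balancer", state: 2, when: new Date("2015-01-01T00:00:00Z") },
  { _id: "records.people", state: 0, when: new Date("2015-01-01T00:05:00Z") }
];

// Return the ids of locks that are not unlocked and are older than maxAgeMs.
function longHeldLocks(locks, now, maxAgeMs) {
  return locks
    .filter((lock) => lock.state !== 0 && now - lock.when > maxAgeMs)
    .map((lock) => lock._id);
}

const checkTime = new Date("2015-01-01T01:00:00Z");
const thirtyMinutesMs = 30 * 60 * 1000;
console.log(longHeldLocks(lockDocs, checkTime, thirtyMinutesMs)); // prints [ 'balancer' ]
```

A long-held lock is not necessarily stale; the result only identifies locks worth checking against current balancer activity.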