This version of the documentation is archived and no longer supported.

Query Cache

The MongoDB Ruby driver provides a built-in query cache. When enabled, the query cache saves the results of previously executed find and aggregation queries. When those same queries are performed again, the driver returns the cached results to prevent unnecessary roundtrips to the database.

Usage

The query cache is disabled by default. It can be enabled on the global scope as well as within the context of a specific block. The driver also provides a Rack middleware to enable the query cache automatically for each web request.

To enable the query cache globally:

Mongo::QueryCache.enabled = true

Similarly, to disable it globally:

Mongo::QueryCache.enabled = false

To enable the query cache within the context of a block:

Mongo::QueryCache.cache do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    client['artists'].find(name: 'Flying Lotus').first
    #=> Queries the database and caches the result

    client['artists'].find(name: 'Flying Lotus').first
    #=> Returns the previously cached result
  end
end

And to disable the query cache in the context of a block:

Mongo::QueryCache.uncached do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    client['artists'].find(name: 'Flying Lotus').first
    #=> Sends the query to the database; does NOT cache the result

    client['artists'].find(name: 'Flying Lotus').first
    #=> Queries the database again
  end
end

You may check whether the query cache is enabled at any time by calling Mongo::QueryCache.enabled?, which will return true or false.

Interactions With Fibers

The query cache enablement flag is stored in fiber-local storage (using Thread.current). In principle, this permits query cache state to be tracked per fiber, although this is not currently tested.

Some methods in the Ruby standard library, such as Enumerator#next, use fibers in their implementation. These methods do not see the query cache enablement flag set by the application, and consequently do not use the query cache. For example, the following code does not use the query cache despite requesting it:

Mongo::QueryCache.enabled = true

client['artists'].find({}, limit: 1).to_enum.next
# Issues the query again.
client['artists'].find({}, limit: 1).to_enum.next

Rewriting this code to use first instead of next would make it use the query cache:

Mongo::QueryCache.enabled = true

client['artists'].find({}, limit: 1).first
# Utilizes the cached result from the first query.
client['artists'].find({}, limit: 1).first
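The fiber-local behavior described above can be reproduced without a database. The following is a minimal sketch using only the Ruby standard library; the :demo_cache_enabled key is a made-up stand-in for the driver's flag, not the key the driver actually uses.

```ruby
# :demo_cache_enabled is a hypothetical key used only for illustration.
Thread.current[:demo_cache_enabled] = true

# Thread.current[] is fiber-local, so a new fiber starts with no value.
seen_in_new_fiber = Fiber.new { Thread.current[:demo_cache_enabled] }.resume

# An enumerator whose only element is the flag as seen while iterating.
flag_reader = Enumerator.new do |yielder|
  yielder << Thread.current[:demo_cache_enabled]
end

# Enumerator#next performs the iteration inside a separate fiber.
seen_via_next = flag_reader.next

# Enumerator#first iterates in the calling fiber.
seen_via_first = flag_reader.first

p seen_in_new_fiber # nil
p seen_via_next     # nil
p seen_via_first    # true
```

The same value is visible through first but not through next, which mirrors why the two find examples above behave differently.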

Query Matching

A query is eligible to use cached results if it matches the original query that produced the cached results. Two queries are considered matching if they are identical in the following values:

  • Namespace (the database and collection on which the query was performed)
  • Selector (for aggregations, the aggregation pipeline stages)
  • Skip
  • Sort
  • Projection
  • Collation
  • Read Concern
  • Read Preference

For example, if you perform one query, and then perform a mostly identical query with a different sort order, those queries will not be considered matching, and the second query will not use the cached results of the first.
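One way to picture the matching rule is as a lookup key built from the values listed above. The sketch below is a hypothetical model and not the driver's internal implementation; the QueryKey struct and the sample values are assumptions made for illustration.

```ruby
# Hypothetical model of a cache key; NOT the driver's internal data structure.
QueryKey = Struct.new(
  :namespace, :selector, :skip, :sort, :projection,
  :collation, :read_concern, :read_preference,
  keyword_init: true
)

base = {
  namespace: 'music.artists',
  selector: { genre: 'Rock' },
  skip: nil,
  sort: { name: 1 },
  projection: nil,
  collation: nil,
  read_concern: nil,
  read_preference: nil
}

# Pretend the first query has already been executed and cached.
cached_results = { QueryKey.new(**base) => [{ name: 'Fleet Foxes' }] }

# Identical in every listed value, so it matches the cached query.
same_query = QueryKey.new(**base)

# Differs only in sort order, so it does not match.
different_sort = QueryKey.new(**base.merge(sort: { name: -1 }))

p cached_results.key?(same_query)     # true
p cached_results.key?(different_sort) # false
```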

Limits

When performing a query with a limit, the query cache will reuse an existing cached query with a larger limit if one exists. For example:

Mongo::QueryCache.cache do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    # Pass the limit as an option, not as part of the filter, and call to_a
    # so that the query is actually executed.
    client['artists'].find({ genre: 'Rock' }, limit: 10).to_a
    #=> Queries the database and caches the result

    client['artists'].find({ genre: 'Rock' }, limit: 5).to_a
    #=> Returns the first 5 results from the cached query

    client['artists'].find({ genre: 'Rock' }, limit: 20).to_a
    #=> Queries the database again and replaces the previously cached query results
  end
end
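The limit rule can also be sketched without a database. The following toy model is an assumption-laden illustration, not the driver's code: LimitAwareCache, ARTISTS, and the database lambda are hypothetical names, and the lambda simply counts how many times the "database" is consulted.

```ruby
# Hypothetical toy cache keyed by query; NOT the driver's implementation.
class LimitAwareCache
  def initialize
    @entries = {}
  end

  # Returns cached documents when the cached limit covers the requested one;
  # otherwise runs the block (the "database") and replaces the entry.
  def fetch(key, limit)
    entry = @entries[key]
    return entry[:docs].first(limit) if entry && entry[:limit] >= limit

    docs = yield(limit)
    @entries[key] = { limit: limit, docs: docs }
    docs
  end
end

# Stand-in for the database: 30 fake documents and a round-trip counter.
ARTISTS = (1..30).map { |i| { name: "Artist #{i}", genre: 'Rock' } }
round_trips = 0
database = lambda do |limit|
  round_trips += 1
  ARTISTS.first(limit)
end

cache = LimitAwareCache.new
key = ['music.artists', { genre: 'Rock' }]

ten    = cache.fetch(key, 10, &database) # consults the database
five   = cache.fetch(key, 5, &database)  # served from the cached 10
twenty = cache.fetch(key, 20, &database) # consults the database again

p round_trips # 2
```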

Cache Invalidation

The query cache is cleared in part or in full on every write operation. Most write operations will clear the results of any queries that were performed on the same collection that is being written to. Some operations will clear the entire query cache.

The following operations will clear cached query results on the same database and collection (including during bulk writes):

  • insert_one
  • update_one
  • replace_one
  • update_many
  • delete_one
  • delete_many
  • find_one_and_delete
  • find_one_and_update
  • find_one_and_replace

The following operations will clear the entire query cache:

  • aggregation with $merge or $out pipeline stages
  • commit_transaction
  • abort_transaction
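The difference between the two lists above is one of scope: a single namespace versus everything. A minimal sketch of that distinction follows, assuming a cache organized by namespace; ToyCache and its method names are hypothetical and do not describe how the driver stores results.

```ruby
# Hypothetical toy model; NOT the driver's data structure.
class ToyCache
  def initialize
    @by_namespace = Hash.new { |hash, namespace| hash[namespace] = {} }
  end

  def store(namespace, query, docs)
    @by_namespace[namespace][query] = docs
  end

  def lookup(namespace, query)
    @by_namespace.fetch(namespace, {})[query]
  end

  # Models a write such as insert_one on a single collection.
  def clear_namespace(namespace)
    @by_namespace.delete(namespace)
  end

  # Models an operation such as commit_transaction.
  def clear
    @by_namespace.clear
  end
end

cache = ToyCache.new
cache.store('music.artists', { genre: 'Rock' }, [{ name: 'Fleet Foxes' }])
cache.store('music.albums', { year: 2008 }, [{ title: 'Fleet Foxes' }])

# A write to music.artists leaves results for music.albums in place.
cache.clear_namespace('music.artists')
artists_after_write = cache.lookup('music.artists', { genre: 'Rock' })
albums_after_write  = cache.lookup('music.albums', { year: 2008 })

# A full clear removes everything.
cache.clear
albums_after_clear = cache.lookup('music.albums', { year: 2008 })
```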

Manual Cache Invalidation

You may clear the query cache at any time with the following method:

Mongo::QueryCache.clear

This will remove all cached query results.

Transactions

Queries are cached within the context of a transaction, but the entire cache will be cleared when the transaction is committed or aborted.

Mongo::QueryCache.cache do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    session = client.start_session

    session.with_transaction do
      client['artists'].insert_one({ name: 'Fleet Foxes' }, session: session)

      client['artists'].find({}, session: session).first
      #=> { name: 'Fleet Foxes' }
      #=> Queries the database and caches the result

      client['artists'].find({}, session: session).first
      #=> { name: 'Fleet Foxes' }
      #=> Returns the previously cached result

      session.abort_transaction
    end

    client['artists'].find.first
    #=> nil
    # The query cache was cleared on abort_transaction
  end
end

Note

Transactions are often performed with a “snapshot” read concern level. Keep in mind that a query with a “snapshot” read concern cannot return cached results from a query without the “snapshot” read concern, so it is possible that a transaction may not use previously cached queries.

To understand when a query will use a cached result, see the Query Matching section.

Aggregations

The query cache also caches the results of aggregation pipelines. For example:

Mongo::QueryCache.cache do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    client['artists'].aggregate([ { '$match' => { name: 'Fleet Foxes' } } ]).first
    #=> Queries the database and caches the result

    client['artists'].aggregate([ { '$match' => { name: 'Fleet Foxes' } } ]).first
    #=> Returns the previously cached result
  end
end

Note

Aggregation results are cleared from the cache during every write operation, with no exceptions.

System Collections

MongoDB stores system information in collections that use the database.system.* namespace pattern. These are called system collections.

Data in system collections can change due to activity not triggered by the application (such as internal server processes) and as a result of a variety of database commands issued by the application. Because of the difficulty of determining when the cached results for system collections should be expired, queries on system collections bypass the query cache.

You may read more about system collections in the MongoDB documentation.

Note

Even when the query cache is enabled, query results from system collections will not be cached.
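To make the namespace pattern concrete, the following sketch checks whether a namespace string matches database.system.*. The system_collection? helper is hypothetical and written only for illustration; it is not a method provided by the driver.

```ruby
# Hypothetical helper: true when the namespace matches database.system.*
def system_collection?(namespace)
  # Split into the database name and the full collection name.
  _database, collection = namespace.split('.', 2)
  !collection.nil? && collection.start_with?('system.')
end

p system_collection?('music.system.profile') # true
p system_collection?('music.artists')        # false
```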

Query Cache Middleware

The driver provides a Rack middleware which enables the query cache for the duration of each web request. Below is an example of how to enable the query cache middleware in a Ruby on Rails application:

# config/application.rb

# Add Mongo::QueryCache::Middleware at the bottom of the middleware stack
# or before other middleware that queries MongoDB.
config.middleware.use Mongo::QueryCache::Middleware

Please refer to the Rails on Rack guide for more information about using Rack middleware in Rails applications.
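For readers unfamiliar with Rack, the sketch below shows the general shape of a middleware that turns a flag on for the duration of one request. It is a hypothetical illustration, not the source of Mongo::QueryCache::Middleware; the ToyQueryCacheMiddleware class and the :toy_query_cache_enabled key are made up, and restoring the previous flag value afterwards is an assumption of this sketch.

```ruby
# Hypothetical Rack-style middleware; it mimics the shape of such a
# middleware and is NOT the driver's Mongo::QueryCache::Middleware.
class ToyQueryCacheMiddleware
  FLAG = :toy_query_cache_enabled # made-up key for this sketch

  def initialize(app)
    @app = app
  end

  # Enables the flag, runs the rest of the stack, then restores the old value.
  def call(env)
    previous = Thread.current[FLAG]
    Thread.current[FLAG] = true
    @app.call(env)
  ensure
    Thread.current[FLAG] = previous
  end
end

# A minimal Rack-compatible app that reports whether it saw the flag.
app = lambda do |_env|
  text = Thread.current[ToyQueryCacheMiddleware::FLAG] ? 'cache on' : 'cache off'
  [200, { 'content-type' => 'text/plain' }, [text]]
end

status, _headers, body = ToyQueryCacheMiddleware.new(app).call({})
flag_after_request = Thread.current[ToyQueryCacheMiddleware::FLAG]

p status # 200
p body   # ["cache on"]
```

Only code that runs inside the wrapped call sees the flag, which is why the placement of the middleware relative to other middleware that queries MongoDB matters.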
