Text Indexes¶

On this page

Create Text Index
Wildcard Text Indexes
Supported Languages and Stop Words
sparse Property
Restrictions
Storage Requirements and Performance Costs
Text Search

New in version 2.4.

MongoDB provides text indexes to support text search of string content in documents of a collection.

text indexes can include any field whose value is a string or an array of string elements. To perform queries that access the text index, use the $text query operator.

Changed in version 2.6: MongoDB enables the text search feature by default. In MongoDB 2.4, you need to enable the text search feature manually to create text indexes and perform text search.

Create Text Index¶

To create a text index, use the db.collection.createIndex() method. To index a field that contains a string or an array of string elements, include the field and specify the string literal "text" in the index document, as in the following example:

copy

db.reviews.createIndex( { comments: "text" } )

A collection can have at most one text index.

However, you can specify multiple fields for the text index. For examples of creating text indexes on multiple fields, see Create a text Index and Wildcard Text Indexes.

Wildcard Text Indexes¶

To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields in the collection that contain string content. Such an index can be useful with highly unstructured data if it is unclear which fields to include in the text index or for ad-hoc querying.

With a wildcard text index, MongoDB indexes every field that contains string data for each document in the collection. The following example creates a text index using the wildcard specifier:

copy

db.collection.createIndex( { "$**": "text" } )

Wildcard text indexes are text indexes on multiple fields. As such, you can assign weights to specific fields during index creation to control the ranking of the results. For more information using weights to control the results of a text search, see Control Search Results with Weights.

Wildcard text indexes, as with all text indexes, can be part of a compound indexes. For example, the following creates a compound index on the field a as well as the wildcard specifier:

copy

db.collection.createIndex( { a: 1, "$**": "text" } )

As with all compound text indexes, since the a precedes the text index key, in order to perform a $text search with this index, the query predicate must include an equality match conditions a. For information on compound text indexes, see Compound Text Indexes.

Supported Languages and Stop Words¶

MongoDB supports text search for various languages. text indexes drop language-specific stop words (e.g. in English, “the”, “an”, “a”, “and”, etc.) and uses simple language-specific suffix stemming. For a list of the supported languages, see Text Search Languages.

If you specify a language value of "none", then the text index uses simple tokenization with no list of stop words and no stemming.

For the Latin alphabet, text indexes are case insensitive for non-diacritics; i.e. case insensitive for [A-z]. For all other characters, text indexes treat them as distinct.

To specify a language for the text index, see Specify a Language for Text Index.

`sparse` Property¶

text indexes are sparse by default and ignores the sparse: true option. If a document lacks a text index field (or the field is null or an empty array), MongoDB does not add an entry for the document to the text index. For inserts, MongoDB inserts the document but does not add to the text index.

For a compound index that includes a text index key along with keys of other types, only the text index field determine whether the index references a document. The other keys do not determine whether the index references the documents or not.

Restrictions¶

Text Search and Hints¶

You cannot use hint() if the query includes a $text query expression.

Text Index and Sort¶

Sort operations cannot obtain sort order from a text index, even from a compound text index; i.e. sort operations cannot use the ordering in the text index.

Compound Index¶

A compound index can include a text index key in combination with ascending/descending index keys. However, these compound indexes have the following restrictions:

A compound text index cannot include any other special index types, such as multi-key or geospatial index fields.
If the compound text index includes keys preceding the text index key, to perform a $text search, the query predicate must include equality match conditions on the preceding keys.

See also Text Index and Sort for additional limitations.

For an example of a compound text index, see Limit the Number of Entries Scanned.

Drop a Text Index¶

To drop a text index, pass the name of the index to the db.collection.dropIndex() method. To get the name of the index, run the getIndexes() method.

For information on the default naming scheme for text indexes as well as overriding the default name, see Specify Name for text Index.

Storage Requirements and Performance Costs¶

text indexes have the following storage requirements and performance costs:

text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed field for each document inserted.
Building a text index is very similar to building a large multi-key index and will take longer than building a simple ordered (scalar) index on the same data.
When building a large text index on an existing collection, ensure that you have a sufficiently high limit on open file descriptors. See the recommended settings.
text indexes will impact insertion throughput because MongoDB must add an index entry for each unique post-stemmed word in each indexed field of each new source document.
Additionally, text indexes do not store phrases or information about the proximity of words in the documents. As a result, phrase queries will run much more effectively when the entire collection fits in RAM.

Text Search¶

Text search supports the search of string content in documents of a collection. MongoDB provides the $text operator to perform text search in queries and in aggregation pipelines.

The text search process:

tokenizes and stems the search term(s) during both the index creation and the text command execution.
assigns a score to each document that contains the search term in the indexed fields. The score determines the relevance of a document to a given search query.

The $text operator can search for words and phrases. The query matches on the complete stemmed words. For example, if a document field contains the word blueberry, a search on the term blue will not match the document. However, a search on either blueberry or blueberries will match.

For information and examples on various text search patterns, see the $text query operator. For examples of text search in aggregation pipeline, see Text Search in the Aggregation Pipeline.

← 2d Index Internals Hashed Index →

Text Indexes¶

Create Text Index¶

Wildcard Text Indexes¶

Supported Languages and Stop Words¶

sparse Property¶

Restrictions¶

Text Search and Hints¶

Text Index and Sort¶

Compound Index¶

Drop a Text Index¶

Storage Requirements and Performance Costs¶

Text Search¶

`sparse` Property¶