Text Indexes
New in version 2.4.
MongoDB provides text indexes to support text search of string content in documents of a collection. text indexes are case-insensitive and can include any field whose value is a string or an array of string elements. You can only access the text index with the text command.
Important
- Before you can create a text index or run the text command, you must manually enable text search. See Enable Text Search for information on how to enable the feature.
- A collection can have at most one text index.
Create Text Index
To create a text index, use the db.collection.ensureIndex() method. To index a field that contains a string or an array of string elements, include the field and specify the string literal "text" in the index document.
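A minimal sketch of such a call, assuming a hypothetical collection named reviews whose documents have a string field named comments (neither name comes from this page). The guard only lets the snippet load outside the mongo shell, where db is not defined.

```javascript
// Index document: map the field to index to the string literal "text".
// "comments" is a hypothetical field name used only for illustration.
var indexSpec = { comments: "text" };

// "reviews" is a hypothetical collection name. The guard skips the call
// when the snippet is not running inside the mongo shell.
if (typeof db !== "undefined") {
  db.reviews.ensureIndex(indexSpec);
}
```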
For examples of creating text indexes on multiple fields, see Create a text Index.
text indexes drop language-specific stop words (e.g. in English, “the,” “an,” “a,” “and,” etc.) and use simple language-specific suffix stemming. See Text Search Languages for the supported languages and Specify a Language for Text Index for details on specifying languages with text indexes.
text indexes can satisfy the filter component of a text search. For details, see Create text Index to Satisfy the filter Component of Text Search.
Storage Requirements and Performance Costs
text indexes have the following storage requirements and performance costs:
- text indexes change the space allocation method for all future record allocations in a collection to usePowerOf2Sizes.
- text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed field for each document inserted.
- Building a text index is very similar to building a large multi-key index and will take longer than building a simple ordered (scalar) index on the same data.
- When building a large text index on an existing collection, ensure that you have a sufficiently high limit on open file descriptors. See the recommended settings.
- text indexes will impact insertion throughput because MongoDB must add an index entry for each unique post-stemmed word in each indexed field of each new source document.
- Additionally, text indexes do not store phrases or information about the proximity of words in the documents. As a result, phrase queries will run much more effectively when the entire collection fits in RAM.
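To get a feel for the per-document cost, the sketch below counts distinct words per indexed field. It is only a rough upper bound under stated assumptions: it splits on non-ASCII-letter characters and skips stemming and stop-word removal, both of which can only lower the real number of entries. The function names and the sample document are hypothetical.

```javascript
// Count the distinct lowercase words in one string value. Stemming and
// stop-word removal are skipped here, so the result can only overestimate
// the number of unique post-stemmed words.
function uniqueWordCount(text) {
  var words = String(text).toLowerCase().match(/[a-z]+/g) || [];
  var seen = {};
  words.forEach(function (w) { seen["w:" + w] = true; });
  return Object.keys(seen).length;
}

// Sum the per-field counts over the indexed fields of one document.
// The elements of a string array are counted together as one field.
function estimateEntries(doc, indexedFields) {
  return indexedFields.reduce(function (total, field) {
    var value = doc[field];
    var text = Array.isArray(value) ? value.join(" ") : (value || "");
    return total + uniqueWordCount(text);
  }, 0);
}

// Hypothetical document with two indexed fields.
var doc = { title: "Blueberry muffins", tags: ["muffins", "baking", "blueberry"] };
var estimate = estimateEntries(doc, ["title", "tags"]);
```

Here the title field contributes two distinct words and the tags field three, so the estimate for this one document is five entries.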
Text Search
Text search supports the search of string content in documents of a collection. MongoDB provides the text command to perform the text search. The text command accesses the text index.
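As a hedged illustration, the text command can be issued from the mongo shell through runCommand on a collection. The collection name reviews and the search string are assumptions made for this sketch, and the guard only lets the snippet load outside the shell.

```javascript
// Options for the text command. "search" holds the search string;
// "limit" caps the number of documents returned.
var textOptions = { search: "blueberry", limit: 10 };

// "reviews" is a hypothetical collection that already has a text index.
if (typeof db !== "undefined") {
  var response = db.reviews.runCommand("text", textOptions);
  printjson(response);
}
```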
The text search process:
- tokenizes and stems the search term(s) during both the index creation and the text command execution.
- assigns a score to each document that contains the search term in the indexed fields. The score determines the relevance of a document to a given search query.
By default, the text command returns at most the top 100 matching documents as determined by the scores. The command can search for words and phrases. The command matches on the complete stemmed words. For example, if a document field contains the word blueberry, a search on the term blue will not match the document. However, a search on either blueberry or blueberries will match.
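The matching rule can be imitated with a toy model. The stemmer below is an assumption made purely for illustration: it folds a couple of English plural endings and is not the stemmer MongoDB uses. All function names and the sample field value are hypothetical.

```javascript
// Toy stemmer for illustration only: it folds a few English plural
// endings. MongoDB's real stemmer is language-specific and differs.
function toyStem(word) {
  var w = word.toLowerCase();
  if (w.length > 4 && w.slice(-3) === "ies") { return w.slice(0, -3) + "y"; }
  if (w.length > 3 && w.slice(-1) === "s" && w.slice(-2) !== "ss") { return w.slice(0, -1); }
  return w;
}

// Split a string into stemmed tokens.
function stemmedTokens(text) {
  return (text.match(/[A-Za-z]+/g) || []).map(toyStem);
}

// A term matches only if its stem equals a complete stemmed word
// of the field; a substring of a word does not count.
function matches(fieldValue, term) {
  return stemmedTokens(fieldValue).indexOf(toyStem(term)) !== -1;
}

var field = "fresh blueberry pie";
var results = {
  blue: matches(field, "blue"),               // false: "blue" is not a whole stemmed word
  blueberry: matches(field, "blueberry"),     // true
  blueberries: matches(field, "blueberries")  // true: stems to "blueberry" in this toy model
};
```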
For information and examples on various text search patterns, see Search String Content for Text.