Collation
New in version 1.1.
Overview
MongoDB 3.4 introduced support for collations, which provide a set of rules to comply with the conventions of a particular language when comparing strings.
For example, in Canadian French, the last accent in a given word determines the sorting order. Consider the French words cote, coté, côte, and côté.
Under the Canadian French collation, the words sort in the following order: cote, côte, coté, côté.
If collation is unspecified, MongoDB uses simple binary comparison for strings. As such, the sort order of the words would be: cote, coté, côte, côté.
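The binary ordering can be reproduced without a server. The following is a minimal sketch, assuming that simple binary comparison behaves like PHP's byte-wise strcmp() on UTF-8 strings and that the source file is saved as UTF-8; the word list is illustrative.

```php
<?php

// Illustrative word list: four French words that differ only in their accents.
$words = ['côté', 'coté', 'cote', 'côte'];

// strcmp() compares raw bytes, which approximates a simple binary comparison.
usort($words, 'strcmp');

echo implode(', ', $words), PHP_EOL; // prints "cote, coté, côte, côté"
```

Accented letters occupy higher byte values than unaccented ones in UTF-8, so the first accent in a word decides the binary order, whereas the Canadian French collation gives priority to the last accent.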
Usage
You can specify a default collation for collections and indexes when they are created, or specify a collation for CRUD operations and aggregations. For operations that support collation, MongoDB uses the collection’s default collation unless the operation specifies a different collation.
Collation Parameters
The only required parameter is locale, which the server parses as an ICU format locale ID. For example, set locale to en_US to represent US English or fr_CA to represent Canadian French.
For a complete description of the available parameters, see Collation Document in the MongoDB manual.
Assign a Default Collation to a Collection
The following example creates a new collection called contacts in the test database and assigns a default collation with the fr_CA locale.
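A minimal sketch of this step might look as follows. It assumes the library is installed through Composer, that a MongoDB 3.4 or later server is reachable at the default local URI, and that it is acceptable to drop an existing contacts collection in the test database first.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$database = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test;

// Drop any earlier copy so the sketch can be re-run.
$database->dropCollection('contacts');

// Create the collection with Canadian French as its default collation.
$database->createCollection('contacts', [
    'collation' => ['locale' => 'fr_CA'],
]);
```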
Specifying a collation when you create the collection ensures that all query operations run against the contacts collection use the fr_CA collation, unless the query specifies another collation. Any indexes on the new collection also inherit the default collation, unless the creation command specifies another collation.
Assign a Collation to an Index
To specify a collation for an index, use the collation option when you create the index.
The following example creates an index on the name field of the address_book collection, with the unique parameter enabled and a collation with locale set to en_US.
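As a sketch, the index could be created as shown here. The connection URI and the Composer autoloader path are assumptions; the collection and field names follow the description above.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->address_book;

// Create a unique ascending index on "name" that uses the en_US collation.
$indexName = $collection->createIndex(
    ['name' => 1],
    [
        'collation' => ['locale' => 'en_US'],
        'unique' => true,
    ]
);

echo $indexName, PHP_EOL; // the server-generated default name, "name_1"
```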
To use this index, make sure your queries also specify the same collation. The following query uses the above index:
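A query along these lines would match the index's collation. This is a sketch only: the filter value Adam is hypothetical, and the connection details are assumed.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->address_book;

// The collation matches the one given when the index was created,
// so the server is able to use that index for this query.
$cursor = $collection->find(
    ['name' => 'Adam'], // hypothetical filter value
    ['collation' => ['locale' => 'en_US']]
);

$matches = $cursor->toArray();
```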
The following queries do NOT use the index. The first query uses no collation, and the second uses a collation with a different strength value than the collation on the index.
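The two cases can be sketched as follows. The filter value and the strength value of 2 are hypothetical choices for illustration; any strength other than the one stored with the index has the same effect.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->address_book;

// No collation: the query falls back to the collection default,
// which differs from the en_US collation on the index.
$withoutCollation = $collection->find(['name' => 'Adam'])->toArray();

// Same locale but a different strength: still not the index's collation.
$differentStrength = $collection->find(
    ['name' => 'Adam'],
    ['collation' => ['locale' => 'en_US', 'strength' => 2]]
)->toArray();
```

Both queries still return correct results; they are simply answered without the collated index.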
Operations that Support Collation
All reading, updating, and deleting methods support collation. Some examples are listed below.
find() with sort
Individual queries can specify a collation to use when matching and sorting results. The following query and sort operation uses a German collation with the locale parameter set to de.
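A sketch of such a query is shown here. The collection name contacts, the city filter, and the name sort field are hypothetical; only the collation option reflects the description above.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->contacts;

// The same German collation applies to the filter match and to the sort.
$cursor = $collection->find(
    ['city' => 'New York'], // hypothetical filter
    [
        'collation' => ['locale' => 'de'],
        'sort' => ['name' => 1], // hypothetical sort field
    ]
);

$sorted = $cursor->toArray();
```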
findOneAndUpdate()
A collection called names contains the following documents:
The following findOneAndUpdate() operation on the collection does not specify a collation.
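The following sketch seeds the collection with assumed sample documents and runs the operation without a collation. The four names, the $lt filter on Gunter, and the verified field are illustrative assumptions rather than values taken from this page.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->names;

// Assumed sample data, reset on every run.
$collection->drop();
$collection->insertMany([
    ['_id' => 1, 'first_name' => 'Hans'],
    ['_id' => 2, 'first_name' => 'Gunter'],
    ['_id' => 3, 'first_name' => 'Günter'],
    ['_id' => 4, 'first_name' => 'Jürgen'],
]);

// Binary comparison: no first_name sorts before "Gunter", so nothing matches.
$result = $collection->findOneAndUpdate(
    ['first_name' => ['$lt' => 'Gunter']],
    ['$set' => ['verified' => true]]
);

var_dump($result); // NULL
```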
Because Gunter is lexically first in the collection, the above operation returns no results and updates no documents.
Consider the same findOneAndUpdate() operation but with a collation specified, which uses the locale de@collation=phonebook.
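A sketch of the collated version follows, using the same assumed sample documents, filter, and update. The returnDocument option is added here so that the updated document is returned.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

use MongoDB\Operation\FindOneAndUpdate;

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->names;

// Assumed sample data, reset on every run.
$collection->drop();
$collection->insertMany([
    ['_id' => 1, 'first_name' => 'Hans'],
    ['_id' => 2, 'first_name' => 'Gunter'],
    ['_id' => 3, 'first_name' => 'Günter'],
    ['_id' => 4, 'first_name' => 'Jürgen'],
]);

// With the German phonebook collation, "Günter" sorts before "Gunter".
$result = $collection->findOneAndUpdate(
    ['first_name' => ['$lt' => 'Gunter']],
    ['$set' => ['verified' => true]],
    [
        'collation' => ['locale' => 'de@collation=phonebook'],
        'returnDocument' => FindOneAndUpdate::RETURN_DOCUMENT_AFTER,
    ]
);

var_dump($result);
```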
Note
Some locales have a collation=phonebook option available for use with languages that sort proper nouns differently from other words. In the de@collation=phonebook collation, an umlauted vowel sorts as the base vowel followed by e (for example, ü sorts as ue), so Günter comes before Gunter.
The operation returns the following updated document:
findOneAndDelete()
Set the numericOrdering collation parameter to true to compare numeric strings by their numeric values.
The collection numbers contains the following documents:
The following example matches the first document in which field a has a numeric value greater than 100 and deletes it.
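The sketch below illustrates the idea with assumed sample documents ("16", "84", and "179") and an assumed en locale; only the numericOrdering parameter comes from the text above.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->numbers;

// Assumed sample data, reset on every run. The values are strings.
$collection->drop();
$collection->insertMany([
    ['_id' => 1, 'a' => '16'],
    ['_id' => 2, 'a' => '84'],
    ['_id' => 3, 'a' => '179'],
]);

// numericOrdering compares "179" and "100" as the numbers 179 and 100.
$deleted = $collection->findOneAndDelete(
    ['a' => ['$gt' => '100']],
    ['collation' => ['locale' => 'en', 'numericOrdering' => true]]
);

echo $deleted['a'], PHP_EOL; // prints "179"
```

With this sample data, "179" is the only value that is numerically greater than 100, so the documents with "16" and "84" remain.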
After the above operation, the following documents remain in the collection:
If you perform the same operation without collation, the server deletes the first document it finds in which the lexical value of a is greater than "100".
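For comparison, here is a sketch of the same operation without a collation, again over assumed sample documents. A sort on _id is added as an assumption so that "the first document it finds" is deterministic.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->numbers;

// Assumed sample data, reset on every run. The values are strings.
$collection->drop();
$collection->insertMany([
    ['_id' => 1, 'a' => '16'],
    ['_id' => 2, 'a' => '84'],
    ['_id' => 3, 'a' => '179'],
]);

// Binary comparison: "16" is greater than "100" because "6" follows "0".
$deleted = $collection->findOneAndDelete(
    ['a' => ['$gt' => '100']],
    ['sort' => ['_id' => 1]] // assumed, to make the first match predictable
);

echo $deleted['a'], PHP_EOL; // prints "16"
```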
After the above operation runs, the document in which a equals "16" is deleted, and the following documents remain in the collection:
deleteMany()
You can use collation with every CRUD operation in the MongoDB PHP Library.
The collection recipes contains the following documents:
Setting the strength parameter of the collation document to 1 or 2 causes the server to disregard case in the query filter. The following example uses a case-insensitive query filter to delete all documents in which the cuisine field matches French.
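A sketch of the case-insensitive delete is shown here. The sample recipes and the en_US locale are assumptions, chosen so that the documents with _id 2 and 4 differ only in the case of their cuisine value.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->recipes;

// Assumed sample data, reset on every run.
$collection->drop();
$collection->insertMany([
    ['_id' => 1, 'dish' => 'veggie empanadas', 'cuisine' => 'Spanish'],
    ['_id' => 2, 'dish' => 'beef bourgignon', 'cuisine' => 'French'],
    ['_id' => 3, 'dish' => 'chicken molé', 'cuisine' => 'Mexican'],
    ['_id' => 4, 'dish' => 'chicken paillard', 'cuisine' => 'french'],
    ['_id' => 5, 'dish' => 'pozole verde', 'cuisine' => 'Mexican'],
]);

// Strength 1 ignores case, so "French" and "french" both match.
$result = $collection->deleteMany(
    ['cuisine' => 'French'],
    ['collation' => ['locale' => 'en_US', 'strength' => 1]]
);

echo $result->getDeletedCount(), PHP_EOL; // prints "2"
```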
After the above operation runs, the documents with _id values of 2 and 4 are deleted from the collection.
Aggregation
To use collation with an aggregate() operation, specify a collation in the aggregation options.
The following aggregation example uses a collection called names, groups the documents by the first_name field, counts the documents in each group, and sorts the results in German phonebook order.
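One way to sketch this aggregation is shown below. The sample documents and the name_count output field are assumptions; the pipeline stages and the collation follow the description above.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (assumed path)

// Assumption: a MongoDB 3.4+ server is listening on the default local URI.
$collection = (new MongoDB\Client('mongodb://127.0.0.1:27017'))->test->names;

// Assumed sample data, reset on every run.
$collection->drop();
$collection->insertMany([
    ['_id' => 1, 'first_name' => 'Hans'],
    ['_id' => 2, 'first_name' => 'Gunter'],
    ['_id' => 3, 'first_name' => 'Günter'],
    ['_id' => 4, 'first_name' => 'Jürgen'],
]);

$cursor = $collection->aggregate(
    [
        // Group by first name and count the documents in each group.
        ['$group' => ['_id' => '$first_name', 'name_count' => ['$sum' => 1]]],
        // Sort the groups by name; the collation decides the order.
        ['$sort' => ['_id' => 1]],
    ],
    ['collation' => ['locale' => 'de@collation=phonebook']]
);

$order = [];
foreach ($cursor as $group) {
    $order[] = $group['_id'];
    printf("%s: %d\n", $group['_id'], $group['name_count']);
}
```

The collation is passed in the options array (the second argument), not as a pipeline stage, so it applies to every stage that compares strings.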