Collations¶
Overview¶
New in version 3.4.
Collations are sets of rules for how to compare strings, typically in a particular natural language.
For example, in Canadian French, the last accent in a given word determines the sorting order.
Consider the following French words:
The sort order using the Canadian French collation would result in the following:
If collation is unspecified, MongoDB uses the simple binary comparison for strings. As such, the sort order of the words would be:
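A rough way to see a binary ordering outside the server is to sort strings in plain Ruby, because String comparison there is also bytewise. The four words below are illustrative stand-ins chosen for this sketch, not a listing taken from this page, and no MongoDB connection is involved.

```ruby
# Illustrative words only (an assumption of this sketch):
# côté, cote, côte, coté, written with escapes so the bytes are unambiguous.
words = ["c\u00F4t\u00E9", "cote", "c\u00F4te", "cot\u00E9"]

# String#<=> compares byte by byte, so a plain sort yields a binary ordering.
binary_order = words.sort

puts binary_order.join(' < ')
# prints: cote < coté < côte < côté
```

The accented letters sort after every unaccented ASCII letter because their UTF-8 encoding starts with a higher byte value, which is why a language-aware collation is needed for a natural ordering.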
Usage¶
You can specify a default collation for collections and indexes when they are created, or specify a collation for CRUD operations and aggregations. For operations that support collation, MongoDB uses the collection’s default collation unless the operation specifies a different collation.
Collation Parameters¶
The only required parameter is locale, which the server parses as an ICU format locale ID. For example, set locale to en_US to represent US English or fr_CA to represent Canadian French.
For a complete description of the available parameters, see the MongoDB manual entry.
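In the Ruby driver a collation is passed as an ordinary hash. The sketch below shows a few such hashes using only parameters mentioned on this page (locale, strength, numericOrdering); the constant names are hypothetical and exist only for this illustration.

```ruby
# Hypothetical constant names; only :locale is required in each document.
US_ENGLISH       = { locale: 'en_US' }.freeze
CANADIAN_FRENCH  = { locale: 'fr_CA' }.freeze

# Optional parameters refine how strings are compared.
CASE_INSENSITIVE = { locale: 'en_US', strength: 2 }.freeze
NUMERIC_ORDERING = { locale: 'en', numericOrdering: true }.freeze
```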
Assign a Default Collation to a Collection¶
The following example creates a new collection called contacts on the test database and assigns it a default collation with the fr_CA locale. Specifying a collation when you create the collection ensures that every query-based operation run against the contacts collection uses the fr_CA collation, unless the query specifies another collation. Any indexes on the new collection also inherit the default collation, unless the creation command specifies another collation.
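A minimal sketch of this step, assuming client is a Mongo::Client already connected to the test database. The method name create_contacts_collection is hypothetical and only wraps the two driver calls so they can be reused.

```ruby
# Assumes `client` is a Mongo::Client for the test database, for example:
#   require 'mongo'
#   client = Mongo::Client.new(['127.0.0.1:27017'], database: 'test')
def create_contacts_collection(client)
  # The :collation option given here becomes the collection's default collation.
  contacts = client[:contacts, { collation: { locale: 'fr_CA' } }]
  contacts.create
  contacts
end
```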
Assign a Collation to an Index¶
To specify a collation for an index, use the collation option when you create the index.
The following example creates an index on the name field of the address_book collection, with the unique parameter enabled and a default collation with locale set to en_US.
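The index creation described above might look like the following sketch, assuming address_book is a Mongo::Collection. The wrapper method name is hypothetical.

```ruby
# Assumes `address_book` is a Mongo::Collection, e.g. client[:address_book].
def create_name_index(address_book)
  address_book.indexes.create_one(
    { name: 1 },                    # ascending index on the name field
    unique: true,
    collation: { locale: 'en_US' }  # collation stored with the index
  )
end
```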
To use this index, make sure your queries also specify the same collation. The following query uses the above index:
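As a sketch, a query that matches the index collation could be written as follows. The search value 'Adam' and the method name are placeholders chosen for this illustration.

```ruby
# Assumes `address_book` is a Mongo::Collection with the en_US index on name.
def find_by_name_using_index(address_book, name = 'Adam')
  # Same locale as the index, and no other collation parameters changed.
  address_book.find({ name: name }, collation: { locale: 'en_US' })
end
```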
The following queries do NOT use the index. The first query uses no
collation, and the second uses a collation with a different strength
value than the collation on the index.
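The two non-matching cases can be sketched like this. The method names and the search value 'Adam' are hypothetical, and the strength value 2 is just one example of a strength that differs from the index's collation.

```ruby
# Assumes `address_book` is a Mongo::Collection with the en_US index on name.

# No collation at all, so the en_US index is not used.
def find_by_name_without_collation(address_book, name = 'Adam')
  address_book.find({ name: name })
end

# Same locale but an explicit strength, so the collation no longer
# matches the one stored with the index.
def find_by_name_with_other_strength(address_book, name = 'Adam')
  address_book.find({ name: name }, collation: { locale: 'en_US', strength: 2 })
end
```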
Operations that Support Collation¶
All reading, updating, and deleting methods support collation. Some examples are listed below.
find() and sort()¶
Individual queries can specify a collation to use when matching and sorting results. The following query and sort operation use a German collation with the locale parameter set to de.
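A sketch of such a query, assuming a hypothetical restaurants collection with city and name fields. Those names are placeholders for this illustration, not details given on this page.

```ruby
# Assumes `restaurants` is a Mongo::Collection; the collection and its
# city/name fields are placeholders for this sketch.
def restaurants_sorted_by_name(restaurants, city = 'New York')
  restaurants
    .find({ city: city }, collation: { locale: 'de' }) # German rules for matching
    .sort(name: 1)                                     # and for the ascending sort
end
```

The collation given to find applies to the whole operation, so the sort on name follows German ordering as well.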
find_one_and_update()¶
A collection called names contains the following documents:
The following find_one_and_update operation on the collection does not specify a collation.
Because Gunter is lexically first in the collection, the above operation returns no results and updates no documents.
Consider the same find_one_and_update operation but with the collation specified. The locale is set to de@collation=phonebook.
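The two variants can be sketched side by side. This assumes the filter looks for first_name values that sort before Gunter and that the update sets a hypothetical verified field; the constant and method names are likewise placeholders.

```ruby
# Assumed filter and update for this sketch (not taken from the page).
GUNTER_FILTER = { first_name: { '$lt' => 'Gunter' } }.freeze
VERIFY_UPDATE = { '$set' => { verified: true } }.freeze

# Without a collation the server uses binary comparison.
def verify_without_collation(names)
  names.find_one_and_update(GUNTER_FILTER, VERIFY_UPDATE, return_document: :after)
end

# With the German phonebook collation, names with umlauts can sort before Gunter.
def verify_with_phonebook_collation(names)
  names.find_one_and_update(
    GUNTER_FILTER,
    VERIFY_UPDATE,
    collation: { locale: 'de@collation=phonebook' },
    return_document: :after # return the document as it looks after the update
  )
end
```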
Note
Some locales have a collation=phonebook option available for use with languages that sort proper nouns differently from other words. According to the de@collation=phonebook collation, characters with umlauts come before the same characters without umlauts.
The operation returns the following updated document:
find_one_and_delete()¶
Set the numericOrdering collation parameter to true to compare numeric strings by their numeric values.
The collection numbers contains the following documents:
The following example matches the first document in which field a has a numeric value greater than 100 and deletes it.
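A sketch of that operation, assuming numbers is a Mongo::Collection whose a field holds numeric strings. The method name and the en locale are choices made for this illustration.

```ruby
# Assumes `numbers` is a Mongo::Collection, e.g. client[:numbers].
def delete_first_above_100(numbers)
  numbers.find_one_and_delete(
    { a: { '$gt' => '100' } },                           # compared as the number 100
    collation: { locale: 'en', numericOrdering: true }
  )
end
```

The call returns the deleted document, or nil when no document matches.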
After the above operation, the following documents remain in the collection:
If you perform the same operation without collation, the server deletes the first document it finds in which the lexical value of a is greater than "100".
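As a plain-Ruby aside, the difference between lexical and numeric comparison can be checked without a server. The sketch compares the strings "16" and "100" both ways.

```ruby
# Lexical comparison goes character by character: "1" == "1", then "6" > "0".
lexical = '16' > '100'

# Numeric comparison looks at the values 16 and 100.
numeric = Integer('16') > Integer('100')

puts "lexical: #{lexical}, numeric: #{numeric}"
# prints: lexical: true, numeric: false
```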
After the operation without collation runs, the document in which a was equal to "16" has been deleted, and the following documents remain in the collection:
delete_many()¶
You can use collations with all of the bulk operations in the Ruby driver.
The collection recipes contains the following documents:
Setting the strength parameter of the collation document to 1 or 2 causes the server to disregard case in the query filter. The following example uses a case-insensitive query filter to delete all records in which the cuisine field matches French.
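A sketch of the case-insensitive delete, assuming recipes is a Mongo::Collection. The en_US locale and the method name are assumptions; strength 1 is one of the two values named above.

```ruby
# Assumes `recipes` is a Mongo::Collection, e.g. client[:recipes].
def delete_french_recipes(recipes)
  result = recipes.delete_many(
    { cuisine: 'French' },
    collation: { locale: 'en_US', strength: 1 } # strength 1 ignores case
  )
  result.deleted_count # number of documents the server removed
end
```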
After the above operation runs, the documents with _id values of 2 and 4 are deleted from the collection.
Aggregation¶
To use collation with an aggregation operation, specify a collation in the aggregation options.
The following aggregation example uses a collection called names. It groups the documents by the first_name field, counts the number of documents in each group, and sorts the results by German phonebook order.
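The pipeline described above could be sketched as follows, assuming names is a Mongo::Collection. The output field name_count and the method name are hypothetical.

```ruby
# Assumes `names` is a Mongo::Collection, e.g. client[:names].
def count_first_names(names)
  pipeline = [
    # One group per distinct first_name, with a count of its documents.
    { '$group' => { _id: '$first_name', name_count: { '$sum' => 1 } } },
    # Sort the groups by name; the collation decides the ordering rules.
    { '$sort' => { _id: 1 } }
  ]
  names.aggregate(pipeline, collation: { locale: 'de@collation=phonebook' }).to_a
end
```

The collation is passed once in the aggregation options and applies to the whole pipeline, so the sort stage follows German phonebook order.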