
Collations

Collations are available in MongoDB 3.4 and later.

Overview

This guide shows you how to use collations, which are sets of sorting rules, to run operations using string ordering for specific languages and locales. A locale is a community or region that shares common language idioms.

MongoDB sorts strings using binary collation by default. This collation method compares and orders strings by the binary values of their encoded characters, which match the ASCII standard for unaccented Latin characters. Languages and locales have specific character ordering conventions that differ from this binary order.

For example, in Canadian French, the right-most accented character determines the ordering for strings when the other characters are the same. Consider the following French words: cote, coté, côte, and côté.

MongoDB sorts them in the following order using the default binary collation:

cote
coté
côte
côté

MongoDB sorts them in the following order using the Canadian French collation:

cote
côte
coté
côté
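The two orderings above can be approximated locally, without a server. The following is a minimal sketch rather than the server's implementation: it assumes Node.js, uses Buffer.compare on UTF-8 bytes to mimic binary comparison, and uses the built-in Intl.Collator with the "fr-CA" locale as a stand-in for the Canadian French collation. The Intl.Collator result depends on the ICU data bundled with your Node.js build, so no expected output is shown for it.

```javascript
const words = ["côté", "coté", "cote", "côte"];

// Binary comparison: compare the UTF-8 encoded bytes of each string.
const binaryOrder = [...words].sort((a, b) =>
  Buffer.compare(Buffer.from(a, "utf8"), Buffer.from(b, "utf8"))
);
console.log(binaryOrder); // [ 'cote', 'coté', 'côte', 'côté' ]

// Locale-aware comparison. Assumption: Intl.Collator stands in for the
// server-side "fr_CA" collation; its output depends on the available ICU data.
const frCaCollator = new Intl.Collator("fr-CA");
const localeOrder = [...words].sort(frCaCollator.compare);
console.log(localeOrder);
```

Running both sorts side by side is a quick way to see whether a given locale changes the order of your own sample data before you commit to a collation.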

Usage

You can specify a collation when you create a new collection or new index. You can also specify a collation for CRUD operations and aggregations.

When you create a new collection with a collation, you define the default collation for any of the operations that support collation called on that collection. You can override the collation for an operation by specifying a different one.

note

Currently, you cannot create a collation on an existing collection. To use collations with an existing collection, create an index with the collation and specify the same collation in your operations on it.

When you create an index with a collation, you specify the sort order for operations that use that index. To use the collation in the index, you must provide a matching collation in the operation, and the operation must use the index. While most index types support collation, the following types support only binary comparison: text, 2d, and geoHaystack indexes.

Collation Parameters

The collation object contains the following parameters:

collation: {
  locale: <string>,
  caseLevel: <bool>,
  caseFirst: <string>,
  strength: <int>,
  numericOrdering: <bool>,
  alternate: <string>,
  maxVariable: <string>,
  backwards: <bool>
}

You must specify the locale field in the collation; all other fields are optional. For a complete list of supported locales and the default values for the locale fields, see Supported Languages and Locales. For descriptions of each field, see the Collation Document MongoDB manual entry.
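Because only locale is required, it can help to build collation documents through one small function that catches a missing locale or a misspelled field name before the operation reaches the server. The sketch below is illustrative: buildCollation and OPTIONAL_FIELDS are hypothetical names, not part of the driver, and the field list covers only the fields shown in this guide.

```javascript
// Optional field names taken from the collation document shown above.
// Extend this set if you use fields that this guide does not list.
const OPTIONAL_FIELDS = new Set([
  "caseLevel",
  "caseFirst",
  "strength",
  "numericOrdering",
  "alternate",
  "maxVariable",
  "backwards",
]);

// Hypothetical helper: returns a collation document or throws on obvious mistakes.
function buildCollation(locale, options = {}) {
  if (typeof locale !== "string" || locale.length === 0) {
    throw new TypeError("collation requires a non-empty locale string");
  }
  for (const key of Object.keys(options)) {
    if (!OPTIONAL_FIELDS.has(key)) {
      throw new RangeError(`unknown collation field: ${key}`);
    }
  }
  return { locale, ...options };
}

const collation = buildCollation("fr_CA", { strength: 2, backwards: true });
console.log(collation); // { locale: 'fr_CA', strength: 2, backwards: true }
```

The helper checks only field names, not values; the server remains the authority on which locales and values are valid.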

Collation Examples

Set a Default Collation on a Collection

In the following example, we create a new collection called souvenirs and assign a default collation with the "fr_CA" locale. The collation applies to all operations that support collation performed on that collection.

// Create the collection with a collation
db.createCollection("souvenirs", {
  collation: { locale: "fr_CA" },
});

Any of the operations that support collations automatically apply the collation defined on the collection. The query below searches the souvenirs collection and applies the "fr_CA" locale collation:

collection.find({type: "photograph"});

You can specify a different collation as a parameter in an operation that supports collations. The following query specifies the "is" Icelandic locale and sets the optional caseFirst parameter to "upper":

collection.find({type: "photograph"},
  { collation: { locale: "is", caseFirst: "upper" } }
);

Assign a Collation to an Index

In the following example, we create a new index on the title field of a collection with a collation set to the "en_US" locale.

collection.createIndex(
  { 'title' : 1 },
  { 'collation' : { 'locale' : 'en_US' } });

The following query uses the index we created:

collection.find({"year": 1980}, {"collation" : {"locale" : "en_US" }})
  .sort({"title": -1});

The following queries do not use the index that we created. The first query does not include a collation and the second contains a different strength value than the collation on the index.

// no collation specified
collection.find({"year": 1980})
  .sort({"title": -1});

// collation differs from the one on the index
collection.find({"year": 1980}, {"collation" : {"locale" : "en_US", "strength": 2 }})
  .sort({"title": -1});
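One way to check whether a query uses the index is to inspect its explain output. The sketch below is an assumption-laden illustration, not an official recipe: sortUsesTitleIndex and planUsesIndex are hypothetical helpers, the index is assumed to have the default name "title_1", and the exact shape of the explain output varies between server versions, which is why the search walks the whole winning plan.

```javascript
// Recursively look for an IXSCAN stage on the named index anywhere in a plan.
function planUsesIndex(node, indexName) {
  if (node === null || typeof node !== "object") return false;
  if (node.stage === "IXSCAN" && node.indexName === indexName) return true;
  return Object.values(node).some((child) => planUsesIndex(child, indexName));
}

// Hypothetical helper: `collection` is a driver Collection object and
// `findOptions` is the options document passed to find().
async function sortUsesTitleIndex(collection, findOptions) {
  const explanation = await collection
    .find({ year: 1980 }, findOptions)
    .sort({ title: -1 })
    .explain("queryPlanner");
  // Inspect only the winning plan; rejected plans may mention the index too.
  const winningPlan = explanation?.queryPlanner?.winningPlan;
  return planUsesIndex(winningPlan, "title_1");
}

// Usage sketch (requires a connected collection):
// await sortUsesTitleIndex(collection, { collation: { locale: "en_US" } });
// await sortUsesTitleIndex(collection, {});
```

Comparing the result for a matching collation with the result for no collation gives a direct check of the behavior described above against your own deployment.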

Collation Query Examples

Operations that read, update, and delete documents from a collection can use collations. This section includes examples of a selection of these. See the MongoDB manual for a full list of operations that support collation.

find() and sort() Example

The following example calls both find() and sort() on a collection that uses the default binary collation. We use the German collation by setting the value of the locale parameter to de.

collection.find({ city: "New York" }, { collation: { locale: "de" } })
  .sort({ name: 1 });

findOneAndUpdate() Example

The following example calls the findOneAndUpdate() operation on a collection that uses the default binary collation. The collection contains the following documents:

{ "_id" : 1, "first_name" : "Hans" }
{ "_id" : 2, "first_name" : "Gunter" }
{ "_id" : 3, "first_name" : "Günter" }
{ "_id" : 4, "first_name" : "Jürgen" }

Consider the following findOneAndUpdate() operation on this collection, which does not specify a collation:

collection.findOneAndUpdate(
  { first_name : { $lt: "Gunter" } },
  { $set: { verified: true } }
);

Since "Gunter" is the first sorted result when using a binary collation, no document comes lexically before it, so none matches the $lt comparison operator in the query document. As a result, the operation does not update any documents.

Consider the same operation with a collation whose locale is set to de@collation=phonebook. This locale specifies the collation=phonebook option, which contains rules for prioritizing proper nouns, identified by capitalization of the first letter. With this locale and option, characters with umlauts sort before the same characters without umlauts.

collection.findOneAndUpdate(
  { first_name: { $lt: "Gunter" } },
  { $set: { verified: true } },
  { collation: { locale: "de@collation=phonebook" } },
);

Since "Günter" lexically comes before "Gunter" using the de@collation=phonebook collation specified in findOneAndUpdate(), the operation returns the following updated document:

{ lastErrorObject: { updatedExisting: true, n: 1 },
  value: { _id: 3, first_name: 'Günter' },
  ok: 1 }

findOneAndDelete() Example

The following example calls the findOneAndDelete() operation on a collection that uses the default binary collation and contains the following documents:

{ "_id" : 1, "a" : "16" }
{ "_id" : 2, "a" : "84" }
{ "_id" : 3, "a" : "179" }

In this example, we set the numericOrdering collation parameter to true to sort numeric strings based on their numerical order instead of their lexical order.

collection.findOneAndDelete(
  { a: { $gt: "100" } },
  { collation: { locale: "en", numericOrdering: true } },
);

After you run the operation above, the collection contains the following documents:

{ "_id" : 1, "a" : "16" }
{ "_id" : 2, "a" : "84" }

If you perform the same operation without collation on the original collection of three documents, it matches documents based on the lexical value of the strings ("16", "84", and "179"), and deletes the first document it finds that matches the query criteria.

await collection.findOneAndDelete({ a: { $gt: "100" } });

Since all the documents contain lexical values in the a field that match the criteria (greater than the lexical value of "100"), the operation removes the first result. After you run the operation above, the collection contains the following documents:

{ "_id" : 2, "a" : "84" }
{ "_id" : 3, "a" : "179" }
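Both outcomes above follow from how each comparison treats the strings, and you can reproduce the difference locally. This is a minimal sketch under stated assumptions, not the server's implementation: Buffer.compare on UTF-8 bytes mimics the default binary collation, and Intl.Collator with numeric: true stands in for a collation with numericOrdering set to true.

```javascript
const values = ["16", "84", "179"];

// Binary comparison on UTF-8 bytes, mimicking the default collation.
const binaryCompare = (a, b) =>
  Buffer.compare(Buffer.from(a, "utf8"), Buffer.from(b, "utf8"));

// Assumption: Intl.Collator with numeric: true stands in for
// { locale: "en", numericOrdering: true }.
const numericCompare = new Intl.Collator("en", { numeric: true }).compare;

// Return the values that a { $gt: "100" } filter would match under `compare`.
const greaterThan100 = (compare) =>
  values.filter((value) => compare(value, "100") > 0);

console.log(greaterThan100(binaryCompare)); // [ '16', '84', '179' ]
console.log(greaterThan100(numericCompare)); // [ '179' ]
```

Under binary comparison every value matches because each string has a byte greater than the corresponding byte of "100" at the first position where they differ, while numeric comparison matches only "179".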

Aggregation Example

To use collation with the aggregate operation, pass the collation document in the options field, after the array of pipeline stages.

The following example shows an aggregation pipeline on a collection that uses the default binary collation. The aggregation groups documents by the first_name field, counts the number of documents in each group, and sorts the results in German phonebook order (the de@collation=phonebook locale).

note

You can specify only one collation on an aggregation.

collection.aggregate(
  [
    { $group: { "_id": "$first_name", "nameCount": { "$sum": 1 } } },
    { $sort: { "_id": 1 } },
  ],
  { collation: { locale: "de@collation=phonebook" } },
);