Collation

New in version 1.1.

Overview

MongoDB 3.4 introduced support for collations, which provide a set of rules to comply with the conventions of a particular language when comparing strings.

For example, in Canadian French, the last accent in a given word determines the sorting order. Consider the following French words:

cote < coté < côte < côté

The sort order using the Canadian French collation would result in the following:

cote < côte < coté < côté

If collation is unspecified, MongoDB uses simple binary comparison for strings. As such, the sort order of the words would be:

cote < coté < côte < côté
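
The binary ordering above can be reproduced locally without a server. The following is a minimal sketch in plain PHP, assuming the source file is saved as UTF-8; it only imitates byte-wise comparison and is not a description of how the server is implemented. The variable name $words is chosen for illustration.

```php
<?php
// Words from the overview, deliberately shuffled.
$words = ['côté', 'coté', 'cote', 'côte'];

// SORT_STRING compares the raw bytes of each string and applies no
// language-specific rules. In UTF-8, accented letters start with a byte
// that is larger than any unaccented ASCII letter.
sort($words, SORT_STRING);

echo implode(' < ', $words), PHP_EOL; // prints "cote < coté < côte < côté"
```

Every accented letter sorts after every unaccented one in this scheme, which is why coté lands before côte regardless of French conventions.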

Usage

You can specify a default collation for collections and indexes when they are created, or specify a collation for CRUD operations and aggregations. For operations that support collation, MongoDB uses the collection's default collation unless the operation specifies a different collation.

Collation Parameters

'collation' => [
    'locale' => <string>,
    'caseLevel' => <boolean>,
    'caseFirst' => <string>,
    'strength' => <integer>,
    'numericOrdering' => <boolean>,
    'alternate' => <string>,
    'maxVariable' => <string>,
    'normalization' => <boolean>,
    'backwards' => <boolean>,
]

The only required parameter is locale, which the server parses as an ICU format locale ID. For example, set locale to en_US to represent US English or fr_CA to represent Canadian French.

For a complete description of the available parameters, see Collation Document in the MongoDB manual.
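
As an illustration of how the template above maps to a concrete value, here is a hedged sketch of a collation array that sets a few of the optional parameters. The specific values and the variable names $collation and $options are assumptions chosen for the example, not recommendations.

```php
<?php
// Illustrative values only; choose settings that match your own data.
$collation = [
    'locale' => 'fr_CA',       // required: ICU locale ID
    'strength' => 2,           // compare base characters and diacritics, ignore case
    'numericOrdering' => true, // compare digit sequences by numeric value
];

// The array is passed under the 'collation' key of an options array,
// in the same way as the examples elsewhere on this page.
$options = ['collation' => $collation];
```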

Assign a Default Collation to a Collection

The following example creates a new collection called contacts on the test database and assigns a default collation with the fr_CA locale. Specifying a collation when you create the collection ensures that all operations involving a query that are run against the contacts collection use the fr_CA collation, unless the query specifies another collation. Any indexes on the new collection also inherit the default collation, unless the creation command specifies another collation.

<?php
$database = (new MongoDB\Client)->test;
$database->createCollection('contacts', [
    'collation' => ['locale' => 'fr_CA'],
]);

Assign a Collation to an Index

To specify a collation for an index, use the collation option when you create the index.

The following example creates an index on the first_name field of the address_book collection, with the unique parameter enabled and a default collation with locale set to en_US.

<?php
$collection = (new MongoDB\Client)->test->address_book;
$collection->createIndex(
    ['first_name' => 1],
    [
        'collation' => ['locale' => 'en_US'],
        'unique' => true,
    ]
);

To use this index, make sure your queries also specify the same collation. The following query uses the above index:

<?php
$collection = (new MongoDB\Client)->test->address_book;
$cursor = $collection->find(
    ['first_name' => 'Adam'],
    [
        'collation' => ['locale' => 'en_US'],
    ]
);

The following queries do NOT use the index. The first query uses no collation, and the second uses a collation with a different strength value than the collation on the index.

<?php
$collection = (new MongoDB\Client)->test->address_book;
$cursor1 = $collection->find(['first_name' => 'Adam']);
$cursor2 = $collection->find(
    ['first_name' => 'Adam'],
    [
        'collation' => [
            'locale' => 'en_US',
            'strength' => 2,
        ],
    ]
);

Operations that Support Collation

All reading, updating, and deleting methods support collation. Some examples are listed below.

`find()` with `sort`

Individual queries can specify a collation to use when matching and sorting results. The following query and sort operation uses a German collation with the locale parameter set to de.

<?php
$collection = (new MongoDB\Client)->test->contacts;
$cursor = $collection->find(
    ['city' => 'New York'],
    [
        'collation' => ['locale' => 'de'],
        'sort' => ['name' => 1],
    ]
);

`findOneAndUpdate()`

A collection called names contains the following documents:

{ "_id" : 1, "first_name" : "Hans" }
{ "_id" : 2, "first_name" : "Gunter" }
{ "_id" : 3, "first_name" : "Günter" }
{ "_id" : 4, "first_name" : "Jürgen" }

The following findOneAndUpdate() operation on the collection does not specify a collation.

<?php
$collection = (new MongoDB\Client)->test->names;
$document = $collection->findOneAndUpdate(
    ['first_name' => ['$lt' => 'Gunter']],
    ['$set' => ['verified' => true]]
);

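Without a collation, the server compares strings byte by byte, and no first_name value in the collection sorts before Gunter. The above operation therefore matches nothing, returns no result, and updates no documents.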
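
To see why the filter matches nothing, the comparison can be imitated in plain PHP. This is a rough sketch that assumes UTF-8 source encoding and uses strcmp() as a stand-in for the server's binary comparison; the variable names are hypothetical.

```php
<?php
$names = ['Hans', 'Gunter', 'Günter', 'Jürgen'];

// Keep only the names that sort strictly before 'Gunter' byte by byte.
// 'Günter' does not qualify: the first byte of 'ü' in UTF-8 is larger
// than the single byte of 'u'.
$before = array_filter($names, function (string $name): bool {
    return strcmp($name, 'Gunter') < 0;
});

echo count($before), PHP_EOL; // prints 0
```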

Consider the same findOneAndUpdate() operation but with a collation specified, which uses the locale de@collation=phonebook.

Note

Some locales have a collation=phonebook option available for use with languages which sort proper nouns differently from other words. According to the de@collation=phonebook collation, characters with umlauts come before the same characters without umlauts.

<?php
$collection = (new MongoDB\Client)->test->names;
$document = $collection->findOneAndUpdate(
    ['first_name' => ['$lt' => 'Gunter']],
    ['$set' => ['verified' => true]],
    [
        'collation' => ['locale' => 'de@collation=phonebook'],
        'returnDocument' => MongoDB\Operation\FindOneAndUpdate::RETURN_DOCUMENT_AFTER,
    ]
);

The operation returns the following updated document:

{ "_id" : 3, "first_name" : "Günter", "verified" : true }

`findOneAndDelete()`

Set the numericOrdering collation parameter to true to compare numeric strings by their numeric values.

The collection numbers contains the following documents:

{ "_id" : 1, "a" : "16" }
{ "_id" : 2, "a" : "84" }
{ "_id" : 3, "a" : "179" }

The following example matches the first document in which field a has a numeric value greater than 100 and deletes it.

<?php
$collection = (new MongoDB\Client)->test->numbers;
$document = $collection->findOneAndDelete(
    ['a' => ['$gt' => '100']],
    [
        'collation' => [
            'locale' => 'en',
            'numericOrdering' => true,
        ],
    ]
);

After the above operation, the following documents remain in the collection:

{ "_id" : 1, "a" : "16" }
{ "_id" : 2, "a" : "84" }

If you perform the same operation without collation, the server deletes the first document it finds in which the lexical value of a is greater than "100".

<?php
$collection = (new MongoDB\Client)->test->numbers;
$document = $collection->findOneAndDelete(['a' => ['$gt' => '100']]);

After the above operation is executed, the document in which a was equal to "16" has been deleted, and the following documents remain in the collection:

{ "_id" : 2, "a" : "84" }
{ "_id" : 3, "a" : "179" }
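
The difference between the two deletions comes down to lexical versus numeric comparison of the same strings. The sketch below imitates both in plain PHP, without a server; strcmp() and the integer cast are stand-ins for the server's behavior, and the variable names are assumptions. Note that PHP's own > operator compares numeric strings by value, so strcmp() is used to force a byte-wise comparison.

```php
<?php
$values = ['16', '84', '179'];

// Lexical: "16" vs "100" ties on "1", then "6" sorts after "0",
// so "16" counts as greater than "100".
$lexical = array_values(array_filter($values, function (string $v): bool {
    return strcmp($v, '100') > 0;
}));

// Numeric: compare the values the strings represent.
$numeric = array_values(array_filter($values, function (string $v): bool {
    return (int) $v > 100;
}));

echo implode(', ', $lexical), PHP_EOL; // prints "16, 84, 179"
echo implode(', ', $numeric), PHP_EOL; // prints "179"
```

All three strings are lexically greater than "100", so without numericOrdering the first document the server finds is a match.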

`deleteMany()`

All CRUD operations in the MongoDB PHP Library accept a collation option.

The collection recipes contains the following documents:

{ "_id" : 1, "dish" : "veggie empanadas", "cuisine" : "Spanish" }
{ "_id" : 2, "dish" : "beef bourguignon", "cuisine" : "French" }
{ "_id" : 3, "dish" : "chicken molé", "cuisine" : "Mexican" }
{ "_id" : 4, "dish" : "chicken paillard", "cuisine" : "french" }
{ "_id" : 5, "dish" : "pozole verde", "cuisine" : "Mexican" }

Setting the strength parameter of the collation document to 1 or 2 causes the server to disregard case in the query filter. The following example uses a case-insensitive query filter to delete all documents in which the cuisine field matches French.

<?php
$collection = (new MongoDB\Client)->test->recipes;
$collection->deleteMany(
    ['cuisine' => 'French'],
    [
        'collation' => [
            'locale' => 'en_US',
            'strength' => 1,
        ],
    ]
);

After the above operation runs, the documents with _id values of 2 and 4 are deleted from the collection.

Aggregation

To use collation with an aggregate() operation, specify a collation in the aggregation options.

The following aggregation example uses a collection called names. It groups the documents by the first_name field, counts the number of documents in each group, and sorts the results in German phonebook order.

<?php
$collection = (new MongoDB\Client)->test->names;
$cursor = $collection->aggregate(
    [
        ['$group' => ['_id' => '$first_name', 'name_count' => ['$sum' => 1]]],
        ['$sort' => ['_id' => 1]],
    ],
    [
        'collation' => ['locale' => 'de@collation=phonebook'],
    ]
);
