Client-Side Field Level Encryption Guide

Who Is This Guide For?

This use case guide introduces automatic Client-Side Field Level Encryption (CSFLE) with supported MongoDB drivers and is intended for full-stack developers. The guide presents its information in the context of a real-world scenario.

Download the Code

For a runnable example of all the functionality demonstrated in this guide, see the Download Example Project section.

Introduction

Applications frequently use and store sensitive data such as confidential personal details, payment information, or proprietary data. In some jurisdictions, this type of data is subject to governance, privacy, and security compliance mandates. Unauthorized access of sensitive data or a failure to comply with a mandate often results in significant reputation damage and financial penalties. Therefore, it is important to keep sensitive data secure.

MongoDB offers several methods that protect your data from unauthorized access, including Role-Based Access Control, Transport Encryption using TLS/SSL, and Encryption at Rest.

Another MongoDB feature that prevents unauthorized access of data is Client-Side Field Level Encryption (CSFLE). This feature allows a developer to selectively encrypt individual fields of a document on the client-side before it is sent to the server. This keeps the encrypted data private from the providers hosting the database as well as any user that has direct access to the database.

This guide provides steps for setup and implementation of CSFLE with a practical example.

note

Automatic Client-Side Field Level Encryption is available only in MongoDB Enterprise, version 4.2 and later.

Scenario

In this scenario, we secure sensitive data on a Medical Care Management System that stores patients' personal information, insurance information, and medical records for a fictional company, MedcoMD. None of the patient data is public, and certain data such as their social security number (SSN, a US government-issued ID number), insurance policy number, and vital sign measurements are particularly sensitive and subject to privacy compliance. It is important for the company and the patient that the data is kept private and secure.

MedcoMD needs this system to satisfy the following use cases:

  • Doctors use the system to access Patients' medical records and insurance information, and to add new vital sign measurements.
  • Receptionists use the system to verify a Patient's identity, using a combination of their contact information and the last four digits of their Social Security Number (SSN).
  • Receptionists can view a Patient's insurance policy provider, but not their policy number.
  • Receptionists cannot access a Patient's medical records.

MedcoMD is also concerned with disclosure of sensitive data through any of the following methods:

  • Accidental disclosure of data on the Receptionist's publicly-viewable screen.
  • Direct access to the database by a superuser such as a database administrator.
  • Capture of data over an insecure network.
  • Access to the data by reading a server's memory.
  • Access to the on-disk data by reading database or backup files.

What can MedcoMD do to balance the functionality and access restrictions of their Medical Care Management System?

Comparison of Security Features

The MedcoMD engineers review the Medical Care Management System specification and research the proper solution for limiting access to sensitive data.

The first MongoDB security feature they evaluated was Role-Based Access Control, which allows administrators to grant and restrict collection-level permissions for users. With the appropriate role definition and assignment, this solution prevents accidental disclosure of and access to data. However, it does not prevent capture of the data over an insecure network, direct access of data by a superuser, access to data by reading the server's memory, or access to on-disk data by reading the database or backup files.

The next MongoDB security features they evaluated were Encryption at Rest, which encrypts the database files on disk, and Transport Encryption using TLS/SSL, which encrypts data over the network. When applied together, these two features prevent access to on-disk database files and capture of the data on the network, respectively. When combined with Role-Based Access Control, these three security features offer near-comprehensive security coverage of the sensitive data, but lack a mechanism to prevent the data from being read from the server's memory.

Finally, the MedcoMD engineers discovered a feature that independently satisfies all the security criteria. Client-Side Field Level Encryption allows the engineers to specify the fields of a document that should be kept encrypted. Sensitive data is transparently encrypted and decrypted by the client and only communicated to and from the server in encrypted form. This mechanism keeps the specified data fields secure in encrypted form on both the server and the network. While all clients have access to the non-sensitive data fields, only appropriately-configured CSFLE clients are able to read and write the sensitive data fields.

The following diagram lists the MongoDB security features and the potential security vulnerabilities that they address:

Diagram that describes MongoDB security features and the potential vulnerabilities that they address

MedcoMD will provide Receptionists with a client that is not configured to access data encrypted with CSFLE. This will prevent them from viewing the sensitive fields and accidentally leaving them displayed on-screen in a public area. MedcoMD will provide Doctors with a client with CSFLE enabled which will allow them to access the sensitive data fields in the privacy of their own office.

Equipped with CSFLE, MedcoMD can keep their sensitive data secure and compliant with data privacy regulations with MongoDB.

Implementation

This section explains the following configuration and implementation details of CSFLE:

  • Software required to run your client and server in your local development environment.
  • Creation and validation of the encryption keys.
  • Configuration of the client for automatic field-level encryption.
  • Queries, reads, and writes of encrypted fields.

Requirements

MongoDB Server 4.2 Enterprise
MongoDB Driver Compatible with CSFLE
File System Permissions
  • The client application or a privileged user needs permissions to start the mongocryptd process on the host.
Additional Dependencies
  • Specific language drivers require additional dependencies to use CSFLE or to run the examples in this guide. The dependencies for each driver are listed below.

Java
Dependency Name | Description
JDK 8 or later | While the current driver is compatible with older versions of the JDK, the CSFLE feature is only compatible with JDK 8 and later.
libmongocrypt | The library contains bindings to communicate with the native library that manages the encryption.

Node.js
Dependency Name | Description
mongodb-client-encryption | NodeJS wrapper for the libmongocrypt encryption library.
uuid-base64 | Convert between Base64 and hexadecimal UUIDs.

Python
Dependency Name | Description
pymongocrypt | Python wrapper for the libmongocrypt encryption library.

A. Create a Master Key

MongoDB Client-Side Field Level Encryption (CSFLE) uses an encryption strategy called envelope encryption in which keys used to encrypt/decrypt data (called data encryption keys) are encrypted with another key (called the master key). For more information on the features of envelope encryption and key management concepts, see AWS Key Management Service Concepts.

In this step, we create the master key, which the MongoDB driver uses to encrypt data encryption keys, and store it in the Local Key Provider, which is the filesystem in our local development environment. We refer to this key as the "locally-managed master key" in this guide.

The following diagram shows how the master key is created and stored:

Diagram that describes creating the master key when using a local provider

The data encryption keys, generated and used by the MongoDB driver to encrypt and decrypt document fields, are stored in a key vault collection in the same MongoDB replica set as the encrypted data.

Local Key Provider is not suitable for production

The Local Key Provider is an insecure method of storage and is therefore not recommended if you plan to use CSFLE in production. Instead, you should configure a master key in a Key Management System (KMS) which stores and decrypts your data encryption keys remotely.

To learn how to use a KMS in your CSFLE implementation, read the /use-cases/client-side-field-level-encryption-local-key-to-kms/ guide.

To begin development, MedcoMD engineers generate a master key and save it to a file with the fully runnable code below:

The following Java program generates a 96-byte locally-managed master key and saves it to a file called master-key.txt in the directory from which the program is executed.

import java.io.FileOutputStream;
import java.io.IOException;
import java.security.SecureRandom;

public class CreateMasterKeyFile {
    public static void main(String[] args) throws IOException {
        // Generate 96 random bytes from a cryptographically secure source.
        byte[] localMasterKey = new byte[96];
        new SecureRandom().nextBytes(localMasterKey);

        // Write the raw key bytes to master-key.txt in the working directory.
        try (FileOutputStream stream = new FileOutputStream("master-key.txt")) {
            stream.write(localMasterKey);
        }
    }
}

The following Node.js script generates a 96-byte locally-managed master key and saves it to a file called master-key.txt in the directory from which the script is executed.

const fs = require('fs');
const crypto = require('crypto');

try {
  // Write 96 random bytes to master-key.txt in the working directory.
  fs.writeFileSync('master-key.txt', crypto.randomBytes(96));
} catch (err) {
  console.error(err);
}

The following Python script generates a 96-byte locally-managed master key and saves it to a file called master-key.txt in the directory from which the script is executed.

import os

path = "master-key.txt"
file_bytes = os.urandom(96)
with open(path, "wb") as f:
    f.write(file_bytes)

B. Create a Data Encryption Key

In this section, we generate a data encryption key. The MongoDB driver stores the key in a key vault collection where CSFLE-enabled clients can access the key for automatic encryption and decryption.

The following diagram shows how the data encryption keys are created and stored:

Diagram that describes creating the data encryption key when using a locally-managed master key

The client requires the following configuration values to generate a new data encryption key:

  • The locally-managed master key.
  • A MongoDB connection string that authenticates on a running server.
  • The key vault namespace (database and collection).

Follow the steps below to generate a single data encryption key from the locally-managed master key.

1. Read the Locally-Managed Master Key from a File

First, retrieve the contents of the master key file that you generated in the Create a Master Key section with the following code snippet:

Java:

import java.io.FileInputStream;

String path = "master-key.txt";

byte[] localMasterKey = new byte[96];

try (FileInputStream fis = new FileInputStream(path)) {
    // Read exactly 96 bytes of key material into the buffer.
    fis.readNBytes(localMasterKey, 0, 96);
}

note

The FileInputStream#readNBytes method was introduced in Java 9. The helper method is used in this guide to keep the implementation concise. If you are using JDK 8, you may consider implementing a custom solution to read a file into a byte array.

Node.js:

const fs = require('fs');

const path = './master-key.txt';
const localMasterKey = fs.readFileSync(path);

Python:

path = "./master-key.txt"
with open(path, "rb") as f:
    local_master_key = f.read()
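
Because the key generated in the Create a Master Key section is exactly 96 bytes long, it can help to check the length when reading the file. The following Python sketch wraps the read in a hypothetical helper, load_local_master_key, which is not part of any driver API. The demo writes a throwaway key to a temporary directory so that it does not touch your own master-key.txt:

```python
import os
import tempfile

MASTER_KEY_LENGTH = 96  # size of the locally-managed master key used in this guide


def load_local_master_key(path):
    """Hypothetical helper: read the master key file and check its length."""
    with open(path, "rb") as f:
        key = f.read()
    if len(key) != MASTER_KEY_LENGTH:
        raise ValueError(
            "expected %d bytes in %s, found %d" % (MASTER_KEY_LENGTH, path, len(key))
        )
    return key


# Demo: write a throwaway key to a temporary directory, then read it back.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "master-key.txt")
with open(demo_path, "wb") as f:
    f.write(os.urandom(MASTER_KEY_LENGTH))

local_master_key = load_local_master_key(demo_path)
print(len(local_master_key))
```

Failing early with a clear error is a design choice for this sketch; a truncated or overwritten key file is easier to diagnose here than later, when the client is constructed.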
2. Specify KMS Provider Settings

Next, specify the KMS provider settings. The client uses these settings to discover the master key. Set the provider name to local when using a local master key in the following code snippet:

Java:

The KMS provider settings are stored in a Map in order to use the kmsProviders helper method for the ClientEncryptionSettings Builder.

Map<String, Object> keyMap = new HashMap<String, Object>();
keyMap.put("key", localMasterKey);

Map<String, Map<String, Object>> kmsProviders = new HashMap<String, Map<String, Object>>();
kmsProviders.put("local", keyMap);

Node.js:

const kmsProviders = {
  local: {
    key: localMasterKey,
  },
};

Python:

kms_providers = {
    "local": {
        "key": local_master_key  # local_master_key variable from the previous step
    },
}
3. Create a Data Encryption Key

Construct a client with the MongoDB connection string and key vault namespace configuration, and create a data encryption key with the following code snippet. The key vault in this example uses the encryption database and __keyVault collection.

Java:

String connectionString = "mongodb://localhost:27017";
String keyVaultNamespace = "encryption.__keyVault";

ClientEncryptionSettings clientEncryptionSettings = ClientEncryptionSettings.builder()
    .keyVaultMongoClientSettings(MongoClientSettings.builder()
        .applyConnectionString(new ConnectionString(connectionString))
        .build())
    .keyVaultNamespace(keyVaultNamespace)
    .kmsProviders(kmsProviders)
    .build();

ClientEncryption clientEncryption = ClientEncryptions.create(clientEncryptionSettings);
// "local" is the KMS provider name configured in the previous step.
BsonBinary dataKeyId = clientEncryption.createDataKey("local", new DataKeyOptions());
System.out.println("DataKeyId [UUID]: " + dataKeyId.asUuid());

String base64DataKeyId = Base64.getEncoder().encodeToString(dataKeyId.getData());
System.out.println("DataKeyId [base64]: " + base64DataKeyId);

The createDataKey() method returns a BsonBinary object from which we can extract the UUID and Base64 representations of the key id.

Node.js:

const { MongoClient } = require('mongodb');
// The import form can differ between versions of mongodb-client-encryption;
// older releases may require passing the mongodb module to the imported function.
const { ClientEncryption } = require('mongodb-client-encryption');
const base64 = require('uuid-base64');

const connectionString = 'mongodb://localhost:27017';
const keyVaultNamespace = 'encryption.__keyVault';
const client = new MongoClient(connectionString, {
  useNewUrlParser: true,
  useUnifiedTopology: true,
});

async function main() {
  try {
    await client.connect();
    // kmsProviders is the object defined in the previous step.
    const encryption = new ClientEncryption(client, {
      keyVaultNamespace,
      kmsProviders,
    });
    const key = await encryption.createDataKey('local');
    const base64DataKeyId = key.toString('base64');
    const uuidDataKeyId = base64.decode(base64DataKeyId);
    console.log('DataKeyId [UUID]: ', uuidDataKeyId);
    console.log('DataKeyId [base64]: ', base64DataKeyId);
  } finally {
    await client.close();
  }
}
main();

note

This code includes a dependency on the uuid-base64 npm package. See the npmjs documentation on the uuid-base64 package for installation instructions.

Python:

from pymongo import MongoClient
from pymongo.encryption_options import AutoEncryptionOpts
from pymongo.encryption import ClientEncryption
import base64
from bson.codec_options import CodecOptions
from bson.binary import STANDARD, UUID

connection_string = "mongodb://localhost:27017"
key_vault_namespace = "encryption.__keyVault"

client = MongoClient(connection_string)
client_encryption = ClientEncryption(
    kms_providers,  # pass in the kms_providers variable from the previous step
    key_vault_namespace,
    client,
    CodecOptions(uuid_representation=STANDARD)
)


def create_data_encryption_key():
    data_key_id = client_encryption.create_data_key("local")
    uuid_data_key_id = UUID(bytes=data_key_id)
    base_64_data_key_id = base64.b64encode(data_key_id)
    print("DataKeyId [UUID]: ", str(uuid_data_key_id))
    print("DataKeyId [base64]: ", base_64_data_key_id)
    return data_key_id


data_key_id = create_data_encryption_key()

The _id field of the data encryption key is represented as a UUID and is encoded in Base64 format. Use your Base64-encoded data key id when specified for the remainder of this guide.

The output from the code above should resemble the following:

DataKeyId [UUID]: de4d775a-4499-48bc-bb93-3f81c3c90704
DataKeyId [base64]: 3k13WkSZSLy7kwAAP4HDyQ==
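
The UUID and Base64 forms of a key id are two encodings of the same 16 bytes. As a rough illustration that uses only the Python standard library, the sketch below converts between the two. The helper names are hypothetical, and the UUID is an arbitrary example value rather than a key id from your deployment:

```python
import base64
import uuid


def uuid_to_base64(key_uuid):
    """Encode the 16 raw bytes of a UUID as a Base64 string."""
    return base64.b64encode(key_uuid.bytes).decode("ascii")


def base64_to_uuid(key_b64):
    """Decode a Base64 string back into a UUID."""
    return uuid.UUID(bytes=base64.b64decode(key_b64))


# Arbitrary example value; substitute the UUID printed by your own run.
example_uuid = uuid.UUID("1e83d013-d873-47df-abb1-e57898a72d4c")
example_b64 = uuid_to_base64(example_uuid)

print(example_b64)
print(base64_to_uuid(example_b64))
```

A round trip through both helpers should return the original UUID, which is a quick way to confirm that a copied Base64 key id was not truncated.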

note

Ensure that the client has ReadWrite permissions on the specified key vault namespace.

4. Verify that the Data Encryption Key was Created

Using the key id printed in the prior step, query the key vault collection for the data encryption key that was inserted as a document into your MongoDB replica set, as shown in the following code snippet.

Java:

String connectionString = "mongodb://localhost:27017";
String keyVaultDb = "encryption";
String keyVaultCollection = "__keyVault";
String base64KeyId = "3k13WkSZSLy7kwAAP4HDyQ=="; // use the base64 data key id returned by createDataKey() in the prior step

MongoClient mongoClient = MongoClients.create(connectionString);
MongoCollection<Document> collection = mongoClient.getDatabase(keyVaultDb).getCollection(keyVaultCollection);

// The key id is stored as binary subtype 4 (UUID), so decode the Base64 string first.
Bson query = Filters.eq("_id", new Binary((byte) 4, Base64.getDecoder().decode(base64KeyId)));
Document doc = collection
    .find(query)
    .first();

System.out.println(doc);

This code example should print a retrieved document that resembles the following:

Document{{
  _id=dad3a063-4f9b-48f8-bf4e-7ca9d323fd1c,
  keyMaterial=org.bson.types.Binary@40e1535,
  creationDate=Wed Sep 25 22:22:54 EDT 2019,
  updateDate=Wed Sep 25 22:22:54 EDT 2019,
  status=0,
  masterKey=Document{{provider=local}}
}}

View the Extended JSON Representation of the Data Key

While the Document class is the Document type most commonly used to work with query results, we can use the BsonDocument class to view the data key document as extended JSON. Replace the Document assignment code with the following to retrieve and print a BsonDocument:

BsonDocument doc = collection
    .withDocumentClass(BsonDocument.class)
    .find(query)
    .first();

System.out.println(doc);
Node.js:

const { MongoClient, Binary } = require('mongodb');

const connectionString = 'mongodb://localhost:27017/';
const keyVaultDb = 'encryption';
const keyVaultCollection = '__keyVault';
const base64KeyId = '3k13WkSZSLy7kwAAP4HDyQ=='; // use the base64 data key id returned by createDataKey() in the prior step
const client = new MongoClient(connectionString, {
  useNewUrlParser: true,
  useUnifiedTopology: true,
});

async function main() {
  try {
    await client.connect();
    const keyDB = client.db(keyVaultDb);
    const keyColl = keyDB.collection(keyVaultCollection);
    // The key id is stored as binary subtype 4 (UUID), so a plain string would not match.
    const query = {
      _id: new Binary(Buffer.from(base64KeyId, 'base64'), 4),
    };
    const dataKey = await keyColl.findOne(query);
    console.log(dataKey);
  } finally {
    await client.close();
  }
}
main();

This code example should print a retrieved document that resembles the following:

{
  _id: Binary {
    _bsontype: 'Binary',
    sub_type: 4,
    position: 16,
    buffer: <Buffer 68 ca d2 10 16 5d 45 bf 9d 1d 44 d4 91 a6 92 44>
  },
  keyMaterial: Binary {
    _bsontype: 'Binary',
    sub_type: 0,
    position: 160,
    buffer: <Buffer f1 4a 9f bd aa ac c9 89 e9 b3 da 48 72 8e a8 62 97 2a 4a a0 d2 d4 2d a8 f0 74 9c 16 4d 2c 95 34 19 22 05 05 84 0e 41 42 12 1e e3 b5 f0 b1 c5 a8 37 b8 ... 110 more bytes>
  },
  creationDate: 2019-09-25T22:22:54.017Z,
  updateDate: 2019-09-25T22:22:54.017Z,
  status: 0,
  masterKey: { provider: 'local' }
}
Python:

from pprint import pprint

connection_string = "mongodb://localhost:27017"
key_vault_db = "encryption"
key_vault_coll = "__keyVault"

client = MongoClient(connection_string)
key_vault = client[key_vault_db][key_vault_coll]

# Pass in the data_key_id created in previous section
key = key_vault.find_one({"_id": data_key_id})
pprint(key)

This code example should print a retrieved document that resembles the following:

{
  "_id": UUID('1e83d013-d873-47df-abb1-e57898a72d4c'),
  "keyMaterial": b'\x96#\xeb\xa1xKA\xa7GM\xef\x08\xc04\'gD\x96\x9c\xa4\xd3?\xe3Db0H\xbb\x86\xa5\xc2\x1f\x14\x0f\xb8\xb8\x8a\x9d\xc9\xae\xa1g\xaf\xeb\x8b\x99\xb4b"\xc0\xe8e\x07\x1b.\xeet\xf5<%\xfd\x06Y\x15o=3Yk\x9fue\xd8V#X\xc1IB\xf5\xc9+\x95\xb1\x9c\xc0\x08U?\xaf\xb1U\xc6\x84\x89\x9b\xdc\x98\xc9~\xb2\xbd\xf6\\\xa2y\x08\xdf\x8f\xa1\x03\t9\xe7_+J>_H\xb4\x97up\x93Sc\x88\x0fG-+\x86\x95\x9e\xc2\x8es\x9e\xcb%%lVQ\xa2\xf1\xe3W\x83\x10]\xc9\x1fm\x7f\xbc\xbf\xd2d',
  "creationDate": datetime.datetime(2019, 9, 30, 20, 43, 10, 951000),
  "updateDate": datetime.datetime(2019, 9, 30, 20, 43, 10, 951000),
  "status": 0,
  "masterKey": {
    "provider": "local"
  }
}

This retrieved document contains the following data:

  • Data encryption key id (stored as a UUID).
  • Data encryption key in encrypted form.
  • KMS provider information for the master key.
  • Other metadata such as creation and last modified date.

C. Specify Encrypted Fields Using JSON Schema

MongoDB drivers use an extended version of the JSON Schema standard to configure automatic client-side encryption and decryption of specific fields of the documents in a collection.

note

Automatic CSFLE requires MongoDB Enterprise or MongoDB Atlas.

The MongoDB CSFLE extended JSON Schema standard requires the following information:

  • The encryption algorithm to use when encrypting each field (Deterministic Encryption or Random Encryption)
  • One or more data encryption keys encrypted with the CSFLE master key
  • The BSON Type of each field (only required for deterministically encrypted fields)

CSFLE JSON Schema Does Not Support Document Validation

MongoDB drivers use JSON Schema syntax to specify encrypted fields and only support field-level encryption-specific keywords documented in Automatic Encryption JSON Schema Syntax. Any other document validation instances will cause the client to throw an error.

Server-side JSON Schema

You can prevent clients that are not configured with the appropriate client-side JSON Schema from writing unencrypted data to a field by using server-side JSON Schema. The server-side JSON Schema provides only supplemental enforcement of the client-side JSON Schema. For more details on server-side document validation implementation, see Enforce Field Level Encryption Schema.

The MedcoMD engineers receive specific requirements for the fields of data and their encryption strategies. The following table illustrates the data model of the Medical Care Management System.

Field | Encryption Algorithm | BSON Type
Name | Non-Encrypted | String
SSN | Deterministic | Int
Blood Type | Random | String
Medical Records | Random | Array
Insurance: Policy Number | Deterministic | Int (embedded inside object)
Insurance: Provider | Non-Encrypted | String (embedded inside object)

Data Encryption Key

The MedcoMD engineers created a single data key to use when encrypting all fields in the data model. To configure this, they specify the encryptMetadata key at the root level of the JSON Schema. As a result, all encrypted fields defined in the properties field of the schema will inherit this encryption key unless specifically overwritten.

{
  "bsonType" : "object",
  "encryptMetadata" : {
    "keyId" : // copy and paste your keyId generated here
  },
  "properties": {
    // copy and paste your field schemas here
  }
}

MedcoMD engineers create JSON objects for each field and append them to the map.

SSN

The ssn field represents the patient's social security number. This field is sensitive and should be encrypted. MedcoMD engineers decide upon deterministic encryption based on the following properties:

  • Queryable
  • High cardinality

"ssn": {
  "encrypt": {
    "bsonType": "int",
    "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
  }
}

Blood Type

The bloodType field represents the patient's blood type. This field is sensitive and should be encrypted. MedcoMD engineers decide upon random encryption based on the following properties:

  • No plans to query
  • Low cardinality

"bloodType": {
  "encrypt": {
    "bsonType": "string",
    "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Random"
  }
}

Medical Records

The medicalRecords field is an array that contains a set of medical record documents. Each medical record document represents a separate visit and specifies information about the patient at that time, such as their blood pressure, weight, and heart rate. This field is sensitive and should be encrypted. MedcoMD engineers decide upon random encryption based on the following property:

  • Array fields must use random encryption with CSFLE to enable auto-encryption

"medicalRecords": {
  "encrypt": {
    "bsonType": "array",
    "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Random"
  }
}

Insurance Policy Number

The insurance.policyNumber field is embedded inside the insurance field and represents the patient's policy number. This policy number is a distinct and sensitive field. MedcoMD engineers decide upon deterministic encryption based on the following properties:

  • Queryable
  • High cardinality

"insurance": {
  "bsonType": "object",
  "properties": {
    "policyNumber": {
      "encrypt": {
        "bsonType": "int",
        "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
      }
    }
  }
}

Recap

MedcoMD engineers created a JSON Schema that satisfies their requirements of making sensitive data queryable and secure. View the full JSON Schema for the Medical Care Management System.

View the complete runnable helper code in Java.

View the complete runnable helper code in Javascript.

View the complete runnable helper code in Python.
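
As a rough sketch of how the pieces in this section fit together, the following Python snippet assembles the field definitions into one schema under a root-level encryptMetadata key. The helper names encrypted_field and build_patient_schema are hypothetical, and the key id is the placeholder value used earlier in this guide. The sketch assumes that the key id is supplied as a one-element array in Extended JSON form (a $binary value with subtype 04); when you build the schema for PyMongo directly, you would typically use a bson Binary value instead:

```python
import json

DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
RANDOM = "AEAD_AES_256_CBC_HMAC_SHA_512-Random"


def encrypted_field(bson_type, algorithm):
    """Return the "encrypt" definition for a single field."""
    return {"encrypt": {"bsonType": bson_type, "algorithm": algorithm}}


def build_patient_schema(base64_key_id):
    """Hypothetical helper: combine the field definitions from this section."""
    return {
        "bsonType": "object",
        "encryptMetadata": {
            # Assumption: the key id is a one-element array in Extended JSON
            # form (binary subtype 04, a UUID).
            "keyId": [{"$binary": {"base64": base64_key_id, "subType": "04"}}]
        },
        "properties": {
            "ssn": encrypted_field("int", DETERMINISTIC),
            "bloodType": encrypted_field("string", RANDOM),
            "medicalRecords": encrypted_field("array", RANDOM),
            "insurance": {
                "bsonType": "object",
                "properties": {
                    "policyNumber": encrypted_field("int", DETERMINISTIC),
                },
            },
        },
    }


# Placeholder key id; substitute the Base64 id of your own data encryption key.
json_schema = build_patient_schema("3k13WkSZSLy7kwAAP4HDyQ==")
print(json.dumps(json_schema, indent=2))
```

The non-encrypted fields (name and insurance.provider) are left out of the schema on purpose, because only fields with an encrypt definition need to be listed.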

D. Create the MongoDB Client

The MedcoMD engineers now have the JSON Schema and encryption keys necessary to create a CSFLE-enabled MongoDB client.

They build the client to communicate with a MongoDB cluster and perform actions such as securely reading and writing documents with encrypted fields.

About the Mongocryptd Application

The MongoDB client communicates with a separate encryption application called mongocryptd which automates the client-side field level encryption. This application is installed with MongoDB Enterprise Server (version 4.2 and later).

When we create a CSFLE-enabled MongoDB client, the mongocryptd process is automatically started by default, and handles the following responsibilities:

  • Validates the encryption instructions defined in the JSON Schema and flags the referenced fields for encryption in read and write operations.
  • Prevents unsupported operations from being executed on encrypted fields.

When the mongocryptd process is started with the client driver, you can provide configurable parameters including:

Java:

port
Listening port.
Specify this value in the AutoEncryptionSettings as follows:

List<String> spawnArgs = new ArrayList<String>();
spawnArgs.add("--port=30000");

Map<String, Object> extraOpts = new HashMap<String, Object>();
extraOpts.put("mongocryptdSpawnArgs", spawnArgs);

AutoEncryptionSettings autoEncryptionSettings = AutoEncryptionSettings.builder()
    ...
    .extraOptions(extraOpts)
    .build();

Default: 27020

idleShutdownTimeoutSecs
Number of idle seconds in which the mongocryptd process should wait before exiting.
Specify this value in the AutoEncryptionSettings as follows:

List<String> spawnArgs = new ArrayList<String>();
// List.add() returns a boolean, so add the flag and its value in separate calls.
spawnArgs.add("--idleShutdownTimeoutSecs");
spawnArgs.add("60");

Map<String, Object> extraOpts = new HashMap<String, Object>();
extraOpts.put("mongocryptdSpawnArgs", spawnArgs);

AutoEncryptionSettings autoEncryptionSettings = AutoEncryptionSettings.builder()
    ...
    .extraOptions(extraOpts)
    .build();

Default: 60

Node.js:

port
Listening port.
Specify this value as follows:

autoEncryption: {
  ...
  extraOptions: {
    mongocryptdSpawnArgs: ["--port", "30000"],
    mongocryptdURI: 'mongodb://localhost:30000',
  }
}

note

In the current version (3.3.4) of the NodeJS driver, you must specify the mongocryptdURI to match the listening port.

Default: 27020

idleShutdownTimeoutSecs
Number of idle seconds in which the mongocryptd process should wait before exiting.
Specify this value as follows:

autoEncryption: {
  ...
  extraOptions: {
    mongocryptdSpawnArgs: ["--idleShutdownTimeoutSecs", "75"]
  }
}

Default: 60

Python:

port
Listening port.
Specify this value as follows:

auto_encryption_opts = AutoEncryptionOpts(mongocryptd_spawn_args=['--port=30000'])

Default: 27020

idleShutdownTimeoutSecs
Number of idle seconds in which the mongocryptd process should wait before exiting.
Specify this value as follows:

auto_encryption_opts = AutoEncryptionOpts(mongocryptd_spawn_args=['--idleShutdownTimeoutSecs=75'])

Default: 60

note

If a mongocryptd process is already running on the port specified by the driver, the driver may log a warning and continue to operate without spawning a new process. Any settings specified by the driver only apply once the existing process exits and a new encrypted client attempts to connect.
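
Since settings passed by the driver have no effect while an existing mongocryptd process holds the port, it can be useful to check whether something is already listening before choosing spawn arguments. The following Python sketch uses only the standard library. The helper name is_port_in_use is hypothetical, and the check only detects that some process accepts TCP connections on the port, not that the process is mongocryptd:

```python
import socket


def is_port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


# 27020 is the default mongocryptd listening port.
print(is_port_in_use(27020))
```

If the check returns True during development, you might either stop the existing process so that new spawn arguments take effect, or leave it running and skip spawning a new one.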

For additional information on mongocryptd, refer to the mongocryptd manual page.

The MedcoMD engineers use the following procedure to configure and instantiate the MongoDB client:

1. Specify the Key Vault Collection Namespace

The key vault collection contains the data key that the client uses to encrypt and decrypt fields. MedcoMD uses the collection encryption.__keyVault as the key vault in the following code snippet.

Java:

String keyVaultNamespace = "encryption.__keyVault";

Node.js:

const keyVaultNamespace = 'encryption.__keyVault';

Python:

key_vault_namespace = "encryption.__keyVault"

2. Specify the Local Master Encryption Key

The client expects a key management system to store and provide the application's master encryption key. For now, MedcoMD only has a local master key, so they use the local KMS provider and specify the key inline with the following code snippet.

Java:

Map<String, Object> keyMap = new HashMap<String, Object>();
keyMap.put("key", localMasterKey);

Map<String, Map<String, Object>> kmsProviders = new HashMap<String, Map<String, Object>>();
kmsProviders.put("local", keyMap);

Node.js:

const kmsProviders = {
  local: {
    key: localMasterKey,
  }
}

Python:

kms_providers = {
    "local": {
        "key": local_master_key
    }
}

3. Map the JSON Schema to the Patients Collection

The MedcoMD engineers assign their schema to a variable. The JSON Schema that MedcoMD defined doesn't explicitly specify the collection to which it applies. To assign the schema, they map it to the medicalRecords.patients collection namespace in the following code snippet:

Java:

HashMap<String, BsonDocument> schemaMap = new HashMap<String, BsonDocument>();
schemaMap.put("medicalRecords.patients", BsonDocument.parse(jsonSchema));

Node.js:

const patientSchema = {
  'medicalRecords.patients': jsonSchema,
}

Python:

patient_schema = {
    "medicalRecords.patients": json_schema
}

4. Specify the Location of the Encryption Binary

MongoDB drivers communicate with the mongocryptd encryption binary to perform automatic client-side field level encryption. The mongocryptd process performs the following:

  • Validates the encryption instructions defined in the JSON Schema and flags the referenced fields for encryption in read and write operations.
  • Prevents unsupported operations from being executed on encrypted fields.

Configure the client to spawn the mongocryptd process by specifying the path to the binary using the following configuration options:

Java:

Map<String, Object> extraOptions = new HashMap<String, Object>();
extraOptions.put("mongocryptdSpawnPath", "/usr/local/bin/mongocryptd");

Node.js:

const extraOptions = {
  mongocryptdSpawnPath: '/usr/local/bin/mongocryptd',
}

Python:

extra_options = {
    'mongocryptd_spawn_path': '/usr/local/bin/mongocryptd'
}

Encryption Binary Daemon

If the mongocryptd daemon is already running, you can configure the client to skip starting it by passing the following option:

Java:

extraOptions.put("mongocryptdBypassSpawn", true);

Node.js:

extraOptions.mongocryptdBypassSpawn = true;

Python:

extra_options['mongocryptd_bypass_spawn'] = True

5
Create the MongoClient

To create the CSFLE-enabled client, MedcoMD instantiates a standard MongoDB client object with the additional automatic encryption settings with the following code snippet:

Java:

    MongoClientSettings clientSettings = MongoClientSettings.builder()
        .applyConnectionString(new ConnectionString("mongodb://localhost:27017"))
        .autoEncryptionSettings(AutoEncryptionSettings.builder()
            .keyVaultNamespace(keyVaultNamespace)
            .kmsProviders(kmsProviders)
            .schemaMap(schemaMap)
            .extraOptions(extraOptions)
            .build())
        .build();
    MongoClient mongoClient = MongoClients.create(clientSettings);

Node.js:

    const secureClient = new MongoClient(connectionString, {
      useNewUrlParser: true,
      useUnifiedTopology: true,
      monitorCommands: true,
      autoEncryption: {
        keyVaultNamespace,
        kmsProviders,
        schemaMap: patientSchema,
        extraOptions: extraOptions,
      }
    });

Python:

    fle_opts = AutoEncryptionOpts(
        kms_providers,
        key_vault_namespace,
        schema_map=patient_schema,
        **extra_options
    )
    client = MongoClient(connection_string, auto_encryption_opts=fle_opts)

E. Perform Encrypted Read/Write Operations

The MedcoMD engineers now have a CSFLE-enabled client and can test that the client can perform queries that meet the requirements. Doctors should be able to read and write to all fields, and receptionists should only be allowed to read and write to non-sensitive fields.

Insert a Document with Encrypted Fields

The following diagram shows the steps taken by the client application and driver to perform a write of field-level encrypted data:

Diagram that shows the data flow for a write of field-level encrypted data

MedcoMD engineers write a function to create a new patient record with the following code snippet:

Java:

    public static void insertPatient(
        MongoCollection<Document> collection,
        String name,
        int ssn,
        String bloodType,
        ArrayList<Document> medicalRecords,
        int policyNumber,
        String provider
    ) {
        // Nested insurance sub-document
        Document insurance = new Document()
            .append("policyNumber", policyNumber)
            .append("provider", provider);

        Document patient = new Document()
            .append("name", name)
            .append("ssn", ssn)
            .append("bloodType", bloodType)
            .append("medicalRecords", medicalRecords)
            .append("insurance", insurance);

        collection.insertOne(patient);
    }

Node.js:

    async function insertPatient(collection, name, ssn, bloodType, medicalRecords, policyNumber, provider) {
      try {
        const writeResult = await collection.insertOne({
          name,
          ssn,
          bloodType,
          medicalRecords,
          insurance: {
            policyNumber,
            provider
          }
        });
      } catch (writeError) {
        console.error('writeError occurred:', writeError);
      }
    }

Python:

    def insert_patient(collection, name, ssn, blood_type, medical_records, policy_number, provider):
        insurance = {
            'policyNumber': policy_number,
            'provider': provider
        }
        doc = {
            'name': name,
            'ssn': ssn,
            'bloodType': blood_type,
            'medicalRecords': medical_records,
            'insurance': insurance
        }
        collection.insert_one(doc)

When a CSFLE-enabled client inserts a new patient record into the Medical Care Management System, it automatically encrypts the fields specified in the JSON Schema. This operation creates a document similar to the following:

    {
      "_id": "5d7a7bbe6d58fd263b6d7315",
      "name": "Jon Doe",
      "ssn": "Ac+ZbPM+sk7gl7CJCcIzlRAQUJ+uo/0WhqX+KbTNdhqCszHucqXNiwqEUjkGlh7gK8pm2JhIs/P3//nkVP0dWu8pSs6TJnpfUwRjPfnI0TURzQ==",
      "bloodType": "As+ZbPM+sk7gl7CJCcIzlRACESwHCTCtK/lQV9kF6/LRoL3mh59gzBVA42vGBVfLIycYWpfAy7ZCi2eRGEgMX5CrGl259Wfu6Zf/ELBVqQDnyQ==",
      "medicalRecords": "As+ZbPM+sk7gl7CJCcIzlRAEFt249toVYOlvlC/79cAtQ5jvE/ukF1ZLxRZn1g0zBBtPnf6L0AFTKMVdNJnjMGPMTszYU58qRE9uMvCU05DVHYl8DJnbtGXXFRLJ7ElQOc=",
      "insurance": {
        "provider": "MaestCare",
        "policyNumber": "Ac+ZbPM+sk7gl7CJCcIzlRAQm7kFhN1hy3l7Wt3BSpBMbvVSuiaDsf3UPF9bvJLTEcC+Ka+3kZI4SVZinj4tyc5uDYeyh6+7phpKrQo4CHWyg=="
      }
    }

note

Clients that do not have CSFLE configured will insert unencrypted data. We recommend using server-side schema validation to enforce encrypted writes for fields that should be encrypted.
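One way to act on the recommendation in the note is to install the encryption schema as a server-side validator, so that the server rejects plaintext values written to fields marked with the `encrypt` keyword. The sketch below only builds the command document. The function name `build_validator_command` and the abbreviated schema are assumptions for illustration, and the caller is assumed to run the result against the medicalRecords database, for example with PyMongo's `Database.command`.

```python
def build_validator_command(collection_name, json_schema):
    """Build a collMod command that installs json_schema as a validator.

    Hypothetical helper: it returns the command document and does not
    contact the server. "collMod" is inserted first because the command
    name must be the first key of the document.
    """
    return {
        "collMod": collection_name,
        "validator": {"$jsonSchema": json_schema},
    }


# Abbreviated placeholder; in practice, reuse the schema given to the client.
abbreviated_schema = {
    "bsonType": "object",
    "properties": {
        "ssn": {
            "encrypt": {
                "bsonType": "int",
                "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic",
            }
        }
    },
}
command = build_validator_command("patients", abbreviated_schema)
```

Reusing the client's schema keeps the client-side encryption rules and the server-side enforcement from drifting apart.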

Query for Documents on a Deterministically Encrypted Field

The following diagram shows the steps taken by the client application and driver to query and decrypt field-level encrypted data:

Diagram that shows the data flow for querying and reading field-level encrypted data

You can run queries on documents with encrypted fields using standard MongoDB driver methods. When a doctor performs a query in the Medical Care Management System to search for a patient by their SSN, the driver decrypts the patient's data before returning it:

    {
      "_id": "5d6ecdce70401f03b27448fc",
      "name": "Jon Doe",
      "ssn": 241014209,
      "bloodType": "AB+",
      "medicalRecords": [
        {
          "weight": 180,
          "bloodPressure": "120/80"
        }
      ],
      "insurance": {
        "provider": "MaestCare",
        "policyNumber": 123142
      }
    }
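The query that produces a result like this is an ordinary equality match. With a CSFLE-enabled client, the driver encrypts the ssn value in the filter before sending it, which works because the field is deterministically encrypted. A minimal sketch, assuming `collection` is the medicalRecords.patients collection obtained from the CSFLE-enabled client (the function name is hypothetical):

```python
def find_patient_by_ssn(collection, ssn):
    """Return the patient document whose ssn equals the given value, or None.

    `collection` is assumed to come from a CSFLE-enabled client.
    """
    # The filter is written in plaintext; the client handles encryption
    # of the query value and decryption of the returned document.
    return collection.find_one({"ssn": ssn})
```

The application code contains no encryption logic of its own; swapping in a client without CSFLE changes the outcome of the query, not the code.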

note

For queries using a client that is not configured to use CSFLE, such as when receptionists in the Medical Care Management System search for a patient with their ssn, a null value is returned. A client without CSFLE configured cannot query on a sensitive field.

Query for Documents on a Randomly Encrypted Field

warning

You cannot directly query for documents on a randomly encrypted field. However, you can query on another field that holds an approximation of the randomly encrypted data and use it to find the document.

MedcoMD engineers determined that the fields they randomly encrypted would not be used to find patient records. Had this been required, for example if the patient's ssn was randomly encrypted, MedcoMD engineers could have included another plain-text field called last4ssn that contains the last 4 digits of the ssn field. They could then query on this field as a proxy for the ssn.

    {
      "_id": "5d6ecdce70401f03b27448fc",
      "name": "Jon Doe",
      "ssn": 241014209,
      "last4ssn": 4209,
      "bloodType": "AB+",
      "medicalRecords": [
        {
          "weight": 180,
          "bloodPressure": "120/80"
        }
      ],
      "insurance": {
        "provider": "MaestCare",
        "policyNumber": 123142
      }
    }
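As a rough sketch of the proxy-field approach, the helpers below derive last4ssn from an integer ssn and then narrow the candidates to the exact match in the application. All three function names are hypothetical, and `candidates` stands for the decrypted documents returned by a query on last4ssn.

```python
def last4_of_ssn(ssn):
    """Return the last four digits of an integer SSN as an integer."""
    return ssn % 10000


def with_last4ssn(patient):
    """Return a copy of the patient document with a plaintext last4ssn field."""
    enriched = dict(patient)
    enriched["last4ssn"] = last4_of_ssn(patient["ssn"])
    return enriched


def match_full_ssn(candidates, ssn):
    """Narrow documents that share the last four digits to the exact SSN.

    `candidates` stands for the decrypted results of a query such as
    {"last4ssn": last4_of_ssn(ssn)}; several patients may share those digits.
    """
    return [doc for doc in candidates if doc["ssn"] == ssn]
```

Because many patients can share the same last four digits, the final comparison against the full ssn has to happen client-side, after decryption. Storing the digits as an integer, as the example document does, also drops leading zeros, so 0042 is stored as 42.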

Summary

MedcoMD wanted to develop a system that securely stores sensitive medical records for their patients. They also wanted strong data access and security guarantees that do not rely on individual users. After researching the available options, MedcoMD determined that MongoDB Client-Side Field Level Encryption satisfies their requirements and decided to implement it in their application. To implement CSFLE they:

1. Created a Locally-Managed Master Encryption Key

A locally-managed master key allowed MedcoMD to rapidly develop the client application without external dependencies and avoid accidentally leaking sensitive production credentials.

2. Generated an Encrypted Data Key with the Master Key

CSFLE uses envelope encryption, so they generated a data key that encrypts and decrypts each field and then encrypted the data key using a master key. This allows MedcoMD to store the encrypted data key in MongoDB so that it is shared with all clients while preventing access to clients that don't have access to the master key.

3. Created a JSON Schema

CSFLE can automatically encrypt and decrypt fields based on a provided JSON Schema that specifies which fields to encrypt and how to encrypt them.

4. Tested and Validated Queries with the CSFLE Client

MedcoMD engineers tested their CSFLE implementation by inserting and querying documents with encrypted fields. They then validated that clients without CSFLE enabled could not read the encrypted data.

Additional Information

Download Example Project

To view and download a runnable example of CSFLE, select your driver below:

GitHub: Java CSFLE runnable example

GitHub: NodeJS CSFLE runnable example

GitHub: PyMongo CSFLE runnable example

Move to Production

In this guide, we stored the master key in your local filesystem. Since your data encryption keys would be readable by anyone who gains direct access to your master key, we strongly recommend that you use a more secure storage location such as a Key Management System (KMS).

For more information on securing your master key, see the guide at /use-cases/client-side-field-level-encryption-local-key-to-kms/.

Further Reading

For more information on client-side field level encryption in MongoDB, check out the reference docs in the server manual:

For additional information on MongoDB CSFLE API, see the official Java driver documentation

For additional information on MongoDB CSFLE API, see the official Node.js driver documentation