Client Side Field Level Encryption Guide

Introduction

Many applications make use of sensitive data such as confidential personal details, payment information, or proprietary data. In some jurisdictions, this type of data is subject to governance, privacy, and security compliance mandates. Unauthorized access of sensitive data or a failure to comply with a mandate often results in significant reputation damage and financial penalties. Therefore, it is important to keep sensitive data secure.

MongoDB offers several methods that protect your data from unauthorized access, including Role-Based Access Control, Encryption at Rest, and Transport Encryption using TLS/SSL.

Another MongoDB feature that prevents unauthorized access of data is Client-Side Field Level Encryption (CSFLE). This feature allows a developer to selectively encrypt individual fields of a document on the client-side before it is sent to the server. This keeps the encrypted data private from the providers hosting the database as well as any user that has direct access to the database.

This guide provides steps for setup and implementation of CSFLE with a practical example.

Note

Automatic Client-Side Field Level Encryption is available starting in MongoDB 4.2 Enterprise only.

Problem

In this scenario, we secure sensitive data on a Medical Care Management System which stores patients’ personal information, insurance information, and medical records for a fictional company, MedcoMD. None of the patient data is public, and certain data such as their social security number (SSN), insurance policy number, and vital sign measurements are particularly sensitive and subject to privacy compliance. It is important for the company and the patient that the data is kept private and secure.

MedcoMD needs this system to satisfy the following use cases:

  • Doctors use the system to access Patients’ medical records and insurance information, and to add new vital sign measurements.
  • Receptionists use the system to verify the Patients’ identity, using a combination of their contact information and the last four digits of their Social Security Number (SSN).
  • Receptionists can view a Patient’s insurance policy provider, but not their policy number.
  • Receptionists cannot access a Patient’s medical records.

MedcoMD is also concerned with disclosure of sensitive data through any of the following methods:

  • Accidental disclosure of data on the Receptionist’s publicly-viewable screen.
  • Direct access to the database by a superuser such as a database administrator.
  • Capture of data over an insecure network.
  • Access to the data by reading a server’s memory.
  • Access to the on-disk data by reading database or backup files.

What can MedcoMD do to balance the functionality and access restrictions of their Medical Care Management System?

Solution

The MedcoMD engineers review the Medical Care Management System specification and research the proper solution for limiting access to sensitive data.

The first MongoDB security feature they evaluated was Role-Based Access Control which allows administrators to grant and restrict collection-level permissions for users. With the appropriate role definition and assignment, this solution prevents accidental disclosure of data and access. However, it does not prevent capture of the data over an insecure network, direct access of data by a superuser, access to data by reading the server’s memory, or access to on-disk data by reading the database or backup files.

The next MongoDB security features they evaluated were Encryption at Rest which encrypts the database files on disk and Transport Encryption using TLS/SSL which encrypts data over the network. When applied together, these two features prevent access to on-disk database files as well as capture of the data on the network, respectively. When combined with Role-Based Access Control, these three security features offer near-comprehensive security coverage of the sensitive data, but lack a mechanism to prevent the data from being read from the server’s memory.

Finally, the MedcoMD engineers discovered a feature that independently satisfies all the security criteria. Client-side Field Level Encryption allows the engineers to specify the fields of a document that should be kept encrypted. Sensitive data is transparently encrypted/decrypted by the client and only communicated to and from the server in encrypted form. This mechanism keeps the specified data fields secure in encrypted form on both the server and the network. While all clients have access to the non-sensitive data fields, only appropriately-configured CSFLE clients are able to read and write the sensitive data fields.

MedcoMD will provide Receptionists with a client that is not configured to access data encrypted with CSFLE. This will prevent them from viewing the sensitive fields and accidentally leaving them displayed on-screen in a public area. MedcoMD will provide Doctors with a client with CSFLE enabled which will allow them to access the sensitive data fields in the privacy of their own office.

Equipped with CSFLE, MedcoMD can keep their sensitive data secure and compliant with data privacy regulations with MongoDB.

Procedure

Requirements

MongoDB Server 4.2 Enterprise
MongoDB Driver Compatible with CSFLE
AWS Key Management Service (KMS)
  • The client application will need the following AWS KMS permissions for the master key:

    To learn how to create an IAM user and assign these permissions, see AWS IAM Users .

  • Compatible MongoDB drivers currently include integration with AWS KMS. Support for other KMS providers may be added in the future.

File System Permissions
  • The client application or a privileged user needs permissions to start the mongocryptd process on the host.
Additional Dependencies
  • Additional dependencies for specific language drivers are required to use CSFLE or run through examples in this guide. To see the list, select the appropriate driver tab below.

Python:
Dependency Name | Description
pymongocrypt | Python wrapper for the libmongocrypt encryption library.

Java:
Dependency Name | Description
JDK 8 or later | While the current driver is compatible with older versions of the JDK, the CSFLE feature is only compatible with JDK 8 and later.
libmongocrypt | The libmongocrypt library contains bindings to communicate with the native library that manages the encryption.

Node.js:
Dependency Name | Description
mongodb-client-encryption | Node.js wrapper for the libmongocrypt encryption library.
uuid-base64 | Convert between Base64 and hexadecimal UUIDs.

A. Create a Master Key

MongoDB Client-Side Field Level Encryption (CSFLE) uses an encryption strategy called envelope encryption in which keys used to encrypt/decrypt data (called data encryption keys) are encrypted with another key (called the master key). For more information on the features of envelope encryption and key management concepts, see AWS Key Management Service Concepts.
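
The wrapping relationship in envelope encryption can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_cipher` is a hypothetical stand-in built from a SHA-256 keystream so that the example runs with the standard library alone. It is not secure and it is not the algorithm CSFLE uses. Only the structure (the master key encrypts the data key, and the data key encrypts the field) reflects the description above.

```python
import hashlib
import os


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream.

    Hypothetical helper for illustration only. NOT secure, NOT what CSFLE uses.
    Applying it twice with the same key and nonce returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        stream += block
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# The master key wraps (encrypts) the data encryption key.
master_key = os.urandom(96)
data_key = os.urandom(32)
key_nonce = os.urandom(16)
wrapped_data_key = toy_cipher(master_key, key_nonce, data_key)

# The data encryption key encrypts the sensitive field value.
field_nonce = os.urandom(16)
ciphertext = toy_cipher(data_key, field_nonce, b"241014209")

# To read the field, unwrap the data key first, then decrypt the field.
unwrapped_data_key = toy_cipher(master_key, key_nonce, wrapped_data_key)
plaintext = toy_cipher(unwrapped_data_key, field_nonce, ciphertext)
```

Only the wrapped data key and the ciphertext would need to be stored alongside the data. The master key stays outside the database.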

The master key, used by the MongoDB driver to create and encrypt data keys, should be stored remotely in a Key Management System. The data encryption keys, generated and used by the MongoDB driver to encrypt and decrypt document fields, are stored in a key vault collection in the same MongoDB replica set as the encrypted data.

To begin development, MedcoMD engineers generate a locally-managed master key:

Python:

The following script generates a 96-byte locally-managed master key and saves it to a file called master-key.txt in the directory from which the script is executed.

import os

path = "master-key.txt"
file_bytes = os.urandom(96)
with open(path, "wb") as f:
  f.write(file_bytes)

Java:

The following script generates a 96-byte locally-managed master key and saves it to a file called master-key.txt in the directory from which the script is executed.

import java.io.FileOutputStream;
import java.io.IOException;
import java.security.SecureRandom;

public class CreateMasterKeyFile {
    public static void main(String[] args) throws IOException {

        byte[] localMasterKey = new byte[96];
        new SecureRandom().nextBytes(localMasterKey);

        try (FileOutputStream stream = new FileOutputStream("master-key.txt")) {
            stream.write(localMasterKey);
        }
    }
}

Node.js:

The following script generates a 96-byte locally-managed master key and saves it to a file called master-key.txt in the directory from which the script is executed.

const fs = require('fs');
const crypto = require('crypto');

try {
  fs.writeFileSync('master-key.txt', crypto.randomBytes(96));
} catch (err) {
  console.error(err);
}

B. Create a Data Encryption Key

In this section, we generate a data encryption key. The MongoDB driver stores the key in a key vault collection where CSFLE-enabled clients can access the key for automatic encryption and decryption.

The client requires the following configuration values to generate a new data encryption key:

  • The locally-managed master key or AWS KMS master key access settings.
  • A MongoDB connection string that authenticates on a running server.
  • The key vault namespace (database and collection).
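
The key vault namespace is a single string that combines the database and collection names. A minimal sketch of splitting it into its two parts follows. `split_namespace` is a hypothetical helper name, not a driver API, and it assumes the usual convention that the first dot separates the database name from the collection name.

```python
def split_namespace(namespace: str) -> tuple:
    """Split 'database.collection' into (database, collection).

    Only the first dot is treated as the separator, because collection
    names may themselves contain dots.
    """
    database, separator, collection = namespace.partition(".")
    if not separator or not database or not collection:
        raise ValueError(f"not a valid namespace: {namespace!r}")
    return database, collection


key_vault_db, key_vault_coll = split_namespace("encryption.__keyVault")
```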

Follow the steps below to generate a single data encryption key from the locally-managed master key.

You can also download the complete Node.js data key generation code example on GitHub.

Note

The code linked above includes a dependency on the mongodb-client-encryption npm package. See the npmjs documentation on the mongodb-client-encryption package for installation instructions.

1. Read the Locally-Managed Master Key from a File

First, retrieve the contents of the local master key file that you generated in the Create a Master Key section:

Python:

path = "./master-key.txt"
with open(path, "rb") as f:
  local_master_key = f.read()

Java:

String path = "master-key.txt";

byte[] localMasterKey= new byte[96];

try (FileInputStream fis = new FileInputStream(path)) {
    fis.readNBytes(localMasterKey, 0, 96);
}

Note

The FileInputStream#readNBytes method was introduced in Java 9. The helper method is used in this guide to keep the implementation concise. If you are using JDK 8, you may consider implementing a custom solution to read a file into a byte array.

Node.js:

const path = './master-key.txt';
const localMasterKey = fs.readFileSync(path);
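
Because the master key generated in section A is exactly 96 bytes, it can be useful to check the length when reading the file back. The following is a minimal sketch, assuming a hypothetical helper named `load_local_master_key`. The demonstration writes a throwaway key to a temporary directory so that it does not touch a real master-key.txt.

```python
import os
import tempfile

EXPECTED_KEY_LENGTH = 96  # bytes, matching the key generated in section A


def load_local_master_key(path: str) -> bytes:
    """Read the local master key and fail early if its length is wrong."""
    with open(path, "rb") as f:
        key = f.read()
    if len(key) != EXPECTED_KEY_LENGTH:
        raise ValueError(
            f"expected {EXPECTED_KEY_LENGTH} bytes in {path}, found {len(key)}"
        )
    return key


# Demonstration with a temporary file instead of a real key file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "master-key.txt")
    with open(demo_path, "wb") as f:
        f.write(os.urandom(EXPECTED_KEY_LENGTH))
    demo_key = load_local_master_key(demo_path)
```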

2. Specify KMS Provider Settings

Next, specify the KMS provider settings. The client uses these settings to discover the master key. Set the provider name to local when using a local master key:

Python:

 kms_providers = {
   "local": {
     "key": local_master_key # local_master_key variable from the previous step
   },
 }

Java:

The KMS provider settings are stored in a Map in order to use the kmsProviders helper method for the ClientEncryptionSettings Builder.

Map<String, Object> keyMap = new HashMap<String, Object>();
keyMap.put("key", localMasterKey);

Map<String, Map<String, Object>> kmsProviders = new HashMap<String, Map<String, Object>>();
kmsProviders.put("local", keyMap);

Node.js:

 const kmsProviders = {
   local: {
     key: localMasterKey,
   },
 };

3. Create a Data Encryption Key

Construct a client with the MongoDB connection string and key vault namespace configuration, and create a data encryption key. The key vault in this example uses the encryption database and __keyVault collection.

Python:

from pymongo import MongoClient
from pymongo.encryption_options import AutoEncryptionOpts
from pymongo.encryption import ClientEncryption
import base64
from bson.codec_options import CodecOptions
from bson.binary import STANDARD, UUID

connection_string = "mongodb://localhost:27017"
key_vault_namespace = "encryption.__keyVault"

client = MongoClient(connection_string)
client_encryption = ClientEncryption(
    kms_providers, # pass in the kms_providers variable from the previous step
    key_vault_namespace,
    client,
    CodecOptions(uuid_representation=STANDARD)
)


def create_data_encryption_key():
    data_key_id = client_encryption.create_data_key("local")
    uuid_data_key_id = UUID(bytes=data_key_id)
    base_64_data_key_id = base64.b64encode(data_key_id)
    print("DataKeyId [UUID]: ", str(uuid_data_key_id))
    print("DataKeyId [base64]: ", base_64_data_key_id)
    return data_key_id


data_key_id = create_data_encryption_key()

Java:

String connectionString = "mongodb://localhost:27017";
String keyVaultNamespace = "encryption.__keyVault";

ClientEncryptionSettings clientEncryptionSettings = ClientEncryptionSettings.builder()
    .keyVaultMongoClientSettings(MongoClientSettings.builder()
        .applyConnectionString(new ConnectionString(connectionString))
        .build())
    .keyVaultNamespace(keyVaultNamespace)
    .kmsProviders(kmsProviders)
    .build();

ClientEncryption clientEncryption = ClientEncryptions.create(clientEncryptionSettings);
// "local" is the KMS provider name configured in the previous step
BsonBinary dataKeyId = clientEncryption.createDataKey("local", new DataKeyOptions());
System.out.println("DataKeyId [UUID]: " + dataKeyId.asUuid());

String base64DataKeyId = Base64.getEncoder().encodeToString(dataKeyId.getData());
System.out.println("DataKeyId [base64]: " + base64DataKeyId);

The createDataKey() method returns a BsonBinary object from which we can extract the UUID and Base64 representations of the key id.

Node.js:

const base64 = require('uuid-base64');

const connectionString = 'mongodb://localhost:27017';
const keyVaultNamespace = 'encryption.__keyVault';
const client = new MongoClient(connectionString, {
  useNewUrlParser: true,
  useUnifiedTopology: true,
});

async function main() {
  try {
    await client.connect();
    const encryption = new ClientEncryption(client, {
      keyVaultNamespace,
      kmsProviders,
    });
    const key = await encryption.createDataKey('local');
    const base64DataKeyId = key.toString('base64');
    const uuidDataKeyId = base64.decode(base64DataKeyId);
    console.log('DataKeyId [UUID]: ', uuidDataKeyId);
    console.log('DataKeyId [base64]: ', base64DataKeyId);
  } finally {
    await client.close();
  }
}
main();

Note

This code includes a dependency on the uuid-base64 npm package. See the npmjs documentation on the uuid-base64 package for installation instructions.

The _id field of the data encryption key is represented as a UUID and is encoded in Base64 format. Use your Base64-encoded data key id when specified for the remainder of this guide.

The output from the code above should resemble the following:

DataKeyId [UUID]: de4d775a-4499-48bc-bb93-3f81c3c90704
DataKeyId [base64]: 3k13WkSZSLy7kwAAP4HDyQ==
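
The UUID and Base64 forms are two encodings of the same 16 bytes, so either one can be derived from the other. The sketch below shows the conversion with the Python standard library only. The helper names `uuid_to_base64` and `base64_to_uuid` are hypothetical and are not part of any driver.

```python
import base64
import uuid


def uuid_to_base64(key_uuid: str) -> str:
    """Encode the 16 raw bytes of a UUID string as Base64 text."""
    return base64.b64encode(uuid.UUID(key_uuid).bytes).decode("ascii")


def base64_to_uuid(key_b64: str) -> str:
    """Decode Base64 text back into the canonical UUID string."""
    return str(uuid.UUID(bytes=base64.b64decode(key_b64)))


# Round trip with a freshly generated UUID standing in for a data key id.
example_uuid = str(uuid.uuid4())
example_b64 = uuid_to_base64(example_uuid)
round_trip = base64_to_uuid(example_b64)
```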

Note

Ensure that the client has readWrite permissions on the specified key vault namespace.

4. Verify that the Data Encryption Key was Created

Query the key vault collection for the data encryption key that was inserted as a document into your MongoDB replica set using the key id printed in the prior step.

Python:

from pprint import pprint
connection_string = "mongodb://localhost:27017"
key_vault_db = "encryption"
key_vault_coll = "__keyVault"

client = MongoClient(connection_string)
key_vault = client[key_vault_db][key_vault_coll]

# Pass in the data_key_id created in previous section
key = key_vault.find_one({"_id": data_key_id})
pprint(key)

This code example should print a retrieved document that resembles the following:

{
  "_id": UUID('1e83d013-d873-47df-abb1-e57898a72d4c'),
  "keyMaterial": b'\x96#\xeb\xa1xKA\xa7GM\xef\x08\xc04\'gD\x96\x9c\xa4\xd3?\xe3Db0H\xbb\x86\xa5\xc2\x1f\x14\x0f\xb8\xb8\x8a\x9d\xc9\xae\xa1g\xaf\xeb\x8b\x99\xb4b"\xc0\xe8e\x07\x1b.\xeet\xf5<%\xfd\x06Y\x15o=3Yk\x9fue\xd8V#X\xc1IB\xf5\xc9+\x95\xb1\x9c\xc0\x08U?\xaf\xb1U\xc6\x84\x89\x9b\xdc\x98\xc9~\xb2\xbd\xf6\\\xa2y\x08\xdf\x8f\xa1\x03\t9\xe7_+J>_H\xb4\x97up\x93Sc\x88\x0fG-+\x86\x95\x9e\xc2\x8es\x9e\xcb%%lVQ\xa2\xf1\xe3W\x83\x10]\xc9\x1fm\x7f\xbc\xbf\xd2d',
  "creationDate": datetime.datetime(2019, 9, 30, 20, 43, 10, 951000),
  "updateDate": datetime.datetime(2019, 9, 30, 20, 43, 10, 951000),
  "status": 0,
  "masterKey": {
    "provider": "local"
  }
}

Java:

String connectionString = "mongodb://localhost:27017";
String keyVaultDb = "encryption";
String keyVaultCollection = "__keyVault";
String base64KeyId = "3k13WkSZSLy7kwAAP4HDyQ=="; // use the base64 data key id returned by createDataKey() in the prior step

MongoClient mongoClient = MongoClients.create(connectionString);
MongoCollection<Document> collection = mongoClient.getDatabase(keyVaultDb).getCollection(keyVaultCollection);

Bson query = Filters.eq("_id", new Binary((byte) 4, Base64.getDecoder().decode(base64KeyId)));
Document doc = collection
    .find(query)
    .first();

System.out.println(doc);

This code example should print a retrieved document that resembles the following:

Document{{
    _id=dad3a063-4f9b-48f8-bf4e-7ca9d323fd1c,
    keyMaterial=org.bson.types.Binary@40e1535,
    creationDate=Wed Sep 25 22:22:54 EDT 2019,
    updateDate=Wed Sep 25 22:22:54 EDT 2019,
    status=0,
    masterKey=Document{{provider=local}}
}}

View the Extended JSON Representation of the Data Key

While the Document class is the Document type most commonly used to work with query results, we can use the BsonDocument class to view the data key document as extended JSON. Replace the Document assignment code with the following to retrieve and print a BsonDocument:

BsonDocument doc = collection
    .withDocumentClass(BsonDocument.class)
    .find(query)
    .first();

System.out.println(doc);

Node.js:

const connectionString = 'mongodb://localhost:27017/';
const keyVaultDb = 'encryption';
const keyVaultCollection = '__keyVault';
const base64KeyId = '3k13WkSZSLy7kwAAP4HDyQ=='; // use the base64 data key id returned by createDataKey() in the prior step

const client = new MongoClient(connectionString, {
  useNewUrlParser: true,
  useUnifiedTopology: true,
});

async function main() {
  try {
    await client.connect();
    const keyDB = client.db(keyVaultDb);
    const keyColl = keyDB.collection(keyVaultCollection);
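    // The key id is stored as a BSON Binary UUID (subtype 4), so decode the
    // Base64 string into a Binary before querying.
    const { Binary } = require('mongodb');
    const query = {
      _id: new Binary(Buffer.from(base64KeyId, 'base64'), 4),
    };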
    const dataKey = await keyColl.findOne(query);
    console.log(dataKey);
  } finally {
    await client.close();
  }
}
main();

This code example should print a retrieved document that resembles the following:

{
  _id: Binary {
    _bsontype: 'Binary',
    sub_type: 4,
    position: 16,
    buffer: <Buffer 68 ca d2 10 16 5d 45 bf 9d 1d 44 d4 91 a6 92 44>
  },
  keyMaterial: Binary {
    _bsontype: 'Binary',
    sub_type: 0,
    position: 160,
    buffer: <Buffer f1 4a 9f bd aa ac c9 89 e9 b3 da 48 72 8e a8 62 97 2a 4a a0 d2 d4 2d a8 f0 74 9c 16 4d 2c 95 34 19 22 05 05 84 0e 41 42 12 1e e3 b5 f0 b1 c5 a8 37 b8 ... 110 more bytes>
  },
  creationDate: 2019-09-25T22:22:54.017Z,
  updateDate: 2019-09-25T22:22:54.017Z,
  status: 0,
  masterKey: { provider: 'local' }
}

This retrieved document contains the following data:

  • Data encryption key UUID.
  • Data encryption key, in encrypted form.
  • KMS provider information for the master key.
  • Other metadata such as creation and last modified date.

C. Define a JSON Schema

In this section, MedcoMD engineers configure the fields that the client automatically encrypts and decrypts using JSON Schema. JSON Schema is a vocabulary that allows you to annotate and validate JSON documents. MongoDB drivers use an extended version of the JSON Schema standard to configure automatic client-side encryption and decryption of specific fields of the documents in a collection. The MongoDB CSFLE extended JSON Schema standard requires the following information:

  • The encryption algorithm to use when encrypting each field (Deterministic Encryption or Random Encryption)
  • One or more data encryption keys encrypted with the CSFLE master key
  • The BSON Type of each field (only required for deterministically encrypted fields)

CSFLE JSON Schema Does Not Support Document Validation

MongoDB drivers use JSON Schema syntax to specify encrypted fields and only support field-level encryption-specific keywords documented in Automatic Encryption JSON Schema Syntax. Any other document validation instances will cause the client to throw an error.

Server-side JSON Schema

You can prevent clients that are not configured with the appropriate client-side JSON Schema from writing unencrypted data to a field by using server-side JSON Schema. The server-side JSON Schema provides only supplemental enforcement of the client-side JSON Schema. For more details on server-side document validation implementation, see Enforce Field Level Encryption Schema.

The MedcoMD engineers receive specific requirements for the fields of data and their encryption strategies. The following table illustrates the data model of the Medical Care Management System.

Field | Encryption Algorithm | BSON Type
Name | Non-Encrypted | String
SSN | Deterministic | Int
Blood Type | Random | String
Medical Records | Random | Array
Insurance: Policy Number | Deterministic | Int (embedded inside insurance object)
Insurance: Provider | Non-Encrypted | String (embedded inside insurance object)

Data Encryption Key

The MedcoMD engineers created a single data key to use when encrypting all fields in the data model. To configure this, they specify the encryptMetadata key at the root level of the JSON Schema. As a result, all encrypted fields defined in the properties field of the schema will inherit this encryption key unless specifically overridden.

{
    "bsonType" : "object",
    "encryptMetadata" : {
        "keyId" : // copy and paste your keyID generated here
    },
    "properties": {
        // copy and paste your field schemas here
    }
}

MedcoMD engineers create JSON objects for each field and append them to the properties map.

SSN

The ssn field represents the patient’s social security number. This field is sensitive and should be encrypted. MedcoMD engineers decide upon deterministic encryption based on the following properties:

  • Queryable
  • High cardinality
"ssn": {
    "encrypt": {
        "bsonType": "int",
        "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
    }
}

Blood Type

The bloodType field represents the patient’s blood type. This field is sensitive and should be encrypted. MedcoMD engineers decide upon random encryption based on the following properties:

  • No plans to query
  • Low cardinality
"bloodType": {
    "encrypt": {
        "bsonType": "string",
        "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Random"
    }
}

Medical Records

The medicalRecords field is an array that contains a set of medical record documents. Each medical record document represents a separate visit and specifies information about the patient at that time, such as their blood pressure, weight, and heart rate. This field is sensitive and should be encrypted. MedcoMD engineers decide upon random encryption based on the following properties:

  • Array fields must use random encryption with CSFLE to enable auto-encryption
"medicalRecords": {
    "encrypt": {
        "bsonType": "array",
        "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Random"
    }
}

Insurance Policy Number

The insurance.policyNumber field is embedded inside the insurance field and represents the patient’s policy number. This policy number is a distinct and sensitive field. MedcoMD engineers decide upon deterministic encryption based on the following properties:

  • Queryable
  • High cardinality
"insurance": {
    "bsonType": "object",
    "properties": {
        "policyNumber": {
            "encrypt": {
                "bsonType": "int",
                "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
            }
        }
    }
}

Recap

MedcoMD engineers created a JSON Schema that satisfies their requirements of making sensitive data queryable and secure. View the full JSON Schema for the Medco Medical Management System.
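
The field fragments from this section can be assembled into one schema document. The sketch below builds it as a Python dictionary using only the standard library. The helper names `encrypted_field` and `build_patient_schema` are hypothetical, and the form of the keyId entry (an array holding an Extended JSON binary value with subtype 04) is an assumption about how the placeholder in the template above is filled in.

```python
import json

DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
RANDOM = "AEAD_AES_256_CBC_HMAC_SHA_512-Random"


def encrypted_field(bson_type: str, queryable: bool) -> dict:
    """Return the `encrypt` schema for one field.

    Queryable fields use deterministic encryption and all other fields use
    random encryption. Array fields must use random encryption.
    """
    if queryable and bson_type == "array":
        raise ValueError("array fields must use random encryption")
    algorithm = DETERMINISTIC if queryable else RANDOM
    return {"encrypt": {"bsonType": bson_type, "algorithm": algorithm}}


def build_patient_schema(base64_key_id: str) -> dict:
    """Combine the per-field schemas under a shared data encryption key."""
    return {
        "bsonType": "object",
        "encryptMetadata": {
            # Assumed Extended JSON form of the data key UUID (subtype 04).
            "keyId": [{"$binary": {"base64": base64_key_id, "subType": "04"}}]
        },
        "properties": {
            "ssn": encrypted_field("int", queryable=True),
            "bloodType": encrypted_field("string", queryable=False),
            "medicalRecords": encrypted_field("array", queryable=False),
            "insurance": {
                "bsonType": "object",
                "properties": {
                    "policyNumber": encrypted_field("int", queryable=True),
                },
            },
        },
    }


schema = build_patient_schema("<paste your Base64 data key id here>")
schema_json = json.dumps(schema, indent=2)
```

The name and insurance.provider fields do not appear in the schema because they are not encrypted.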

D. Create the MongoDB Client

The MedcoMD engineers now have the necessary encryption keys and JSON Schema configuration to create a CSFLE-enabled client. They use the following procedure to configure and instantiate the MongoDB client:

1. Specify the Key Vault Collection Namespace

The key vault collection contains the data key that the client uses to encrypt and decrypt fields. MedcoMD uses the collection encryption.__keyVault as the key vault.

Python:

key_vault_namespace = "encryption.__keyVault"

Java:

String keyVaultNamespace = "encryption.__keyVault";

Node.js:

const keyVaultNamespace = 'encryption.__keyVault';

2. Specify the Local Master Encryption Key

The client expects a key management system to store and provide the application’s master encryption key. For now, MedcoMD only has a local master key, so they use the local KMS provider and specify the key inline.

Python:

kms_providers = {
  "local": {
    "key": local_master_key
  }
}

Java:

Map<String, Object> keyMap = new HashMap<String, Object>();
keyMap.put("key", localMasterKey);

Map<String, Map<String, Object>> kmsProviders = new HashMap<String, Map<String, Object>>();
kmsProviders.put("local", keyMap);

Node.js:

const kmsProviders = {
  local: {
    key: localMasterKey,
  }
}

3. Map the JSON Schema to the Patients Collection

The MedcoMD engineers assign their schema to a variable. The JSON Schema that MedcoMD defined doesn’t explicitly specify the collection to which it applies. To assign the schema, they map it to the medicalRecords.patients collection namespace:

Python:

patient_schema = {
  "medicalRecords.patients": json_schema
}

Java:

HashMap<String, BsonDocument> schemaMap = new HashMap<String, BsonDocument>();
schemaMap.put("medicalRecords.patients", BsonDocument.parse(jsonSchema));

Node.js:

const patientSchema = {
  'medicalRecords.patients': jsonSchema,
}

4. Specify the Location of the Encryption Binary

MongoDB drivers use the mongocryptd binary to perform client-side encryption. For automatic encryption, the client manages the mongocryptd process. To configure the client to find the binary, specify the path to the binary using the following configuration:

Python:

extra_options = {
   'mongocryptd_spawn_path': '/usr/local/bin/mongocryptd'
}

Encryption Binary Daemon

If the mongocryptd daemon is already running, you can configure the client to skip starting it by passing the following option:

 extra_options['mongocryptd_bypass_spawn'] = True

Java:

Map<String, Object> extraOptions = new HashMap<String, Object>();
extraOptions.put("mongocryptdSpawnPath", "/usr/local/bin/mongocryptd");

Encryption Binary Daemon

If the mongocryptd daemon is already running, you can configure the client to skip starting it by passing the following option:

extraOptions.put("mongocryptdBypassSpawn", true);

Node.js:

const extraOptions = {
  mongocryptdSpawnPath: '/usr/local/bin/mongocryptd',
}

Encryption Binary Daemon

If the mongocryptd daemon is already running, you can configure the client to skip starting it by passing the following option:

 extraOptions.mongocryptdBypassSpawn = true;
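
Rather than hard-coding the path, the location of the binary can be looked up on the PATH. The following is a minimal Python sketch, assuming a hypothetical helper named `build_extra_options`. The option keys are the ones used in the Python configuration above, and the lookup relies on `shutil.which` from the standard library.

```python
import shutil


def build_extra_options(binary_name: str = "mongocryptd",
                        already_running: bool = False) -> dict:
    """Build the extra options for automatic encryption.

    If the daemon is already running, skip spawning it. Otherwise locate
    the binary on the PATH and pass its full path to the client.
    """
    if already_running:
        return {"mongocryptd_bypass_spawn": True}
    spawn_path = shutil.which(binary_name)
    if spawn_path is None:
        raise FileNotFoundError(f"{binary_name} was not found on the PATH")
    return {"mongocryptd_spawn_path": spawn_path}


# Example for a host where the daemon was started separately.
bypass_options = build_extra_options(already_running=True)
```

Failing early with a clear error when the binary is missing is a design choice for this sketch. An application may prefer to fall back to a configured default path instead.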

5. Create the MongoClient

To create the CSFLE-enabled client, MedcoMD instantiates a standard MongoDB client object with the additional automatic encryption settings:

Python:

fle_opts = AutoEncryptionOpts(
   kms_providers,
   key_vault_namespace,
   schema_map=patient_schema,
   **extra_options
)
client = MongoClient(connection_string, auto_encryption_opts=fle_opts)

Java:

MongoClientSettings clientSettings = MongoClientSettings.builder()
    .applyConnectionString(new ConnectionString("mongodb://localhost:27017"))
    .autoEncryptionSettings(AutoEncryptionSettings.builder()
        .keyVaultNamespace(keyVaultNamespace)
        .kmsProviders(kmsProviders)
        .schemaMap(schemaMap)
        .extraOptions(extraOptions)
        .build())
    .build();

MongoClient mongoClient = MongoClients.create(clientSettings);

Node.js:

const secureClient = new MongoClient(connectionString, {
    useNewUrlParser: true,
    useUnifiedTopology: true,
    monitorCommands: true,
    autoEncryption: {
      keyVaultNamespace,
      kmsProviders,
      schemaMap: patientSchema,
      extraOptions: extraOptions,
    }
});

E. Perform Encrypted Read/Write Operations

The MedcoMD engineers now have a CSFLE-enabled client and can test that the client can perform queries that meet the requirements. Doctors should be able to read and write to all fields, and receptionists should only be allowed to read and write to non-sensitive fields.

Insert a Document with Encrypted Fields

MedcoMD engineers write a function to create a new patient record:

Python:

def insert_patient(collection, name, ssn, blood_type, medical_records, policy_number, provider):
  insurance = {
    'policyNumber': policy_number,
    'provider': provider
  }
  doc = {
      'name': name,
      'ssn': ssn,
      'bloodType': blood_type,
      'medicalRecords': medical_records,
      'insurance': insurance
  }
  collection.insert_one(doc)

Java:

public static void insertPatient(
    MongoCollection<Document> collection,
    String name,
    int ssn,
    String bloodType,
    ArrayList<Document> medicalRecords,
    int policyNumber,
    String provider
) {

    Document insurance = new Document()
        .append("policyNumber", policyNumber)
        .append("provider", provider);

    Document patient = new Document()
        .append("name", name)
        .append("ssn", ssn)
        .append("bloodType", bloodType)
        .append("medicalRecords", medicalRecords)
        .append("insurance", insurance);

    collection.insertOne(patient);
}

Node.js:

async function insertPatient(collection, name, ssn, bloodType, medicalRecords, policyNumber, provider) {
  try {
    const writeResult = await collection.insertOne({
      name,
      ssn,
      bloodType,
      medicalRecords,
      insurance: {
        policyNumber,
        provider
      }
    });
  } catch (writeError) {
    console.error('writeError occurred:', writeError);
  }
}

When a CSFLE-enabled client inserts a new patient record into the Medical Care Management System, it automatically encrypts the fields specified in the JSON Schema. This operation creates a document similar to the following:

{
    "_id": "5d7a7bbe6d58fd263b6d7315",
    "name": "Jon Doe",
    "ssn": "Ac+ZbPM+sk7gl7CJCcIzlRAQUJ+uo/0WhqX+KbTNdhqCszHucqXNiwqEUjkGlh7gK8pm2JhIs/P3//nkVP0dWu8pSs6TJnpfUwRjPfnI0TURzQ==",
    "bloodType": "As+ZbPM+sk7gl7CJCcIzlRACESwHCTCtK/lQV9kF6/LRoL3mh59gzBVA42vGBVfLIycYWpfAy7ZCi2eRGEgMX5CrGl259Wfu6Zf/ELBVqQDnyQ==",
    "medicalRecords": "As+ZbPM+sk7gl7CJCcIzlRAEFt249toVYOlvlC/79cAtQ5jvE/ukF1ZLxRZn1g0zBBtPnf6L0AFTKMVdNJnjMGPMTszYU58qRE9uMvCU05DVHYl8DJnbtGXXFRLJ7ElQOc=",
    "insurance": {
        "provider": "MaestCare",
        "policyNumber": "Ac+ZbPM+sk7gl7CJCcIzlRAQm7kFhN1hy3l7Wt3BSpBMbvVSuiaDsf3UPF9bvJLTEcC+Ka+3kZI4SVZinj4tyc5uDYeyh6+7phpKrQo4CHWyg=="
    }
}
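
The encrypted values above carry a small header that can be inspected without any key. The sketch below decodes that header with the Python standard library. It rests on two assumptions: that the strings shown are the Base64 form of the stored binary values, and that the layout is one algorithm byte (1 for deterministic, 2 for random), followed by the 16-byte data key UUID, followed by one byte holding the original BSON type. `parse_ciphertext_header` is a hypothetical helper name.

```python
import base64
import uuid

ALGORITHMS = {1: "deterministic", 2: "random"}


def parse_ciphertext_header(value_b64: str) -> dict:
    """Decode the assumed 18-byte header of an encrypted field value."""
    # The first 24 Base64 characters decode to exactly 18 bytes.
    header = base64.b64decode(value_b64[:24])
    return {
        "algorithm": ALGORITHMS.get(header[0], "unknown"),
        "key_id": str(uuid.UUID(bytes=header[1:17])),
        "bson_type": header[17],
    }


# Leading characters of the ssn and bloodType values from the document above.
ssn_header = parse_ciphertext_header("Ac+ZbPM+sk7gl7CJCcIzlRAQ")
blood_type_header = parse_ciphertext_header("As+ZbPM+sk7gl7CJCcIzlRAC")
```

Under these assumptions, both fields report the same key id, which is consistent with the single data key configured through encryptMetadata.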

Note

Clients that do not have CSFLE configured will insert unencrypted data. We recommend using server-side schema validation to enforce encrypted writes for fields that should be encrypted.

Query for Documents on a Deterministically Encrypted Field

Queries on encrypted fields can be done using traditional MongoDB driver methods as well. When a query is made using a client configured to use CSFLE, such as when a doctor in the MedcoMD management system searches for a patient with their SSN, the patient’s data is returned unencrypted:

{
    "_id": "5d6ecdce70401f03b27448fc",
    "name": "Jon Doe",
    "ssn": 241014209,
    "bloodType": "AB+",
    "medicalRecords": [
        {
            "weight": 180,
            "bloodPressure": "120/80"
        }
    ],
    "insurance": {
        "provider": "MaestCare",
        "policyNumber": 123142
    }
}

Note

For queries using a client that is not configured to use CSFLE, such as when receptionists in the MedcoMD management system search for a patient with their ssn, a null value is returned. A client without CSFLE configured cannot query on a sensitive field.

Query for Documents on a Randomly Encrypted Field

Warning

You cannot directly query for documents on a randomly encrypted field. However, you can use another field to find the document that contains an approximation of the randomly encrypted field data.

MedcoMD engineers determined that the fields they randomly encrypted would not be used to find patient records. Had this been required (for example, if the patient’s ssn were randomly encrypted), MedcoMD engineers could have included another plaintext field called last4ssn that contains the last 4 digits of the ssn field. They could then query on this field as a proxy for the ssn.

 {
     "_id": "5d6ecdce70401f03b27448fc",
     "name": "Jon Doe",
     "ssn": 241014209,
     "last4ssn": 4209,
     "bloodType": "AB+",
     "medicalRecords": [
         {
             "weight": 180,
             "bloodPressure": "120/80"
         }
     ],
     "insurance": {
         "provider": "MaestCare",
         "policyNumber": 123142
     }
 }
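
The proxy-field approach can be sketched as follows. The helper name `with_last4ssn` is hypothetical, and the sketch assumes the ssn is stored as an integer, as in the example document above, which means leading zeros in the last four digits are not preserved.

```python
def with_last4ssn(patient):
    """Return a copy of the patient document with a plaintext last4ssn field."""
    doc = dict(patient)
    # The last four decimal digits of an integer SSN are its remainder mod 10000.
    doc["last4ssn"] = doc["ssn"] % 10000
    return doc


patient = {"name": "Jon Doe", "ssn": 241014209, "bloodType": "AB+"}
doc = with_last4ssn(patient)

# Filter a client could pass to find(); last4ssn is plaintext, so any client
# can match on it. It is combined with the name because four digits alone
# do not identify a patient uniquely.
proxy_filter = {"name": doc["name"], "last4ssn": doc["last4ssn"]}
print(proxy_filter)
```

The derived field has to be written at insert time by the application, since the server never sees the plaintext ssn.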

F. Convert to a Remote Master Key

MedcoMD is confident that they have set up their application correctly to use CSFLE. Now, they want to take the app to a production-ready state. They used a locally-managed master key in development but for production they need to use a remote Key Management Service.

MedcoMD converts their application to use AWS KMS with the following procedure:

1

Create an AWS IAM User

Create a new programmatic IAM user to use in CSFLE-enabled clients. The user will encrypt and decrypt the remote master key and must have full List and Read permissions for the KMS service.

Client IAM User Credentials

The CSFLE-enabled client takes the IAM User’s Access Key ID and Secret Access Key as configuration values. Note these down for later when we reconfigure the client.

2

Create the Master Key

In AWS KMS, generate a new master key. The key’s name and description don’t affect the functionality of CSFLE but should describe that it’s for the CSFLE-enabled client.

In the Usage Permissions step of the key generation process, select the newly created IAM User with full KMS List and Read permissions. This allows the user to encrypt and decrypt the new master key.

Important

The new client IAM User should not have administrative permissions for the master key.

3

Specify the AWS KMS Provider Credentials

Unlike the local KMS provider, the AWS KMS provider does not accept the master key directly from the client configuration code. Instead, it accepts the Access Key ID and Secret Access Key of the IAM user with permission to encrypt and decrypt the master key.

Update the KMS Provider configuration in CSFLE-enabled client creation code:

kms_providers = {
    "aws": {
        "accessKeyId": "<IAM User Access Key ID>",
        "secretAccessKey": "<IAM User Secret Access Key>"
    }
}

BsonString masterKeyRegion = new BsonString("<Master Key AWS Region>"); // e.g. "us-east-2"
BsonString awsAccessKeyId = new BsonString("<IAM User Access Key ID>");
BsonString awsSecretAccessKey = new BsonString("<IAM User Secret Access Key>");
Map<String, Map<String, Object>> kmsProviders = new HashMap<String, Map<String, Object>>();
Map<String, Object> providerDetails = new HashMap<String, Object>();

providerDetails.put("accessKeyId", awsAccessKeyId);
providerDetails.put("secretAccessKey", awsSecretAccessKey);
providerDetails.put("region", masterKeyRegion);

kmsProviders.put("aws", providerDetails);

const kmsProviders = {
  aws: {
    accessKeyId: '<IAM User Access Key ID>',
    secretAccessKey: '<IAM User Secret Access Key>',
  }
}
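
The placeholders above stand in for real credentials. One way to keep them out of source code, sketched here as an assumption and not as part of the MedcoMD procedure, is to read them from environment variables. The helper name `aws_kms_providers_from_env` is hypothetical; the variable names follow the common AWS convention.

```python
import os


def aws_kms_providers_from_env(env=os.environ):
    """Build the "aws" KMS provider settings from environment variables."""
    try:
        return {
            "aws": {
                "accessKeyId": env["AWS_ACCESS_KEY_ID"],
                "secretAccessKey": env["AWS_SECRET_ACCESS_KEY"],
            }
        }
    except KeyError as missing:
        # Fail early with a clear message instead of passing empty credentials.
        raise RuntimeError(f"missing credential variable: {missing}") from None


# Called with an explicit mapping so the sketch runs without real credentials.
kms_providers = aws_kms_providers_from_env({
    "AWS_ACCESS_KEY_ID": "<IAM User Access Key ID>",
    "AWS_SECRET_ACCESS_KEY": "<IAM User Secret Access Key>",
})
print(sorted(kms_providers["aws"]))
```
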
4

Create a New Data Key

The development data key was generated from a local master key, so you need to generate a new data key from the remote master key. To generate the key from an AWS KMS master key, you will need to know the key’s AWS region and Amazon Resource Name (ARN).
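
Because the region is embedded in the ARN, the two values can be kept consistent by deriving one from the other. The following is a small sketch under that assumption; `region_from_kms_arn` is a hypothetical helper, and the ARN is the example value used elsewhere in this guide.

```python
def region_from_kms_arn(arn):
    """Return the region component of a KMS ARN.

    ARNs have the form arn:partition:service:region:account-id:resource.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "kms":
        raise ValueError(f"not a KMS ARN: {arn!r}")
    return parts[3]


master_key_arn = "arn:aws:kms:us-east-2:111122223333:alias/test-key"
# Master key description in the shape expected when creating a data key.
master_key = {
    "region": region_from_kms_arn(master_key_arn),
    "key": master_key_arn,
}
print(master_key["region"])  # us-east-2
```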

Once you have the required information, run the following code to generate the new data key:

import base64

from pymongo import MongoClient
from pymongo.encryption import ClientEncryption
from bson.binary import STANDARD
from bson.codec_options import CodecOptions

connection_string = "mongodb://localhost:27017"
key_vault_namespace = "encryption.__keyVault"

# Regular client used to read from and write to the key vault collection
client = MongoClient(connection_string)

client_encryption = ClientEncryption(
    kms_providers,  # pass in the kms_providers from the previous step
    key_vault_namespace,
    client,
    CodecOptions(uuid_representation=STANDARD)
)

# The "aws" provider requires the master key's region and ARN
data_key_id = client_encryption.create_data_key(
    "aws",
    master_key={
        "region": "<Master Key AWS Region>",  # e.g. "us-east-2"
        "key": "<Master Key ARN>"  # e.g. "arn:aws:kms:us-east-2:111122223333:alias/test-key"
    }
)

base64_data_key_id = base64.b64encode(data_key_id).decode("utf-8")
print("DataKeyId [base64]: ", base64_data_key_id)

ClientEncryption clientEncryption = ClientEncryptions.create(ClientEncryptionSettings.builder()
    .keyVaultMongoClientSettings(MongoClientSettings.builder()
        .applyConnectionString(new ConnectionString("mongodb://localhost:27017"))
        .build())
    .keyVaultNamespace(keyVaultNamespace)
    .kmsProviders(kmsProviders)
    .build());

BsonString masterKeyRegion = new BsonString("<Master Key AWS Region>"); // e.g. "us-east-2"
BsonString masterKeyArn = new BsonString("<Master Key ARN>"); // e.g. "arn:aws:kms:us-east-2:111122223333:alias/test-key"
DataKeyOptions dataKeyOptions = new DataKeyOptions().masterKey(
    new BsonDocument()
        .append("region", masterKeyRegion)
        .append("key", masterKeyArn));

BsonBinary dataKeyId = clientEncryption.createDataKey("aws", dataKeyOptions);
String base64DataKeyId = Base64.getEncoder().encodeToString(dataKeyId.getData());

System.out.println("DataKeyId [base64]: " + base64DataKeyId);

const encryption = new ClientEncryption(client, {
    keyVaultNamespace,
    kmsProviders
});
const key = await encryption.createDataKey('aws', {
   masterKey: {
     key: '<Master Key ARN>', // e.g. 'arn:aws:kms:us-east-2:111122223333:alias/test-key'
     region: '<Master Key AWS Region>', // e.g. 'us-east-2'
   }
});

const base64DataKeyId = key.toString('base64');
console.log('DataKeyId [base64]: ', base64DataKeyId);

5

Update the JSON Schema

The development JSON Schema directly referenced the development data key id, so you need to generate a new JSON Schema. To do so, repeat the Define a JSON Schema step with the new data key that was created with the remote master key.
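
As a rough sketch of what regenerating the schema involves, the helper below rebuilds a schema document around a base64-encoded data key id. The function name `build_patient_schema` and the assignment of algorithms to fields are assumptions for illustration; use the field definitions from your own Define a JSON Schema step. The key id is written in extended JSON form, so it would need to be parsed into BSON (for example with `bson.json_util`) before being passed to a driver.

```python
DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
RANDOM = "AEAD_AES_256_CBC_HMAC_SHA_512-Random"


def build_patient_schema(base64_data_key_id):
    """Return a CSFLE JSON Schema that encrypts patient fields with one data key."""

    def encrypted(bson_type, algorithm):
        return {"encrypt": {"bsonType": bson_type, "algorithm": algorithm}}

    return {
        "bsonType": "object",
        # encryptMetadata applies the data key to every encrypted field below.
        "encryptMetadata": {
            "keyId": [{"$binary": {"base64": base64_data_key_id, "subType": "04"}}]
        },
        "properties": {
            # Assumed mapping: fields that are queried use deterministic encryption.
            "ssn": encrypted("int", DETERMINISTIC),
            "bloodType": encrypted("string", RANDOM),
            "medicalRecords": encrypted("array", RANDOM),
            "insurance": {
                "bsonType": "object",
                "properties": {"policyNumber": encrypted("int", DETERMINISTIC)},
            },
        },
    }


# Placeholder for the base64 id printed when the new data key was created.
schema = build_patient_schema("<New Data Key Id [base64]>")
print(schema["encryptMetadata"])
```

Only the `keyId` value changes between the development and production schemas; the field definitions stay the same.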

Recap

After following the procedure, MedcoMD engineers have converted the application to use a remote master key. They used the master key to generate a new data key and updated the JSON schema with that data key.

The MedcoMD engineers are now confident that they have a production-ready, CSFLE-enabled client.

Summary

MedcoMD wanted to develop a system that securely stores sensitive medical records for their patients. They also wanted strong data access and security guarantees that do not rely on individual users. After researching the available options, MedcoMD determined that MongoDB Client-Side Field Level Encryption satisfies their requirements and decided to implement it in their application. To implement CSFLE they:

1. Created a Locally-Managed Master Encryption Key

A locally-managed master key allowed MedcoMD to rapidly develop the client application without external dependencies and avoid accidentally leaking sensitive production credentials.

2. Generated an Encrypted Data Key with the Master Key

CSFLE uses envelope encryption, so they generated a data key that encrypts and decrypts each field and then encrypted the data key using a master key. This allows MedcoMD to store the encrypted data key in MongoDB so that it is shared with all clients, while clients that don’t have access to the master key cannot use it.

3. Created a JSON Schema

CSFLE can automatically encrypt and decrypt fields based on a provided JSON Schema that specifies which fields to encrypt and how to encrypt them.

4. Tested and Validated Queries with the CSFLE Client

MedcoMD engineers tested their CSFLE implementation by inserting and querying documents with encrypted fields. They then validated that clients without CSFLE enabled could not read the encrypted data.

5. Took the Client to Production

MedcoMD converted the application to use a remote master key instead of the locally-managed master key. They generated a new data key from the remote master key and used it to update the JSON Schema. Once they had the updated schema, they were ready to go to production.

For more information on client-side field level encryption in MongoDB, see the reference documentation in the server manual.

For additional information on CSFLE, see the official Java driver documentation

For additional information on CSFLE, see the official Node.js driver documentation