Client-Side Field Level Encryption Guide¶
Who Is This Guide For?¶
This guide shows you how to implement automatic Client-Side Field Level Encryption (CSFLE) using supported MongoDB drivers and is intended for full-stack developers. The guide presents the following information in the context of a real-world scenario:
- Introduction: How Client-Side Field Level Encryption works
- Comparison of Security Features: Reasons to choose this security feature
- Implementation: How to implement Client-Side Field Level Encryption with the MongoDB driver
For a runnable example of all the functionality demonstrated in this guide, see the Download Example Project section.
Once you complete the steps in this guide, you should have:
- an understanding of how client-side field level encryption works and in what situations it is practical
- a working client application that demonstrates automatic CSFLE
- resources on how to move the sample client application to production
Introduction¶
Applications frequently use and store sensitive data such as confidential personal details, payment information, or proprietary data. In some jurisdictions, this type of data is subject to governance, privacy, and security compliance mandates. Unauthorized access of sensitive data or a failure to comply with a mandate often results in significant reputation damage and financial penalties. Therefore, it is important to keep sensitive data secure.
MongoDB offers several methods that protect your data from unauthorized access, including Role-Based Access Control, Encryption at Rest, and Transport Encryption using TLS/SSL.
Another MongoDB feature that prevents unauthorized access of data is Client-Side Field Level Encryption (CSFLE). This feature allows a developer to selectively encrypt individual fields of a document on the client side before it is sent to the server. This keeps the encrypted data private from the providers hosting the database as well as from any user who has direct access to the database.
This guide provides steps for setup and implementation of CSFLE with a practical example.
Automatic Client-Side Field Level Encryption is available starting in MongoDB 4.2 Enterprise only.
Scenario¶
In this scenario, we secure sensitive data in a Medical Care Management System which stores patients' personal information, insurance information, and medical records for a fictional company, MedcoMD. None of the patient data is public, and certain data such as a patient's social security number (SSN, a US government-issued ID number), insurance policy number, and vital sign measurements are particularly sensitive and subject to privacy compliance. It is important for the company and the patient that the data is kept private and secure.
MedcoMD needs this system to satisfy the following use cases:
- Doctors use the system to access Patients' medical records, insurance information, and add new vital sign measurements.
- Receptionists use the system to verify the Patients' identity, using a combination of their contact information and the last four digits of their Social Security Number (SSN).
- Receptionists can view a Patient's insurance policy provider, but not their policy number.
- Receptionists cannot access a Patient's medical records.
MedcoMD is also concerned with disclosure of sensitive data through any of the following methods:
- Accidental disclosure of data on the Receptionist's publicly-viewable screen.
- Direct access to the database by a superuser such as a database administrator.
- Capture of data over an insecure network.
- Access to the data by reading a server's memory.
- Access to the on-disk data by reading database or backup files.
What can MedcoMD do to balance the functionality and access restrictions of their Medical Care Management System?
Comparison of Security Features¶
The MedcoMD engineers review the Medical Care Management System specification and research the proper solution for limiting access to sensitive data.
The first MongoDB security feature they evaluated was Role-Based Access Control, which allows administrators to grant and restrict collection-level permissions for users. With the appropriate role definition and assignment, this solution prevents accidental disclosure of the data. However, it does not prevent capture of the data over an insecure network, direct access of data by a superuser, access to data by reading the server's memory, or access to on-disk data by reading the database or backup files.
The next MongoDB security features they evaluated were Encryption at Rest which encrypts the database files on disk and Transport Encryption using TLS/SSL which encrypts data over the network. When applied together, these two features prevent access to on-disk database files as well as capture of the data on the network, respectively. When combined with Role-Based Access Control, these three security features offer near-comprehensive security coverage of the sensitive data, but lack a mechanism to prevent the data from being read from the server's memory.
Finally, the MedcoMD engineers discovered a feature that independently satisfies all the security criteria. Client-side Field Level Encryption allows the engineers to specify the fields of a document that should be kept encrypted. Sensitive data is transparently encrypted/decrypted by the client and only communicated to and from the server in encrypted form. This mechanism keeps the specified data fields secure in encrypted form on both the server and the network. While all clients have access to the non-sensitive data fields, only appropriately-configured CSFLE clients are able to read and write the sensitive data fields.
The following diagram lists the MongoDB security features and the potential security vulnerabilities that each one addresses:

MedcoMD will provide Receptionists with a client that is not configured to access data encrypted with CSFLE. This will prevent them from viewing the sensitive fields and accidentally leaving them displayed on-screen in a public area. MedcoMD will provide Doctors with a client with CSFLE enabled which will allow them to access the sensitive data fields in the privacy of their own office.
Equipped with CSFLE, MedcoMD can keep their sensitive data secure and compliant with data privacy regulations with MongoDB.
Implementation¶
This section explains the following configuration and implementation details of CSFLE:
- Software required to run your client and server in your local development environment.
- Creation and validation of the encryption keys.
- Configuration of the client for automatic field-level encryption.
- Queries, reads, and writes of encrypted fields.
Requirements¶
- MongoDB Server 4.2 Enterprise
- For installation instructions, refer to the Enterprise Edition Installation Tutorials.
- MongoDB Driver Compatible with CSFLE
- For a list of drivers that support CSFLE, refer to the driver compatibility table.
- File System Permissions
- The client application or a privileged user needs permissions to start the mongocryptd process on the host.
- Additional Dependencies
- Additional dependencies for specific language drivers are required to use CSFLE or run through examples in this guide. To see the list, select the appropriate driver tab below.
Dependency Name | Description |
---|---|
pymongocrypt | Python wrapper for the libmongocrypt encryption library. |
A. Create a Master Key¶
MongoDB Client-Side Field Level Encryption (CSFLE) uses an encryption strategy called envelope encryption in which keys used to encrypt/decrypt data (called data encryption keys) are encrypted with another key (called the master key). For more information on the features of envelope encryption and key management concepts, see AWS Key Management Service Concepts.
In this step, we create and store the master key, used by the MongoDB driver to encrypt data encryption keys, in the Local Key Provider which is the filesystem in our local development environment. We refer to this key as the "locally-managed master key" in this guide.
The following diagram shows how the master key is created and stored:

The data encryption keys, generated and used by the MongoDB driver to encrypt and decrypt document fields, are stored in a key vault collection in the same MongoDB replica set as the encrypted data.
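To make the envelope encryption relationship concrete, the following sketch wraps a data encryption key with a master key and then uses the data key to protect a field value. It is illustrative only: `toy_encrypt` and `toy_decrypt` are hypothetical helpers built on a SHA-256 keystream so that the example runs with the Python standard library alone. They are a stand-in for, not an implementation of, the AEAD_AES_256_CBC_HMAC_SHA_512 algorithms that CSFLE uses, and they must not be used to protect real data.

```python
import hashlib
import os


def _keystream(key, nonce, length):
    """Derive `length` bytes from key and nonce (toy construction, NOT secure)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return stream[:length]


def toy_encrypt(key, plaintext):
    """Hypothetical stand-in cipher: XOR the plaintext with a keyed stream."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key, blob):
    """Reverse toy_encrypt: split off the nonce and XOR with the same stream."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


# The master key never leaves the key provider (here, the local filesystem).
master_key = os.urandom(96)

# The data key encrypts fields; only its wrapped form is stored in the key vault.
data_key = os.urandom(32)
wrapped_data_key = toy_encrypt(master_key, data_key)

# A field value is encrypted with the data key, not with the master key.
encrypted_ssn = toy_encrypt(data_key, b"241014209")

# Reading: unwrap the data key with the master key, then decrypt the field.
recovered_data_key = toy_decrypt(master_key, wrapped_data_key)
decrypted_ssn = toy_decrypt(recovered_data_key, encrypted_ssn)
```

The point of the two layers is that a client without the master key can read the wrapped data key from the key vault but cannot unwrap it, and therefore cannot decrypt any field.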
The Local Key Provider is an insecure method of storage and is therefore not recommended if you plan to use CSFLE in production. Instead, you should configure a master key in a Key Management System (KMS) which stores and decrypts your data encryption keys remotely.
To learn how to use a KMS in your CSFLE implementation, read the Client-Side Field Level Encryption: Use a KMS to Store the Master Key guide.
To begin development, MedcoMD engineers generate a master key and save it to a file with the fully runnable code below:
The following script generates a 96-byte locally-managed master key and saves it to a file called master-key.txt in the directory from which the script is executed.
    import os

    path = "master-key.txt"
    file_bytes = os.urandom(96)  # 96 random bytes from the operating system
    with open(path, "wb") as f:
        f.write(file_bytes)
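Later steps need the key back in memory. A minimal sketch of reading and validating it follows; the function name `load_master_key` is hypothetical, and the length check simply mirrors the 96-byte size that the script above writes.

```python
import os


def load_master_key(path="master-key.txt"):
    """Read the locally-managed master key and verify that it is 96 bytes long."""
    with open(path, "rb") as f:
        key = f.read()
    if len(key) != 96:
        raise ValueError(f"expected a 96-byte master key, got {len(key)} bytes")
    return key
```

For example, `local_master_key = load_master_key()` yields the bytes that the local KMS provider configuration expects later in this guide.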
B. Create a Data Encryption Key¶
In this section, we generate a data encryption key. The MongoDB driver stores the key in a key vault collection where CSFLE-enabled clients can access the key for automatic encryption and decryption.
The following diagram shows how the data encryption keys are created and stored:

The client requires the following configuration values to generate a new data encryption key:
- The locally-managed master key.
- A MongoDB connection string that authenticates on a running server.
- The key vault namespace (database and collection).
- A unique index for the key vault collection on the keyAltNames field.
Follow the steps below to generate a single data encryption key from the locally-managed master key.
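The procedure can be sketched with PyMongo's `ClientEncryption` helper as shown below. Treat it as a sketch under stated assumptions rather than the guide's reference code: the function names `split_namespace` and `create_data_key`, the alternate key name `demo-data-key`, and the partial filter on the index are illustrative choices, and the PyMongo imports are placed inside the function only so that the namespace helper can be used without the driver installed.

```python
import base64

KEY_VAULT_NAMESPACE = "encryption.__keyVault"


def split_namespace(namespace):
    """Split "database.collection" into its two parts at the first dot."""
    db_name, _, coll_name = namespace.partition(".")
    if not db_name or not coll_name:
        raise ValueError(f"invalid namespace: {namespace!r}")
    return db_name, coll_name


def create_data_key(connection_string, local_master_key, key_alt_name="demo-data-key"):
    """Create one data key and return its id as a base64 string (sketch)."""
    # Imported here so split_namespace stays usable without PyMongo.
    # Requires pymongo and pymongocrypt, plus a running MongoDB server.
    from bson.binary import STANDARD
    from bson.codec_options import CodecOptions
    from pymongo import MongoClient
    from pymongo.encryption import ClientEncryption

    kms_providers = {"local": {"key": local_master_key}}
    client = MongoClient(connection_string)
    db_name, coll_name = split_namespace(KEY_VAULT_NAMESPACE)

    # Unique index on keyAltNames; the partial filter (an assumption of this
    # sketch) limits the constraint to documents that define the field.
    client[db_name][coll_name].create_index(
        "keyAltNames",
        unique=True,
        partialFilterExpression={"keyAltNames": {"$exists": True}},
    )

    client_encryption = ClientEncryption(
        kms_providers,
        KEY_VAULT_NAMESPACE,
        client,
        CodecOptions(uuid_representation=STANDARD),
    )
    try:
        # The driver encrypts the new data key with the master key and
        # stores the result in the key vault collection.
        data_key_id = client_encryption.create_data_key(
            "local", key_alt_names=[key_alt_name]
        )
    finally:
        client_encryption.close()
        client.close()
    return base64.b64encode(data_key_id).decode("ascii")
```

The returned base64 string is a convenient form for copying the key id into a schema definition.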
C. Specify Encrypted Fields Using JSON Schema¶
MongoDB drivers use an extended version of the JSON Schema standard to configure automatic client-side encryption and decryption of specific fields of the documents in a collection.
Automatic CSFLE requires MongoDB Enterprise or MongoDB Atlas.
The MongoDB CSFLE extended JSON Schema standard requires the following information:
- The encryption algorithm to use when encrypting each field (Deterministic Encryption or Random Encryption)
- One or more data encryption keys encrypted with the CSFLE master key
- The BSON Type of each field (only required for deterministically encrypted fields)
MongoDB drivers use JSON Schema syntax to specify encrypted fields and only support field-level encryption-specific keywords documented in Automatic Encryption JSON Schema Syntax. Any other document validation instances will cause the client to throw an error.
You can prevent clients that are not configured with the appropriate client-side JSON Schema from writing unencrypted data to a field by using server-side JSON Schema. The server-side JSON Schema provides only supplemental enforcement of the client-side JSON Schema. For more details on server-side document validation implementation, see Enforce Field Level Encryption Schema.
The MedcoMD engineers receive specific requirements for the fields of data and their encryption strategies. The following table illustrates the data model of the Medical Care Management System.
Field | Encryption Algorithm | BSON Type |
---|---|---|
Name | Non-Encrypted | String |
SSN | Deterministic | Int |
Blood Type | Random | String |
Medical Records | Random | Array |
Insurance: Policy Number | Deterministic | Int (embedded inside insurance object) |
Insurance: Provider | Non-Encrypted | String (embedded inside insurance object) |
Data Encryption Key¶
The MedcoMD engineers created a single data key to use when encrypting all fields in the data model. To configure this, they specify the encryptMetadata key at the root level of the JSON Schema. As a result, all encrypted fields defined in the properties field of the schema will inherit this encryption key unless specifically overwritten.
    {
      "bsonType" : "object",
      "encryptMetadata" : {
        "keyId" : // copy and paste your keyId generated here
      },
      "properties": {
        // copy and paste your field schemas here
      }
    }
MedcoMD engineers create JSON objects for each field and append them to the properties map.
SSN¶
The ssn field represents the patient's social security number. This field is sensitive and should be encrypted. MedcoMD engineers decide upon deterministic encryption based on the following properties:
- Queryable
- High cardinality
    "ssn": {
      "encrypt": {
        "bsonType": "int",
        "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
      }
    }
Blood Type¶
The bloodType field represents the patient's blood type. This field is sensitive and should be encrypted. MedcoMD engineers decide upon random encryption based on the following properties:
- No plans to query
- Low cardinality
    "bloodType": {
      "encrypt": {
        "bsonType": "string",
        "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Random"
      }
    }
Medical Records¶
The medicalRecords field is an array that contains a set of medical record documents. Each medical record document represents a separate visit and specifies information about the patient at that time, such as their blood pressure, weight, and heart rate. This field is sensitive and should be encrypted. MedcoMD engineers decide upon random encryption based on the following properties:
- Array fields must use random encryption with CSFLE to enable auto-encryption
    "medicalRecords": {
      "encrypt": {
        "bsonType": "array",
        "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Random"
      }
    }
Insurance Policy Number¶
The insurance.policyNumber field is embedded inside the insurance field and represents the patient's policy number. This policy number is a distinct and sensitive field. MedcoMD engineers decide upon deterministic encryption based on the following properties:
- Queryable
- High cardinality
    "insurance": {
      "bsonType": "object",
      "properties": {
        "policyNumber": {
          "encrypt": {
            "bsonType": "int",
            "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
          }
        }
      }
    }
Recap¶
MedcoMD engineers created a JSON Schema that satisfies their requirements of making sensitive data queryable and secure. View the full JSON Schema for the Medical Care Management System.
View the complete runnable helper code in Python.
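As a recap, the field definitions from this section can be assembled into one schema in Python. This is a sketch, not the guide's helper code: `encrypted_field` and `build_patient_schema` are hypothetical names, and `key_id` is assumed to be the data key's id in the form the driver expects (a BSON Binary UUID), passed in by the caller so that the example itself needs only the standard library.

```python
DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
RANDOM = "AEAD_AES_256_CBC_HMAC_SHA_512-Random"


def encrypted_field(bson_type, algorithm):
    """Return the "encrypt" definition for one field."""
    return {"encrypt": {"bsonType": bson_type, "algorithm": algorithm}}


def build_patient_schema(key_id):
    """Assemble the JSON Schema for the patients collection.

    key_id is assumed to be the data key id as a BSON Binary UUID.
    keyId takes an array, so the single key is wrapped in a list.
    """
    return {
        "bsonType": "object",
        "encryptMetadata": {"keyId": [key_id]},
        "properties": {
            "ssn": encrypted_field("int", DETERMINISTIC),
            "bloodType": encrypted_field("string", RANDOM),
            "medicalRecords": encrypted_field("array", RANDOM),
            "insurance": {
                "bsonType": "object",
                "properties": {
                    "policyNumber": encrypted_field("int", DETERMINISTIC),
                },
            },
        },
    }
```

The name and insurance.provider fields are absent from the schema on purpose: fields without an encrypt definition are stored unencrypted.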
D. Create the MongoDB Client¶
The MedcoMD engineers now have the JSON Schema and encryption keys necessary to create a CSFLE-enabled MongoDB client.
They build the client to communicate with a MongoDB cluster and perform actions such as securely reading and writing documents with encrypted fields.
About the Mongocryptd Application¶
The MongoDB client communicates with a separate encryption application called mongocryptd which automates the client-side field level encryption. This application is installed with MongoDB Enterprise Server (version 4.2 and later).
When we create a CSFLE-enabled MongoDB client, the mongocryptd process is automatically started by default, and handles the following responsibilities:
- Validates the encryption instructions defined in the JSON Schema and flags the referenced fields for encryption in read and write operations.
- Prevents unsupported operations from being executed on encrypted fields.
When the mongocryptd process is started with the client driver, you can provide configurable parameters including:
Name | Description |
---|---|
port | Listening port. Default: 27020 |
idleShutdownTimeoutSecs | Number of idle seconds that the mongocryptd process waits before exiting. Default: 60 |
If a mongocryptd process is already running on the port specified by the driver, the driver may log a warning and continue to operate without spawning a new process. Any settings specified by the driver only apply once the existing process exits and a new encrypted client attempts to connect.
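With PyMongo, these parameters are passed to the spawned process through the `mongocryptd_spawn_args` option of `AutoEncryptionOpts`, and the driver is told where to connect through `mongocryptd_uri`. The following sketch builds such an options dictionary; the function name `build_mongocryptd_options` and the example values are assumptions chosen for illustration.

```python
def build_mongocryptd_options(port=27020, idle_shutdown_secs=60, spawn_path="mongocryptd"):
    """Build keyword options for AutoEncryptionOpts that configure mongocryptd."""
    return {
        # Where the driver connects to reach mongocryptd.
        "mongocryptd_uri": f"mongodb://localhost:{port}",
        # Binary to spawn; a bare name is resolved through the system path.
        "mongocryptd_spawn_path": spawn_path,
        # Command-line arguments handed to the spawned process.
        "mongocryptd_spawn_args": [
            f"--port={port}",
            f"--idleShutdownTimeoutSecs={idle_shutdown_secs}",
        ],
    }
```

Keeping the port in one place avoids a mismatch between the port that mongocryptd listens on and the URI that the driver connects to.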
For additional information on mongocryptd, refer to the mongocryptd manual page.
The MedcoMD engineers use the following procedure to configure and instantiate the MongoDB client:
Specify the Key Vault Collection Namespace¶
The key vault collection contains the data key that the client uses to encrypt and decrypt fields. MedcoMD uses the collection encryption.__keyVault as the key vault in the following code snippet.
key_vault_namespace = "encryption.__keyVault"
Specify the Local Master Encryption Key¶
The client expects a key management system to store and provide the application's master encryption key. For now, MedcoMD only has a local master key, so they use the local KMS provider and specify the key inline with the following code snippet.
kms_providers = { "local": { "key": local_master_key } }
Map the JSON Schema to the Patients Collection¶
The MedcoMD engineers assign their schema to a variable. The JSON Schema that MedcoMD defined doesn't explicitly specify the collection to which it applies. To assign the schema, they map it to the medicalRecords.patients collection namespace in the following code snippet:
patient_schema = { "medicalRecords.patients": json_schema }
Specify the Location of the Encryption Binary¶
MongoDB drivers communicate with the mongocryptd encryption binary to perform automatic client-side field level encryption. Configure the client to spawn the mongocryptd process by specifying the path to the binary using the following configuration options:
extra_options = { 'mongocryptd_spawn_path': '/usr/local/bin/mongocryptd' }
If the mongocryptd daemon is already running, you can configure the client to skip starting it by passing the following option:
extra_options['mongocryptd_bypass_spawn'] = True
Create the MongoClient¶
To create the CSFLE-enabled client, MedcoMD instantiates a standard MongoDB client object with the additional automatic encryption settings with the following code snippet:
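    from pymongo import MongoClient
    from pymongo.encryption_options import AutoEncryptionOpts

    # Combine the KMS provider, key vault namespace, schema map, and mongocryptd options
    fle_opts = AutoEncryptionOpts(
        kms_providers,
        key_vault_namespace,
        schema_map=patient_schema,
        **extra_options
    )
    client = MongoClient(connection_string, auto_encryption_opts=fle_opts)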
E. Perform Encrypted Read/Write Operations¶
The MedcoMD engineers now have a CSFLE-enabled client and can test that the client can perform queries that meet the requirements. Doctors should be able to read and write to all fields, and receptionists should only be allowed to read and write to non-sensitive fields.
Insert a Document with Encrypted Fields¶
The following diagram shows the steps taken by the client application and driver to perform a write of field-level encrypted data:

MedcoMD engineers write a function to create a new patient record with the following code snippet:
    def insert_patient(collection, name, ssn, blood_type, medical_records, policy_number, provider):
        insurance = {
            'policyNumber': policy_number,
            'provider': provider
        }
        doc = {
            'name': name,
            'ssn': ssn,
            'bloodType': blood_type,
            'medicalRecords': medical_records,
            'insurance': insurance
        }
        collection.insert_one(doc)
When a CSFLE-enabled client inserts a new patient record into the Medical Care Management System, it automatically encrypts the fields specified in the JSON Schema. This operation creates a document similar to the following:
    {
      "_id": "5d7a7bbe6d58fd263b6d7315",
      "name": "Jon Doe",
      "ssn": "Ac+ZbPM+sk7gl7CJCcIzlRAQUJ+uo/0WhqX+KbTNdhqCszHucqXNiwqEUjkGlh7gK8pm2JhIs/P3//nkVP0dWu8pSs6TJnpfUwRjPfnI0TURzQ==",
      "bloodType": "As+ZbPM+sk7gl7CJCcIzlRACESwHCTCtK/lQV9kF6/LRoL3mh59gzBVA42vGBVfLIycYWpfAy7ZCi2eRGEgMX5CrGl259Wfu6Zf/ELBVqQDnyQ==",
      "medicalRecords": "As+ZbPM+sk7gl7CJCcIzlRAEFt249toVYOlvlC/79cAtQ5jvE/ukF1ZLxRZn1g0zBBtPnf6L0AFTKMVdNJnjMGPMTszYU58qRE9uMvCU05DVHYl8DJnbtGXXFRLJ7ElQOc=",
      "insurance": {
        "provider": "MaestCare",
        "policyNumber": "Ac+ZbPM+sk7gl7CJCcIzlRAQm7kFhN1hy3l7Wt3BSpBMbvVSuiaDsf3UPF9bvJLTEcC+Ka+3kZI4SVZinj4tyc5uDYeyh6+7phpKrQo4CHWyg=="
      }
    }
Clients that do not have CSFLE configured will insert unencrypted data. We recommend using server-side schema validation to enforce encrypted writes for fields that should be encrypted.
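One way to apply such server-side validation to an existing collection is the `collMod` database command with a `$jsonSchema` validator. The sketch below assumes `db` is a PyMongo `Database` object and that the same schema used on the client is acceptable to the server; the function name `enforce_encryption_schema` is hypothetical.

```python
def enforce_encryption_schema(db, collection_name, json_schema):
    """Attach the encryption schema to a collection as a server-side validator.

    db is assumed to be a pymongo Database; the collection must already exist.
    """
    validator = {"$jsonSchema": json_schema}
    # collMod updates the validator of an existing collection.
    db.command("collMod", collection_name, validator=validator)
    return validator
```

With the validator in place, the server can reject writes in which a field declared as encrypted arrives as plain data, which is the supplemental enforcement described earlier in this guide.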
Query for Documents on a Deterministically Encrypted Field¶
The following diagram shows the steps taken by the client application and driver to query and decrypt field-level encrypted data:

You can run queries on documents with encrypted fields using standard MongoDB driver methods. When a doctor performs a query in the Medical Care Management System to search for a patient by their SSN, the driver decrypts the patient's data before returning it:
    {
      "_id": "5d6ecdce70401f03b27448fc",
      "name": "Jon Doe",
      "ssn": 241014209,
      "bloodType": "AB+",
      "medicalRecords": [
        {
          "weight": 180,
          "bloodPressure": "120/80"
        }
      ],
      "insurance": {
        "provider": "MaestCare",
        "policyNumber": 123142
      }
    }
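A query like the one described above might be written as follows. This is a sketch: `find_patient_by_ssn` is a hypothetical helper, and `collection` is assumed to be the patients collection obtained from a CSFLE-enabled client. The integer check reflects the schema, which declares ssn with BSON type int.

```python
def find_patient_by_ssn(collection, ssn):
    """Look up one patient by SSN using an ordinary equality query."""
    if not isinstance(ssn, int):
        raise TypeError("ssn must be an int to match the schema's bsonType")
    # With a CSFLE-enabled client, the driver encrypts the query value before
    # sending it and decrypts the matching document before returning it.
    return collection.find_one({"ssn": ssn})
```

The application code contains no encryption calls; the equality match works because the ssn field uses deterministic encryption.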
For queries using a client that is not configured to use CSFLE, such as when receptionists in the Medical Care Management System search for a patient with their ssn, a null value is returned. A client without CSFLE configured cannot query on a sensitive field.
Query for Documents on a Randomly Encrypted Field¶
You cannot directly query for documents on a randomly encrypted field. However, you can query on another field that contains an approximation of the randomly encrypted data to find the document.
MedcoMD engineers determined that the fields they randomly encrypted would not be used to find patient records. Had this been required, for example, if the patient's ssn was randomly encrypted, MedcoMD engineers could have included another plain-text field called last4ssn that contains the last 4 digits of the ssn field. They could then query on this field as a proxy for the ssn.
    {
      "_id": "5d6ecdce70401f03b27448fc",
      "name": "Jon Doe",
      "ssn": 241014209,
      "last4ssn": 4209,
      "bloodType": "AB+",
      "medicalRecords": [
        {
          "weight": 180,
          "bloodPressure": "120/80"
        }
      ],
      "insurance": {
        "provider": "MaestCare",
        "policyNumber": 123142
      }
    }
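The proxy field has to be derived on the client before the document is inserted, because the server never sees the plain ssn. A minimal sketch, with the hypothetical helper name `with_last4ssn` and the assumption that ssn is stored as an integer as in the data model above:

```python
def with_last4ssn(patient):
    """Return a copy of the patient document with a plain-text last4ssn field."""
    doc = dict(patient)
    # The remainder after dividing by 10000 keeps the last four decimal digits.
    doc["last4ssn"] = doc["ssn"] % 10000
    return doc
```

Because many patients can share the same last four digits, a query on last4ssn narrows the candidates rather than identifying one patient. Storing the value as an integer also drops leading zeros, so an ending such as 0042 is stored as 42.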
Summary¶
MedcoMD wanted to develop a system that securely stores sensitive medical records for their patients. They also wanted strong data access and security guarantees that do not rely on individual users. After researching the available options, MedcoMD determined that MongoDB Client-Side Field Level Encryption satisfies their requirements and decided to implement it in their application. To implement CSFLE they:
1. Created a Locally-Managed Master Encryption Key
A locally-managed master key allowed MedcoMD to rapidly develop the client application without external dependencies and avoid accidentally leaking sensitive production credentials.
2. Generated an Encrypted Data Key with the Master Key
CSFLE uses envelope encryption, so they generated a data key that encrypts and decrypts each field and then encrypted the data key using a master key. This allows MedcoMD to store the encrypted data key in MongoDB so that it is shared with all clients while preventing access by clients that don't have access to the master key.
3. Created a JSON Schema
CSFLE can automatically encrypt and decrypt fields based on a provided JSON Schema that specifies which fields to encrypt and how to encrypt them.
4. Tested and Validated Queries with the CSFLE Client
MedcoMD engineers tested their CSFLE implementation by inserting and querying documents with encrypted fields. They then validated that clients without CSFLE enabled could not read the encrypted data.
Additional Information¶
Download Example Project¶
To view and download a runnable example of CSFLE, select your driver below:
GitHub: PyMongo CSFLE runnable example
Move to Production¶
In this guide, we stored the master key in your local filesystem. Since your data encryption keys would be readable by anyone who gains direct access to your master key, we strongly recommend that you use a more secure storage location such as a Key Management System (KMS).
For more information on securing your master key, see our step-by-step guide to integrating with Amazon KMS.
Further Reading¶
For more information on client-side field level encryption in MongoDB, check out the reference docs in the server manual:
- Client-Side Field Level Encryption
- Automatic Encryption JSON Schema Syntax
- Manage Client-Side Encryption Data Keys
For additional information on the MongoDB CSFLE API, see the official PyMongo driver documentation.