Apply Schemas

On this page

  • Overview
  • Default Schemas
  • Key Schema
  • Value Schema
  • Schemas For Transformed Documents
  • Specify Schemas
  • Infer a Schema

Overview

In this guide, you can learn how to apply schemas to incoming documents in a MongoDB Kafka Connector source connector.

Kafka Connect has two types of schema: key schemas and value schemas. Kafka Connect sends messages to Apache Kafka that contain both a key and a value. A key schema enforces a structure for the keys in those messages, and a value schema enforces a structure for the values.

Important
Note on Terminology

The word "key" has slightly different meanings in BSON and in Apache Kafka. In BSON, a "key" is a unique string identifier for a field in a document.

In Apache Kafka, a "key" is a byte array sent in a message that determines which partition of a topic the message is written to. Kafka keys can be null or can duplicate other keys.

Specifying schemas in the MongoDB Kafka Connector is optional, and you can specify any of the following combinations of schemas:

  • Only a value schema
  • Only a key schema
  • Both a value and key schema
  • No schemas

Tip
Benefits of Schema

For a discussion of the benefits of using schemas with Kafka Connect, see this article from Confluent.

If you want to send data through Apache Kafka with a specific data format, such as Apache Avro or JSON Schema, see the Converters guide.

To learn more about keys and values in Apache Kafka, see the official Apache Kafka introduction.

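Default Schemas

The MongoDB Kafka Connector provides two default schemas:

  • A key schema for the _id field of MongoDB change event documents
  • A value schema for MongoDB change event documents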

To learn more about change events, see our guide on change streams.

To learn more about default schemas, see their definitions in the MongoDB Kafka Connector source code.

Key Schema

The MongoDB Kafka Connector provides a default key schema for the _id field of change event documents. You should use the default key schema unless you remove the _id field from your change event document using either of the transformations described in the Schemas For Transformed Documents section of this guide.

If you specify either of these transformations and want to use a key schema for your incoming documents, you must specify a key schema as described in the Specify Schemas section of this guide.

You can enable the default key schema with the following option:

output.format.key=schema

Value Schema

The MongoDB Kafka Connector provides a default value schema for change event documents. You should use the default value schema unless you transform your change event documents as described in the Schemas For Transformed Documents section of this guide.

If you specify either of these transformations and want to use a value schema for your incoming documents, you must use one of the mechanisms described in the Schemas For Transformed Documents section of this guide.

You can enable the default value schema with the following option:

output.format.value=schema
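
As a minimal sketch, the two default-schema options might sit together in a source connector properties file like the one below. The connector name, connection URI, database, and collection are placeholder assumptions, not values from this guide.

```properties
# Hypothetical source connector configuration. The name, URI, database,
# and collection values are placeholders for illustration only.
name=mongo-source-default-schemas
connector.class=com.mongodb.kafka.connect.MongoSourceConnector
connection.uri=mongodb://localhost:27017
database=exampleDb
collection=exampleCollection

# Apply the default key schema to the _id field of each change event.
output.format.key=schema
# Apply the default value schema to each change event document.
output.format.value=schema
```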

Schemas For Transformed Documents

There are two ways you can transform your change event documents in a source connector:

  • The publish.full.document.only=true option
  • An aggregation pipeline that modifies the structure of change event documents
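
The two transformations listed above could look roughly like this in a connector properties file. The pipeline stages and the added field are illustrative assumptions. Each option can be used on its own.

```properties
# Option 1: publish only the fullDocument field of each change event.
publish.full.document.only=true

# Option 2: an aggregation pipeline that changes the shape of change events.
# The "source" field and its value are hypothetical.
pipeline=[{"$match": {"operationType": "insert"}}, {"$addFields": {"source": "example"}}]
```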

If you transform your MongoDB change event documents, you must do one of the following to apply schemas:

  • Specify schemas
  • Have the connector infer a value schema

To learn more about the preceding configuration options, see the Change Stream Properties page.

Specify Schemas

You can specify schemas for both document values and document keys using Avro schema syntax.
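
A sketch of the options for specifying key and value schemas follows. The key schema is left as a placeholder, and the Customer record with its two fields is a hypothetical example rather than a schema from this guide.

```properties
# Key schema: placeholder only, replace with your own Avro schema.
output.format.key=schema
output.schema.key=<your Avro key schema>

# Value schema: a hypothetical record with two fields.
output.format.value=schema
output.schema.value={"type": "record", "name": "Customer", "fields": [{"name": "name", "type": "string"}, {"name": "visits", "type": "int"}]}
```

Keeping each schema on a single line avoids the need for line-continuation characters in a properties file.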

To view an example that demonstrates how to specify a schema, see the Specify a Schema usage example.

To learn more about Avro Schema, see the Data Formats guide.

Important
Converters

If you want to send your data through Apache Kafka with Avro binary encoding, you must use an Avro converter. For more information, see the guide on Converters.
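
For illustration, an Avro converter setup might look like the following. This sketch assumes Confluent's Avro converter and a Schema Registry reachable at a placeholder URL. Neither is prescribed by this guide.

```properties
# Assumes Confluent's Avro converter is installed on the Connect worker
# and a Schema Registry is running at the placeholder URL below.
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
```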

Infer a Schema

You can have your source connector infer a schema for incoming documents. This option works well for development and for data sources that do not frequently change structure, but for most production deployments we recommend that you specify a schema.

You can have the MongoDB Kafka Connector infer a schema by specifying the following options:

output.format.value=schema
output.schema.infer.value=true

Note
Cannot Infer Key Schema

The MongoDB Kafka Connector does not support key schema inference. If you want to use a key schema and transform your MongoDB change event documents, you must specify a key schema as described in the Specify Schemas section of this guide.

© 2021 MongoDB, Inc.
