MongoDB Kafka Connector Docker Example

This guide provides an end-to-end setup of MongoDB and Kafka Connect to demonstrate the functionality of the MongoDB Kafka Source and Sink Connectors.

In this example, we create the following Kafka Connectors:

Connector                         Data Source                           Destination
Confluent Connector: Datagen      Avro random generator                 Kafka topic: pageviews
Sink Connector: mongo-sink        Kafka topic: pageviews                MongoDB collection: test.pageviews
Source Connector: mongo-source    MongoDB collection: test.pageviews    Kafka topic: mongo.test.pageviews

  • The Datagen Connector creates random data using the Avro random generator and publishes it to the Kafka topic “pageviews”.
  • The mongo-sink connector reads data from the “pageviews” topic and writes it to MongoDB in the “test.pageviews” collection.
  • The mongo-source connector produces change events for the “test.pageviews” collection and publishes them to the “mongo.test.pageviews” Kafka topic.
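
As a rough illustration of how the two MongoDB connectors above could be defined, the following sketch prints one JSON definition for each. The connection URI and the "mongo" topic prefix are assumptions chosen to match the names used in this guide; they are not copied from run.sh, and the helper function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the connection URI below is an assumption, not taken from run.sh.

# Print a sink connector definition that copies the "pageviews" topic
# into the test.pageviews collection.
sink_config() {
  cat <<'EOF'
{
  "name": "mongo-sink",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "topics": "pageviews",
    "connection.uri": "mongodb://mongo1:27017,mongo2:27017,mongo3:27017",
    "database": "test",
    "collection": "pageviews"
  }
}
EOF
}

# Print a source connector definition that watches test.pageviews.
# With the prefix "mongo", change events go to the topic mongo.test.pageviews.
source_config() {
  cat <<'EOF'
{
  "name": "mongo-source",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "mongodb://mongo1:27017,mongo2:27017,mongo3:27017",
    "database": "test",
    "collection": "pageviews",
    "topic.prefix": "mongo"
  }
}
EOF
}
```

The source connector names its output topic from the prefix, database, and collection, which is why the change events in this example appear under mongo.test.pageviews.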

Requirements

Linux/Unix-based OS

How to Run the Example

Clone the mongo-kafka repository from GitHub:

git clone https://github.com/mongodb/mongo-kafka.git

Change directory to the docker directory:

cd mongo-kafka/docker/

Start the shell script, run.sh:

./run.sh

The shell script executes the following sequence of commands:

  1. Run the docker-compose up command

    The docker-compose command installs and starts the following applications in new Docker containers:

    • Zookeeper
    • Kafka
    • Confluent Schema Registry
    • Confluent Kafka Connect
    • Confluent Control Center
    • Confluent KSQL Server
    • Kafka Rest Proxy
    • Kafka Topics UI
    • MongoDB replica set (three nodes: mongo1, mongo2, and mongo3)
  2. Wait for MongoDB, Kafka, and Kafka Connect to become ready

  3. Register the Confluent Datagen Connector

  4. Register the MongoDB Kafka Sink Connector

  5. Register the MongoDB Kafka Source Connector
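
The wait-and-register steps above can be sketched with the Kafka Connect REST API. This is a minimal illustration, not the contents of run.sh: it assumes the REST API is exposed at http://localhost:8083, and the function names, retry count, and WAIT_SECONDS variable are hypothetical.

```shell
#!/bin/sh
# Assumption: the Kafka Connect REST API is reachable at this address.
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"

# Poll the REST API until it answers, or give up after the given
# number of attempts (default 30). Returns 0 when Connect is ready.
wait_for_connect() {
  attempts="${1:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$CONNECT_URL/connectors"; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_SECONDS:-5}"
  done
  return 1
}

# POST a connector definition, passed as a JSON string, to Kafka Connect.
register_connector() {
  curl --silent --show-error --fail -X POST \
    -H "Content-Type: application/json" \
    --data "$1" \
    "$CONNECT_URL/connectors"
}
```

A typical call would be wait_for_connect && register_connector "$(cat my-connector.json)", where my-connector.json is a placeholder for a file holding a connector definition.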

Note

You may need to increase the RAM resource limits for Docker if the script fails. If the script did not complete successfully, use the docker-compose stop command to stop any running containers.

Once the shell script has started the services, the Datagen Connector publishes new events to Kafka at short intervals, which triggers the following cycle:

  1. The Datagen Connector publishes new events to Kafka
  2. The Sink Connector writes the events into MongoDB
  3. The Source Connector writes the change stream messages back into Kafka
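
To confirm that the connectors driving this cycle are running, you can query the status endpoint of the Kafka Connect REST API. The sketch below assumes the REST API is exposed at http://localhost:8083; the helper function names are hypothetical.

```shell
#!/bin/sh
# Assumption: the Kafka Connect REST API is reachable at this address.
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"

# Fetch the status document (connector and task states) for one connector.
connector_status() {
  curl --silent --show-error --fail "$CONNECT_URL/connectors/$1/status"
}

# Print one line per connector name: the name followed by its status JSON,
# or a short message if the request fails.
check_connectors() {
  for name in "$@"; do
    printf '%s: ' "$name"
    connector_status "$name" || printf 'status request failed'
    printf '\n'
  done
}
```

For example, check_connectors mongo-sink mongo-source prints the status of both MongoDB connectors registered in this guide.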

To view the Kafka topics, open the Confluent Control Center at http://localhost:9021/ and navigate to the cluster topics.

  • The pageviews topic should contain documents added by the Datagen Connector that resemble the following:

    {
      "viewtime": {
        "$numberLong": "81"
      },
      "pageid": "Page_1",
      "userid": "User_8"
    }
    
  • The mongo.test.pageviews topic should contain change events that resemble the following:

    {
      "_id": {
        "_data": "<resumeToken>"
      },
      "operationType": "insert",
      "clusterTime": {
        "$timestamp": {
          "t": 1563461814,
          "i": 4
        }
      },
      "fullDocument": {
        "_id": {
          "$oid": "5d3088b6bafa7829964150f3"
        },
        "viewtime": {
          "$numberLong": "81"
        },
        "pageid": "Page_1",
        "userid": "User_8"
      },
      "ns": {
        "db": "test",
        "coll": "pageviews"
      },
      "documentKey": {
        "_id": {
          "$oid": "5d3088b6bafa7829964150f3"
        }
      }
    }
    

Next, explore the collection data in the MongoDB replica set:

  • In your local shell, navigate to the docker directory from which you ran the docker-compose commands and connect to the mongo1 MongoDB instance using the following command:

    docker-compose exec mongo1 /usr/bin/mongo
    
  • If you insert or update a document in the test.pageviews collection, the Source Connector publishes a change event document to the mongo.test.pageviews Kafka topic.
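
One way to trigger such a change event without opening an interactive shell is to pass an insert statement to the mongo shell with --eval. This is a minimal sketch: the function names are hypothetical, the field values are placeholders, and it assumes it is run from the docker directory.

```shell
#!/bin/sh
# Build the mongo shell statement that inserts one page view document.
# Arguments: view time (integer), page id, user id.
pageview_insert_js() {
  printf "db.getSiblingDB('test').pageviews.insertOne({viewtime: NumberLong(%s), pageid: '%s', userid: '%s'})" "$1" "$2" "$3"
}

# Run the statement on mongo1. The -T flag disables pseudo-TTY allocation
# so the command also works from scripts.
insert_pageview() {
  docker-compose exec -T mongo1 /usr/bin/mongo --quiet --eval "$(pageview_insert_js "$@")"
}
```

After calling insert_pageview 99 Page_1 User_8, a matching insert event should appear in the mongo.test.pageviews topic.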

To stop the docker containers and all the processes running on them, use Ctrl-C in the shell running the script, or the following command:

docker-compose stop

To remove the Docker containers and their networks, use the following command (add the --rmi all option to also remove the images):

docker-compose down