Read From MongoDB

Use the MongoSpark.load method to create an RDD representing a collection.

The following example loads the collection specified in the SparkConf:

import com.mongodb.spark.MongoSpark
val rdd = MongoSpark.load(sc) // Reads the collection configured in the SparkConf
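The one-liner above assumes a SparkContext named sc already exists and already carries the MongoDB input settings. As a rough, standalone sketch of that setup, the following assumes a 2.x-style connector that reads the spark.mongodb.input.uri property. The host, database (test), collection (myCollection), and the object name are placeholders chosen for illustration, not values from this guide.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.mongodb.spark.MongoSpark

object LoadDefaultCollection {
  // Placeholder URI (assumption): local server, database "test", collection "myCollection".
  val inputUri = "mongodb://127.0.0.1/test.myCollection"

  // Build a SparkConf that carries the MongoDB input URI.
  def sparkConf(): SparkConf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("LoadDefaultCollection")
    .set("spark.mongodb.input.uri", inputUri)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(sparkConf())
    try {
      // With no ReadConfig, the collection comes from the SparkConf.
      val rdd = MongoSpark.load(sc)
      println(s"Documents: ${rdd.count()}")
      // Each element is an org.bson.Document; print a few as JSON.
      rdd.take(5).foreach(doc => println(doc.toJson))
    } finally {
      sc.stop()
    }
  }
}
```

Running it requires the connector on the classpath and a reachable MongoDB instance at the placeholder address. In spark-shell, sc is created for you, so only the MongoSpark.load call is needed.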

To read from a different collection or database, or to change other read settings such as the Read Preference, pass a ReadConfig object to MongoSpark.load().

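The following example reads from the spark collection with a secondaryPreferred read preference. The second argument, Some(ReadConfig(sc)), supplies defaults for any setting the map does not override: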

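import com.mongodb.spark.config._
// Override the collection and read preference; other settings default to ReadConfig(sc)
val readConfig = ReadConfig(Map("collection" -> "spark", "readPreference.name" -> "secondaryPreferred"), Some(ReadConfig(sc)))
val customRdd = MongoSpark.load(sc, readConfig)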
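To make the layering of overrides on top of defaults easier to see, here is a hedged, standalone sketch. It assumes a 2.x-style connector in which ReadConfig exposes databaseName and collectionName fields and accepts the readPreference.name key. The URI, the object name, and the helper names are illustrative placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config.ReadConfig

object LoadWithReadConfig {
  // Placeholder defaults (assumption): database "test", collection "myCollection".
  def sparkConf(): SparkConf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("LoadWithReadConfig")
    .set("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.myCollection")

  // Override two settings; everything else falls back to ReadConfig(sc).
  def readConfig(sc: SparkContext): ReadConfig = ReadConfig(
    Map("collection" -> "spark", "readPreference.name" -> "secondaryPreferred"),
    Some(ReadConfig(sc)))

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(sparkConf())
    try {
      val rc = readConfig(sc)
      println(s"database:   ${rc.databaseName}")   // taken from the SparkConf URI
      println(s"collection: ${rc.collectionName}") // taken from the override map
      val customRdd = MongoSpark.load(sc, rc)
      println(s"Documents: ${customRdd.count()}")
    } finally {
      sc.stop()
    }
  }
}
```

Building the ReadConfig does not itself read any data. The count() call is what needs a reachable MongoDB deployment.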

With com.mongodb.spark._ imported, SparkContext gains an implicit helper method, loadFromMongoDB(), that loads data from MongoDB.

Call loadFromMongoDB() without arguments to load the collection specified in the SparkConf:

sc.loadFromMongoDB() // Uses the SparkConf for configuration

To read from a different MongoDB server address, database, or collection, pass a ReadConfig object to loadFromMongoDB() (see the input configuration settings for the available options):

sc.loadFromMongoDB(ReadConfig(Map("uri" -> "mongodb://"))) // Uses the ReadConfig
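Both forms of the implicit helper can be put side by side in one standalone sketch. It assumes a 2.x-style connector, where importing com.mongodb.spark._ brings loadFromMongoDB() into scope. Both URIs and the object name are hypothetical placeholders, not addresses from this guide.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.mongodb.spark._ // adds loadFromMongoDB() to SparkContext via an implicit conversion
import com.mongodb.spark.config.ReadConfig

object LoadWithImplicitHelper {
  // Hypothetical URIs (assumption): same local server, two different collections.
  val defaultUri = "mongodb://127.0.0.1/test.myCollection"
  val otherUri = "mongodb://127.0.0.1/test.otherCollection"

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("LoadWithImplicitHelper")
      .set("spark.mongodb.input.uri", defaultUri)
    val sc = new SparkContext(conf)
    try {
      // No arguments: configuration comes from the SparkConf.
      val defaultRdd = sc.loadFromMongoDB()
      // Explicit ReadConfig: the URI here names its own database and collection.
      val otherRdd = sc.loadFromMongoDB(ReadConfig(Map("uri" -> otherUri)))
      println(s"default collection: ${defaultRdd.count()} documents")
      println(s"other collection:   ${otherRdd.count()} documents")
    } finally {
      sc.stop()
    }
  }
}
```

The helper is a convenience: it returns the same kind of RDD as MongoSpark.load, so the choice between the two is mostly a matter of style.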
