Aggregation
Use MongoDB’s aggregation pipeline to apply filtering rules and perform aggregation operations when reading data from MongoDB into Spark.
Consider a collection named fruit that contains the following documents:
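The original document listing is not reproduced here. A minimal, illustrative set of documents, with field names (type, qty) assumed to match the pipeline example below, might look like:

```
{ "_id" : 1, "type" : "apple", "qty" : 5 }
{ "_id" : 2, "type" : "orange", "qty" : 10 }
{ "_id" : 3, "type" : "banana", "qty" : 15 }
```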
Add a pipeline argument to read.df() from within the sparkR shell to specify an aggregation pipeline to use when creating a DataFrame:
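The original code sample is not shown, so the following is a minimal sketch. The source name com.mongodb.spark.sql.DefaultSource and the $match stage are assumptions; the connection URI, database, and collection are taken from the Spark session's configuration rather than passed here.

```r
# A sketch, assuming the sample fruit collection above; the $match stage
# filtering on type is illustrative, not taken from the original.
df <- read.df("",                                           # empty path: the source is a collection, not a file
              source = "com.mongodb.spark.sql.DefaultSource",
              pipeline = '{ $match: { type: "apple" } }')   # aggregation pipeline applied when reading

head(df)
```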
Note
The empty string argument ("") is the path to a file to use as a data source. In this case the data source is a MongoDB collection, not a file, so the path argument is left empty.
In the sparkR shell, the operation prints the following output:
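The original output is not reproduced here. With the sample documents and the $match stage assumed above, head(df) would print something like:

```
  _id  type qty
1   1 apple   5
```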