Jonathanv

Reputation: 1

spark mongo-connector can't read from DocumentDB

When I try to show my Spark DataFrame after loading it like this:

dataFrame = spark.read.format("mongodb").option("spark.mongodb.database","testdb").option("spark.mongodb.collection", "collection1").load()

dataFrame.show()

It gives the following error:

Py4JJavaError: An error occurred while calling o83.showString. : com.mongodb.spark.sql.connector.exceptions.MongoSparkException: Partitioning failed. Partitioner calling collStats command failed

However, dataFrame.printSchema() works and prints the schema. I have already found out that the collStats command is not supported on DocumentDB, but how can I stop the Mongo connector for Spark from calling it?

Upvotes: 0

Views: 836

Answers (1)

maxime G

Reputation: 1771

If you are using a view, you should try using a different partitioner than the default one.

Available partitioners: https://www.mongodb.com/docs/spark-connector/current/batch-mode/batch-read-config/#partitioner-configurations

In my database, I can read a view by using PaginateIntoPartitionsPartitioner or SinglePartitionPartitioner.
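To illustrate, here is a minimal sketch of selecting one of those partitioners from PySpark. It assumes connector 10.x, where the partitioner is set through the `spark.mongodb.read.partitioner` key and the built-in classes live in the `com.mongodb.spark.sql.connector.read.partitioner` package. The helper names `mongo_read_options` and `load_collection` are made up for this example, and I have not verified it against DocumentDB itself.

```python
# Package of the connector's built-in partitioners (assumed 10.x naming).
PARTITIONER_PACKAGE = "com.mongodb.spark.sql.connector.read.partitioner"


def mongo_read_options(database, collection, partitioner="SinglePartitionPartitioner"):
    """Build reader options that select a non-default partitioner.

    Hypothetical helper: it only assembles the option keys, using the same
    "spark.mongodb." prefix style as the question.
    """
    return {
        "spark.mongodb.database": database,
        "spark.mongodb.collection": collection,
        # Full class name of the partitioner to use instead of the default one.
        "spark.mongodb.read.partitioner": f"{PARTITIONER_PACKAGE}.{partitioner}",
    }


def load_collection(spark, database, collection, partitioner="SinglePartitionPartitioner"):
    """Read a collection through an existing SparkSession with the chosen partitioner."""
    options = mongo_read_options(database, collection, partitioner)
    return spark.read.format("mongodb").options(**options).load()
```

With the session from the question this would be called as `load_collection(spark, "testdb", "collection1").show()`. Note that SinglePartitionPartitioner puts the whole collection into one partition, so the read is not parallel; for larger collections it may be worth trying `partitioner="PaginateIntoPartitionsPartitioner"` instead.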

Upvotes: 0
