Reputation: 1
When I try to show my Spark DataFrame after loading it:
dataFrame = spark.read.format("mongodb").option("spark.mongodb.database","testdb").option("spark.mongodb.collection", "collection1").load()
dataFrame.show()
It gives the following error:
Py4JJavaError: An error occurred while calling o83.showString. : com.mongodb.spark.sql.connector.exceptions.MongoSparkException: Partitioning failed. Partitioner calling collStats command failed
However, dataFrame.printSchema() prints the schema as expected. I have already found out that the collStats command is not supported on DocDB, but how can I stop the MongoDB connector for Spark from calling it?
Upvotes: 0
Views: 836
Reputation: 1771
If you are using a view, you should try a partitioner other than the default one.
Available partitioners: https://www.mongodb.com/docs/spark-connector/current/batch-mode/batch-read-config/#partitioner-configurations
In my case, I can read a view by using PaginateIntoPartitionsPartitioner or SinglePartitionPartitioner.
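As a rough sketch of how the partitioner could be set on the reader from the question: the option key `spark.mongodb.read.partitioner` and the fully qualified class name below are taken from my reading of the connector's read configuration docs (10.x), so check them against the version you use. The helper names `mongo_read_options` and `load_collection` are made up for this example, and I am assuming that SinglePartitionPartitioner avoids the collStats call in your setup as it did for my view.

```python
# Fully qualified class name of the single-partition partitioner
# (assumed from the connector 10.x docs; verify for your version).
SINGLE_PARTITION = (
    "com.mongodb.spark.sql.connector.read.partitioner.SinglePartitionPartitioner"
)


def mongo_read_options(database, collection, partitioner=SINGLE_PARTITION):
    """Build reader options with an explicit partitioner (hypothetical helper)."""
    return {
        "spark.mongodb.database": database,
        "spark.mongodb.collection": collection,
        # Assumed key: overrides the default partitioner for reads.
        "spark.mongodb.read.partitioner": partitioner,
    }


def load_collection(spark, database, collection):
    """Load a collection into a DataFrame using the options above.

    `spark` is an existing SparkSession with the MongoDB connector on the
    classpath and a connection URI already configured.
    """
    reader = spark.read.format("mongodb")
    for key, value in mongo_read_options(database, collection).items():
        reader = reader.option(key, value)
    return reader.load()


# Usage, given a configured SparkSession named `spark`:
# dataFrame = load_collection(spark, "testdb", "collection1")
# dataFrame.show()
```

Keep in mind that with a single partition the whole collection is read by one task, so this is fine for small collections but will not parallelize a large read.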
Upvotes: 0