Reputation: 162
I want to connect MongoDB Atlas with PySpark inside a Microsoft Fabric notebook. Here is my PySpark code:
from pyspark.sql import SparkSession

mongo_uri = "mongodb+srv://<username>:<password>@cluster1.hju3l.mongodb.net/?retryWrites=true&w=majority&appName=Cluster1"

my_spark = SparkSession \
    .builder \
    .appName("myApp") \
    .config("spark.mongodb.read.connection.uri", mongo_uri) \
    .config("spark.mongodb.write.connection.uri", mongo_uri) \
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12:10.2.0") \
    .getOrCreate()

df = my_spark.read.format("mongodb") \
    .option("database", "lead") \
    .option("collection", "users") \
    .load()
df.printSchema()
But when I try to run the above code, it throws the error below:
Py4JJavaError: An error occurred while calling o6558.load.
: org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find the data source: mongodb. Please find packages at `https://spark.apache.org/third-party-projects.html`.
From searching for the cause of this issue, it appears the mongo-spark-connector jar is not being found, but I have uploaded the jar file in the library section of Microsoft Fabric (custom library section) and also installed mongoengine in the public library section.
I have also uploaded the same jar file (mongo-spark-connector_2.12-10.2.0.jar) into the notebook's Spark environment. Below is the screenshot.
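To check whether the connector is actually visible to the session, the running JVM can be probed for the connector's entry class (a quick sanity-check sketch; com.mongodb.spark.sql.connector.MongoTableProvider should be the DataSource class of the 10.x connector):

try:
    # Ask the Spark JVM (via py4j) whether the connector class is on the classpath
    spark._jvm.java.lang.Class.forName("com.mongodb.spark.sql.connector.MongoTableProvider")
    print("mongo-spark-connector is on the classpath")
except Exception as err:  # py4j raises if the class cannot be loaded
    print("connector not found:", err)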
Upvotes: 0
Views: 203
Reputation: 89361
Try it from Scala; sometimes the JVM libraries don't load correctly for PySpark.
For PySpark, you can load the library directly from Maven by configuring your session. E.g., this is for Snowflake:
%%configure -f
{
    "conf": {
        "spark.jars.packages": "net.snowflake:spark-snowflake_2.12:2.12.0-spark_3.2"
    }
}
For MongoDB, you would use something like 'org.mongodb.spark:mongo-spark-connector_2.12:10.2.0' (the Scala 2.12 build, matching the Fabric Spark runtime).
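Adapted to the connector in question, the cell would look like this (a sketch; it assumes the Scala 2.12 build of the 10.2.0 connector, and note that -f forces a Spark session restart, so run it before your other cells):

%%configure -f
{
    "conf": {
        "spark.jars.packages": "org.mongodb.spark:mongo-spark-connector_2.12:10.2.0"
    }
}

After the session comes back up, spark.read.format("mongodb") should resolve without the manual jar upload.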
Upvotes: 0