Mohan

Reputation: 81

Writing Spark data in Delta format

Spark version: 3.2.1, Delta version: 1.2.1 (tried 2.0 as well)

I am trying to run the getting-started code to try out Delta:

from pyspark.sql import SparkSession
from delta import *

# Build a session with the Delta SQL extension and Delta catalog enabled
builder = SparkSession.builder.appName("MyApp") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")

# configure_spark_with_delta_pip adds the delta-core package to the builder
spark = configure_spark_with_delta_pip(builder).getOrCreate()
data = spark.range(0, 5)
data.write.format("delta").save("/tmp/delta-table")

I am getting the below error:

Py4JJavaError: An error occurred while calling o201.showString.
org.apache.spark.SparkException: Cannot find catalog plugin class for catalog 'spark_catalog'

Can anyone help me understand the issue and how to resolve it? Thanks in advance.

Upvotes: 3

Views: 3413

Answers (1)

Jonathan

Reputation: 2033

I'm not sure which environment and mode you are using, but in general you need to add the Delta Lake jar via the spark.jars.packages config, because it is not among Spark's default jars. For example: .config("spark.jars.packages", "io.delta:delta-core_2.12:1.2.0")
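As a minimal sketch (assuming local mode with internet access so Spark can resolve the jar from Maven; the io.delta:delta-core_2.12:1.2.0 coordinate matches Spark 3.2 built with Scala 2.12, so adjust it to your Delta release), your snippet with the package declared explicitly would look like:

from pyspark.sql import SparkSession

# Declare the Delta Lake artifact so Spark downloads it at session start;
# the coordinate must match the Scala build (2.12) and a Delta release
# compatible with the Spark version (1.2.x for Spark 3.2).
spark = SparkSession.builder.appName("MyApp") \
    .config("spark.jars.packages", "io.delta:delta-core_2.12:1.2.0") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
    .getOrCreate()

spark.range(0, 5).write.format("delta").save("/tmp/delta-table")

Once the package is on the classpath, the spark_catalog plugin class can be loaded and the write should succeed.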

Upvotes: 3
