CMc

Reputation: 1

Trying to save an XML string output to an XML file in ADLS using PySpark in Azure Synapse Notebooks

I am calling an API that returns an XML string as its response. I am trying to save that XML string as an XML file in ADLS using PySpark in Azure Synapse Notebooks. From there, I want to read that XML file and convert it to Parquet.

I was able to call the API and get the XML string as a response. However, when I try to write out the file or read an XML file using the logic below, I get the following error.

df = spark.read.format("com.databricks.spark.xml").options(rowTag="message").load("<adls_file_path>")

error - Py4JJavaError: An error occurred while calling o1744.load. : java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml

Upvotes: 0

Views: 529

Answers (1)

Vamsi Bitra

Reputation: 2764

This error usually occurs because the spark-xml library is not installed on the Spark pool, so Spark cannot find the com.databricks.spark.xml data source.

Please follow the steps below:


Download the spark-xml JAR file and install it on the Spark pool.
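If installing the library is not possible right away, a small response can also be parsed on the driver without spark-xml. The following is only a minimal sketch using Python's standard library. It assumes the rows are message elements (matching the rowTag in the question) with simple child elements; the sample payload, the field names, and the xml_to_rows helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample payload standing in for the API response.
xml_string = """<messages>
  <message><id>1</id><text>hello</text></message>
  <message><id>2</id><text>world</text></message>
</messages>"""


def xml_to_rows(xml_text, row_tag="message"):
    """Return one dict per <row_tag> element, mapping child tag to its text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in elem}
        for elem in root.iter(row_tag)
    ]


rows = xml_to_rows(xml_string)
print(rows)

# In a Synapse notebook the rows could then be converted to a DataFrame and
# written as Parquet (the output path is a placeholder):
# spark.createDataFrame(rows).write.mode("overwrite").parquet("<adls_output_path>")
```

All values come back as strings, so any numeric or date columns would need to be cast before writing. For large or deeply nested documents, reading with spark-xml after installing the JAR is likely the better fit.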


Upvotes: 0
