Reputation: 1
As part of migrating from Azure Databricks to Azure Synapse Analytics Notebooks, I'm facing the issue explained below.
I read a CSV file from Azure Data Lake Storage Gen2 into a PySpark DataFrame using the following command:
df = spark.read.format('csv').option("delimiter", ",").option("multiline", "true").option("quote", '"').option("header", "true").option("escape", "\\").load(csvFilePath)
After processing the file, we need to overwrite it, so we use the following command:
df.coalesce(1).write.option("delimiter", ",").csv(csvFilePath, mode = 'overwrite', header = 'true')
This deletes the existing file at the path `csvFilePath` and then fails with the error "Py4JJavaError: An error occurred while calling o617.csv."
Things I've noticed:
[Error returned by Synapse Notebook at write command.](https://i.sstatic.net/Obj9q.png)
Upvotes: 0
Views: 1034
Reputation: 1683
I suggest mounting the data storage. Please refer to the documentation below.
https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-use-databricks-spark
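Separately from mounting, one possible cause worth ruling out is that Spark reads lazily: when a DataFrame is written with `mode='overwrite'` to the same path it was loaded from, the target can be deleted before the source rows have actually been read. The full stack trace is only in the screenshot, so this is an assumption about the failure, not a confirmed diagnosis. A minimal sketch of a common workaround is to write to a staging location first and only then overwrite the original. The function name `overwrite_csv_in_place` and the `staging_path` argument are hypothetical; the reader and writer options are copied from the question.

```python
def overwrite_csv_in_place(spark, df, target_path, staging_path):
    """Overwrite the CSV at target_path with df, even if df was read from target_path.

    staging_path is a hypothetical scratch location (any writable path that is
    different from target_path). The caller is responsible for cleaning it up.
    """
    # 1. Materialize the processed data somewhere other than the source path,
    #    so the lazy read of target_path completes before anything is deleted.
    df.coalesce(1).write.option("delimiter", ",").csv(
        staging_path, mode="overwrite", header=True
    )

    # 2. Read the staged copy back. This DataFrame no longer depends on target_path.
    staged = (
        spark.read.format("csv")
        .option("delimiter", ",")
        .option("multiline", "true")
        .option("quote", '"')
        .option("header", "true")
        .option("escape", "\\")
        .load(staging_path)
    )

    # 3. Now it is safe to overwrite the original location.
    staged.coalesce(1).write.option("delimiter", ",").csv(
        target_path, mode="overwrite", header=True
    )
```

In the notebook this would be called as `overwrite_csv_in_place(spark, df, csvFilePath, stagingPath)`, where `stagingPath` is a path you choose in the same storage account. The staged copy is read back without schema inference, so all columns come back as strings, which is usually acceptable when the result is written straight back to CSV.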
Upvotes: 0