Reputation: 79
Banging my head against the wall since I just can't write a parquet file to Azure Blob Storage. In my Azure Databricks notebook I basically: 1. read a CSV from the blob storage as a DataFrame and 2. attempt to write the DataFrame back to the same storage.
I am able to read the CSV, but I get the following error when I try to write the parquet file.
Here's the stack trace:
Job aborted due to stage failure: Task 0 in stage 8.0 failed 4 times, most recent failure: Lost task 0.3 in stage 8.0 (TID 20, 10.139.64.5, executor 0): shaded.databricks.org.apache.hadoop.fs.azure.AzureException: java.io.IOException at shaded.databricks.org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1609) ... ... Caused by: com.microsoft.azure.storage.StorageException: The specified resource does not exist.
Here's my python code:
spark.conf.set("fs.azure.sas.my_container.my_storage.blob.core.windows.net", dbutils.secrets.get(scope = "my_scope", key = "my_key"))
df100 = spark.read.format("csv").option("header", "true").load("wasbs://my_container@my_storage.blob.core.windows.net/folder/revenue.csv")
df100.write.parquet('wasbs://my_container@my_storage.blob.core.windows.net/f1/deh.parquet')
Upvotes: 0
Views: 3432
Reputation: 382
One way you can do this is by mounting the storage on Databricks; after that, your storage will be accessible at a path like /mnt/yourstoragepath/folder1.
To do this, set the account name and the SAS key of the storage on Databricks:
spark.conf.set(
"fs.azure.account.key.<storage-account-name>.blob.core.windows.net",
"<storage-account-access-key>")
spark.conf.set(
"fs.azure.sas.<container-name>.<storage-account-name>.blob.core.windows.net",
"<complete-query-string-of-sas-for-the-container>")
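The mount step mentioned above isn't shown; a minimal sketch using `dbutils.fs.mount` (only runs on a Databricks cluster; the container, storage account, mount point, and secret scope/key names are placeholders, and this assumes the SAS token is stored in a Databricks secret scope):

```python
# Sketch: mount the blob container so it appears under /mnt/mountName.
# <container-name>, <storage-account-name>, "my_scope"/"my_key", and the
# mount point are placeholders -- substitute your own values.
dbutils.fs.mount(
    source="wasbs://<container-name>@<storage-account-name>.blob.core.windows.net",
    mount_point="/mnt/mountName",
    extra_configs={
        "fs.azure.sas.<container-name>.<storage-account-name>.blob.core.windows.net":
            dbutils.secrets.get(scope="my_scope", key="my_key")
    }
)
```

After mounting, files in the container are reachable at paths like /mnt/mountName/folder1.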
After setting this, try to read the file as shown below:
df = spark.read.parquet("wasbs://<container-name>@<storage-account-name>.blob.core.windows.net/<directory-name>")
or
dbutils.fs.ls("wasbs://<container-name>@<storage-account-name>.blob.core.windows.net/<directory-name>")
To write, use this syntax:
df.write.mode("overwrite").option("path","/mnt/mountName/folder1/tablename").saveAsTable("database.tablename")
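If the goal is just a parquet file (as in the question) rather than a managed table, writing directly to the mounted path should also work; a sketch assuming the container was mounted at /mnt/mountName (only runs on a Databricks cluster, and df100 is the DataFrame from the question):

```python
# Sketch: write the DataFrame as parquet to the mounted container path.
# /mnt/mountName and the target folder are placeholders for your own mount.
df100.write.mode("overwrite").parquet("/mnt/mountName/f1/deh.parquet")
```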
Please refer to this official link
Upvotes: 0