Reputation: 131
I'm running in an Azure Synapse notebook and trying to use PySpark to read a SQL table. It seems to be able to read the table, but when I want to show the results, I get an error indicating that it can't access the temporary directory.
If I specify the temp directory using the "wasbs" scheme, I get this error:
External file access failed due to internal error: 'Parameters provided to connect to the Azure storage account are not valid.
If I specify the temp directory using the "abfss" scheme, I get this error:
CREATE EXTERNAL TABLE AS SELECT statement failed as the path name 'abfss://@.dfs.core.windows.net/temp/SQLAnalyticsConnectorStaging/...tbl' could not be used for export. Please ensure that the specified path is a directory which exists or can be created, and that files can be created in that directory.
The container name, account name, and account key are correct, so I'm guessing that I'm not setting the config correctly, but I've tried everything I could think of.
I've also tried setting the Hadoop config by replacing the "fs.azure.account.key" prefix with "spark.hadoop.fs.azure.account.key".
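For reference, here is a minimal sketch of how the URI scheme, the storage endpoint, and the account-key config name line up. The helper names (temp_folder_uri, account_key_conf) are hypothetical and not part of any library. The only facts it relies on are that "wasbs" paths use the blob.core.windows.net endpoint and "abfss" paths use the dfs.core.windows.net endpoint, and that the account key is looked up under the matching endpoint. Whether a mismatch here is what causes the errors above is an assumption, not something confirmed.

```python
# Endpoint suffix that each URI scheme expects.
ENDPOINTS = {
    "wasbs": "blob.core.windows.net",  # Blob Storage endpoint
    "abfss": "dfs.core.windows.net",   # ADLS Gen2 endpoint
}


def temp_folder_uri(scheme, container, account, path="temp"):
    """Build a staging-folder URI, rejecting empty container/account names."""
    if scheme not in ENDPOINTS:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    if not container or not account:
        raise ValueError("container and account must be non-empty")
    return f"{scheme}://{container}@{account}.{ENDPOINTS[scheme]}/{path}"


def account_key_conf(scheme, account, hadoop_prefix=False):
    """Return the config name that holds the account key for this scheme."""
    if scheme not in ENDPOINTS:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    key = f"fs.azure.account.key.{account}.{ENDPOINTS[scheme]}"
    return f"spark.hadoop.{key}" if hadoop_prefix else key


# Example (placeholder names, same as in the question):
# uri = temp_folder_uri("abfss", "container_name", "storage_account_name")
# conf_name = account_key_conf("abfss", "storage_account_name")
# spark.conf.set(conf_name, account_key)
```

With these helpers, switching the temp folder from "wasbs" to "abfss" also switches the config name from the blob endpoint to the dfs endpoint, so the two cannot drift apart.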
The code is below. I think it's successfully accessing the database, because print("columns", df.columns) shows the columns. I get the error when I try to show the data with print("head", df.head()).
Any help is appreciated.
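# Imports for the Synapse SQL connector (needed for Constants and synapsesql)
import com.microsoft.spark.sqlanalytics
from com.microsoft.spark.sqlanalytics.Constants import Constants
from pyspark.sql import SparkSession

container = "container_name"
storage_account_name = "storage_account_name"
account_key = "account_key"
appName = "test"
master = "local"

spark = SparkSession.builder \
    .appName(appName) \
    .master(master) \
    .getOrCreate()

# Account key for the storage account that holds the temp (staging) folder
spark.conf.set(f"fs.azure.account.key.{storage_account_name}.blob.core.windows.net", account_key)

# Read the table through the Synapse connector, staging data in TEMP_FOLDER
df = spark.read \
    .option(Constants.TEMP_FOLDER, f"wasbs://{container}@{storage_account_name}.blob.core.windows.net/temp") \
    .synapsesql("db_name.schema_name..spark_test")

print("columns", df.columns)  # works
print("head", df.head())  # fails with the errors quoted above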
Upvotes: 0
Views: 633
Reputation: 8402
"... could not be used for export. Please ensure that the specified path is a directory which exists or can be created, and that files can be created in that directory."
PolyBase is unable to perform the operation, which results in this error.
Causes:
Resolution:
Make sure traffic to the storage account on ports 80 and 443 is not blocked.
Use a storage account with standard locally redundant storage (Standard-LRS) or standard geo-redundant storage (Standard-GRS), and configure the account for General Purpose.
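As a rough way to check the port part of this, the sketch below tests whether a TCP connection to a host and port can be opened. The function name can_connect is hypothetical. Note the assumption: run from the notebook, it only tests connectivity from the Spark side, while the export itself is done by the SQL pool, so a successful result here does not rule out a firewall rule on the storage account.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example (placeholder account name, same as in the question):
# for port in (80, 443):
#     print(port, can_connect("storage_account_name.blob.core.windows.net", port))
```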
Upvotes: 0