Reputation: 191
I have a Databricks cluster running on Azure and want to read/write data from Azure Data Lake Storage using SparkR / sparklyr. I have set up both resources.
Now I have to provide the Spark environment with the necessary configuration to authenticate against the Data Lake Storage.
Setting the configs using the PySpark API works:
spark.conf.set("dfs.adls.oauth2.access.token.provider.type", "ClientCredential")
spark.conf.set("dfs.adls.oauth2.client.id", "****")
spark.conf.set("dfs.adls.oauth2.credential", "****")
spark.conf.set("dfs.adls.oauth2.refresh.url", "https://login.microsoftonline.com/****/oauth2/token")
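Since these are plain key/value pairs, the same settings could also be kept in one dictionary and applied in a loop. This is only a minimal sketch: it assumes `spark` is the session object a Databricks notebook provides, and `ADLS_OAUTH_SETTINGS` and `apply_settings` are illustrative names, not part of any API.

```python
# Illustrative names; the keys and masked values are the ones from the calls above.
ADLS_OAUTH_SETTINGS = {
    "dfs.adls.oauth2.access.token.provider.type": "ClientCredential",
    "dfs.adls.oauth2.client.id": "****",
    "dfs.adls.oauth2.credential": "****",
    "dfs.adls.oauth2.refresh.url": "https://login.microsoftonline.com/****/oauth2/token",
}


def apply_settings(conf, settings):
    """Call conf.set(key, value) for every entry; conf is e.g. spark.conf."""
    for key, value in settings.items():
        conf.set(key, value)


# In a notebook (assumption: `spark` already exists there):
# apply_settings(spark.conf, ADLS_OAUTH_SETTINGS)
```

Seen this way, the configuration is just a set of named string values.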
In the end, SparkR / sparklyr should be used. Here I couldn't figure out where to put the spark.conf.set calls. I would have guessed something like:
sparkR.session(
sparkConfig = list(spark.driver.memory = "2g",
spark.conf.set("dfs.adls.oauth2.access.token.provider.type", "ClientCredential"),
spark.conf.set("dfs.adls.oauth2.client.id", "****"),
spark.conf.set("dfs.adls.oauth2.credential", "****"),
spark.conf.set("dfs.adls.oauth2.refresh.url", "https://login.microsoftonline.com/****/oauth2/token")
))
It would be awesome if one of the experts using the SparkR API could help me out here. Thanks!
EDIT: The answer by user10791349 is correct and it works. Another solution is mounting the external data source, which is best practice. This is currently only possible using Scala or Python, but the mounted data source is afterwards available using the SparkR API.
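For reference, a rough sketch of what the mounting approach might look like from a Python notebook cell. It assumes the notebook-provided `dbutils` object and its `dbutils.fs.mount(source, mount_point, extra_configs)` call; the wrapper name `mount_adls` and the `<account>`, `<directory>` and `<mount-name>` placeholders are hypothetical and have to be replaced with real values.

```python
def mount_adls(dbutils, source, mount_point, configs):
    """Mount a Data Lake location, passing the OAuth settings as extra configs.

    `dbutils` is the object Databricks notebooks provide; it is passed in
    explicitly so this function does not rely on a global.
    """
    dbutils.fs.mount(
        source=source,
        mount_point=mount_point,
        extra_configs=configs,
    )
    return mount_point


# In a notebook (placeholders are hypothetical; configs is a dict holding
# the same four dfs.adls.oauth2.* keys used with spark.conf.set):
# mount_adls(
#     dbutils,
#     source="adl://<account>.azuredatalakestore.net/<directory>",
#     mount_point="/mnt/<mount-name>",
#     configs=configs,
# )
```

Once mounted, the path under /mnt/<mount-name> should be usable as an ordinary file path from SparkR, without setting the OAuth configs in the R session.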
Upvotes: 2
Views: 1858
Reputation: 46
sparkConfig should be a named list of Spark configuration to set on worker nodes.
So the right format is:
sparkR.session(
  ... # All other options
  sparkConfig = list(
    spark.driver.memory = "2g",
    dfs.adls.oauth2.access.token.provider.type = "ClientCredential",
    dfs.adls.oauth2.client.id = "****",
    dfs.adls.oauth2.credential = "****",
    dfs.adls.oauth2.refresh.url = "https://login.microsoftonline.com/****/oauth2/token"
  )
)
Remember that many configuration options are recognized only if there is no active session.
Upvotes: 3