Reputation: 53
I am trying to connect Azure Databricks to Azure Data Lake Storage Gen2, and I am not able to get the client ID, secret scope, and key to match up.
I have data in an Azure Data Lake Gen2 store. I am trying to follow these instructions:
I have created a 'service principal' with the role "Storage Blob Data Contributor" and obtained the client ID and secret.
I have created secret scopes in both Azure Key Vault and Databricks, with keys and values.
When I try the code below, the authentication fails to recognize the secret scope and key. It is not clear to me from the documentation whether I need to use the Azure Key Vault secret scope or the Databricks one.
val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> "<CLIENT-ID>",
  // The secret is pulled from a secret scope rather than hard-coded
  "fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "<SCOPE-NAME>", key = "<KEY-VALUE>"),
  "fs.azure.account.oauth2.client.endpoint" -> "https://login.microsoftonline.com/XXXXXXXXXX/oauth2/token")
If anybody could help with this, please advise/confirm:
What should CLIENT-ID be? I understand this to come from the storage account;
Where should the SCOPE-NAME and KEY-VALUE be created: in Azure Key Vault or in Databricks?
Upvotes: 5
Views: 11971
Reputation: 141
I was facing the same issue. The only extra thing I did was to assign the application's default permissions on the Data Lake Gen2 blob container in Azure Storage Explorer. This requires the object ID of the application, which is not the one shown in the UI; it can be obtained by running "az ad sp show --id <APPLICATION-ID>" in the Azure CLI. After assigning the permission on the blob container, create a new file and then try to access it.
Upvotes: 0
Reputation: 2473
The XXXXXXXXXX in https://login.microsoftonline.com/XXXXXXXXXX/oauth2/token should be your Tenant ID (get this from the Azure Active Directory tab in the Portal > Properties > Directory ID).
The Client ID is the Application ID / Service Principal ID (sadly these names are used interchangeably in the Azure world, but they all refer to the same thing).
If you have not created a service principal yet, follow these instructions: https://learn.microsoft.com/en-us/azure/storage/common/storage-auth-aad-app#register-your-application-with-an-azure-ad-tenant - and make sure you grant the service principal access to your lake once it is created.
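Putting that together, the values slot into the config from your question like this (a sketch with illustrative placeholders):

val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  // Application ID (a.k.a. Service Principal ID / Client ID)
  "fs.azure.account.oauth2.client.id" -> "<APPLICATION-ID>",
  "fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "<SCOPE-NAME>", key = "<KEY-VALUE>"),
  // Directory ID (Tenant ID) replaces the XXXXXXXXXX part of the endpoint
  "fs.azure.account.oauth2.client.endpoint" -> "https://login.microsoftonline.com/<TENANT-ID>/oauth2/token")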
You should create a scope and secret for the service principal's key, as this is something you want to keep out of free text. You cannot create this in the Databricks UI (yet); use the Databricks CLI or the Secrets REST API instead.
Right now I do not think you can back a secret scope with Azure Key Vault, though I expect to see that in the future. Technically you could integrate with Key Vault manually using its APIs, but that would give you another headache: you would need a secret credential just to connect to Key Vault.
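Once your scope and secret exist, you can sanity-check the names from a notebook before wiring them into the config - a quick sketch using the standard dbutils secrets utilities (the scope name is a placeholder):

// Print all secret scopes visible to this workspace,
// then the secret keys inside the scope you intend to use.
dbutils.secrets.listScopes().foreach(println)
dbutils.secrets.list("<SCOPE-NAME>").foreach(println)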
Upvotes: 4