Sergey Konotop

Reputation: 121

Managed Service Identity configuration for Azure Data Lake Storage Gen2

I am trying to connect to Azure Data Lake Storage Gen2 using MSI (Azure Managed Identity) via the Hadoop client in the console, and I receive the error: ls: AADToken: HTTP connection failed for getting token from AzureAD. Http response: 400 Bad Request

Connection via Shared Key works fine.

What was done:

  1. Created a Windows 10 VM in Azure and installed the Hadoop client 3.2 from the Apache site and JRE 1.8.0
  2. Created a Storage account using https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-quickstart-create-account
  3. Created an Azure AD application using https://learn.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal
  4. Turned on the system-assigned managed identity for the VM as described here https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/qs-configure-portal-windows-vm
  5. Assigned the managed identity access to the Storage account as described here https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/howto-assign-access-portal

I connect using the command below:

hadoop fs -Dfs.azure.ssl.channel.mode=Default_JSSE -Dfs.azure.account.oauth.provider.type=org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider -Dfs.azure.account.auth.type=OAuth -Dfs.azure.account.oauth2.msi.tenant=<tenant_ID> -Dfs.azure.account.oauth2.client.id=<Client_ID> -ls abfss://<filesystem_name>2@<storage_account_name>.dfs.core.windows.net/

Is something wrong or missing? Please advise.

Thank you!

Upvotes: 1


Answers (1)

Joy Wang

Reputation: 42063

Adding my comment as an answer:

There is no need to do step 3. When you enable the MSI on the VM, a service principal is created in your tenant automatically. It has the same name as your VM and its own client ID. You can find it in the portal under Azure Active Directory -> Enterprise applications -> search for your VM name (with the filter set to All Applications).
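If you prefer the command line to the portal, the same lookup can be sketched with the Azure CLI. This is only a sketch: it assumes the Azure CLI is installed and you have already run az login, and the resource group and VM name are hypothetical placeholders that you need to replace with your own values.

```shell
# Hypothetical placeholders: replace with your own resource group and VM name.
RESOURCE_GROUP="<resource_group>"
VM_NAME="<vm_name>"

# Print the object (principal) ID of the VM's system-assigned managed identity.
# $1 = resource group, $2 = VM name
msi_principal_id() {
  az vm identity show --resource-group "$1" --name "$2" \
    --query principalId --output tsv
}

# Print the client (application) ID of the service principal behind the identity.
# $1 = principal ID returned by msi_principal_id
msi_client_id() {
  az ad sp show --id "$1" --query appId --output tsv
}

# Usage (requires a prior `az login`):
#   PRINCIPAL_ID="$(msi_principal_id "$RESOURCE_GROUP" "$VM_NAME")"
#   msi_client_id "$PRINCIPAL_ID"
```

The client ID printed by the second function belongs to the managed identity itself, not to the application registered in step 3.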

In step 5, you need to assign the MSI the Storage Blob Data Owner role.
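As a rough sketch, the role assignment could also be done with the Azure CLI instead of the portal. It assumes you have already run az login with an account that is allowed to create role assignments on the Storage account. The helper function names and the arguments passed to them (subscription ID, resource group, storage account name, principal ID) are hypothetical and only illustrate the shape of the call.

```shell
# Build the resource ID of a storage account, used as the role assignment scope.
# $1 = subscription ID, $2 = resource group, $3 = storage account name
storage_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
    "$1" "$2" "$3"
}

# Assign the Storage Blob Data Owner role to the VM's managed identity.
# $1 = object (principal) ID of the managed identity, $2 = scope from storage_scope
assign_blob_data_owner() {
  az role assignment create \
    --assignee-object-id "$1" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Owner" \
    --scope "$2"
}

# Usage (requires a prior `az login`):
#   SCOPE="$(storage_scope "<subscription_id>" "<resource_group>" "<storage_account_name>")"
#   assign_blob_data_owner "<principal_id>" "$SCOPE"
```

Role assignments can take a few minutes to propagate, so the hadoop command may keep failing for a short while after the role is added.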

Upvotes: 3
