BeGreen

Reputation: 951

AML compute instance and compute cluster RBAC roles

I have a Terraform script that creates a compute instance and a compute cluster, each with its own user-assigned managed identity. Both managed identities have exactly the same RBAC roles:

Storage Account Key Operator Service Role, Storage Blob Data Contributor, Cognitive Services Contributor, Contributor, Network Contributor, Key Vault Contributor, Support Request Contributor, AzureML Data Scientist, Cognitive Services User, Storage Queue Data Contributor, Data Factory Contributor, Cognitive Services Usages Reader, AcrPull, DocumentDB Account Contributor, Key Vault Secrets Officer, AzureML Compute Operator, Reader, Owner, AzureML Registry User, Storage Blob Data Reader, Storage Blob Data Owner, Storage Queue Data Reader.

Reading from and writing to ADLS Gen2 works correctly from the compute instance. But on the compute cluster I get this error:

azure.core.exceptions.HttpResponseError: Operation returned an invalid status 'This request is not authorized to perform this operation.' ErrorCode:AuthorizationFailure

I really don't understand why, since both have the same roles. I used the application ID of the compute cluster's identity to grant roles on Databricks and that worked, so I'm confused about how user-assigned managed identities work and why they fail in some cases.

Update: Here is how I connect to ADLS Gen2 from both the compute instance and the compute cluster. First I get the credentials, then I use the Data Lake client.

import os

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Singleton is a metaclass defined elsewhere in the project (not shown here).


class AzureFileStorage(metaclass=Singleton):
    """Azure File Storage Class (Singleton)"""

    def __init__(self, storage_account_name: str, file_system: str) -> None:
        """AFS will interact with ADLS Gen2 file system.

        Args:
            storage_account_name (str): storage account name
            file_system (str): file system name
        """
        self.storage_account_name = storage_account_name
        self.file_system = file_system
        # Authenticate with the managed identity whose client ID is given in
        # DEFAULT_IDENTITY_CLIENT_ID (None if the variable is not set).
        self.service_client = DataLakeServiceClient(
            account_url=f"https://{self.storage_account_name}.dfs.core.windows.net",
            credential=DefaultAzureCredential(
                managed_identity_client_id=os.environ.get(
                    "DEFAULT_IDENTITY_CLIENT_ID", None
                )
            ),
        )
        self.file_system_client = self.service_client.get_file_system_client(
            file_system=file_system
        )

    def mkdir(self, storage_dir_path: str) -> None:
        """Make directory on Azure File Storage. This method will create parent directories.

        Args:
            storage_dir_path (str): Directory path on Azure File Storage
        """
        if not self.test(storage_dir_path):
            self.file_system_client.create_directory(storage_dir_path)

    # self.test is another method of this class that checks whether the folder exists

Update 2: I also tried the system-assigned identity, without success.

[Screenshots: DEFAULT_IDENTITY_CLIENT_ID; Enterprise Application; RBAC roles]

Solution: The problem was in my https_proxy and no_proxy configuration. I had forgotten to include .blob.core.windows.net in no_proxy on the compute cluster, whereas it was set correctly on my compute instance.
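Since the setting was correct in one environment and wrong in the other, comparing the two no_proxy values directly would have shown the gap. Here is a minimal sketch of such a comparison. The function names and the example values are hypothetical, not taken from my actual setup.

```python
from __future__ import annotations


def parse_no_proxy(value: str) -> set[str]:
    """Split a comma-separated no_proxy value into a set of normalized entries."""
    return {entry.strip().lower() for entry in value.split(",") if entry.strip()}


def missing_entries(working: str, failing: str) -> list[str]:
    """Return entries present in the working no_proxy but absent from the failing one."""
    return sorted(parse_no_proxy(working) - parse_no_proxy(failing))


# Hypothetical values, for illustration only.
instance_no_proxy = "localhost,.blob.core.windows.net,.dfs.core.windows.net"
cluster_no_proxy = "localhost,.dfs.core.windows.net"

print(missing_entries(instance_no_proxy, cluster_no_proxy))
```

In practice the two strings would come from `os.environ.get("no_proxy", "")` printed in each environment.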

Upvotes: 0

Views: 366

Answers (1)

BeGreen

Reputation: 951

The issue was not with the RBAC roles or with the managed identity (system-assigned or user-assigned) of the compute cluster or compute instance.

It was actually a network issue. We pass HTTPS_PROXY and NO_PROXY to the Dockerfile that builds the base image for the experiment that runs on the compute cluster.

We need HTTPS_PROXY to pip install packages, and NO_PROXY so that calls to Azure services bypass the proxy.

I had forgotten to add the .blob.core.windows.net domain to NO_PROXY.

ENV https_proxy http://xxxxxxx.com:8080
ENV no_proxy localhost,.blob.core.windows.net,.azuresynapse.net,.table.core.windows.net,.queue.core.windows.net,.file.core.windows.net,.web.core.windows.net,.dfs.core.windows.net,.documents.azure.com,.batch.azure.com,.service.batch.azure.com,.vault.azure.net,.vaultcore.azure.net,.managedhsm.azure.net,.azmk8s.io,.search.windows.net,.azurecr.io,.azconfig.io,.servicebus.windows.net,.azure-devices.net,.servicebus.windows.net,.azure-devices-provisioning.net,.eventgrid.azure.net,.azurewebsites.net,.scm.azurewebsites.net,.api.azureml.ms,.notebooks.azure.net,.instances.azureml.ms,.aznbcontent.net,.inference.ml.azure.com,.cognitiveservices.azure.com,.afs.azure.net,.datafactory.azure.net,.adf.azure.com,.purview.azure.com,.azure-api.net,.developer.azure-api.net,.analysis.windows.net,.azuredatabricks.net,.azurefd.net,.vsblob.vsassets.io,.openai.azure.com
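To check whether a given endpoint would bypass the proxy under a no_proxy value like the one above, something along these lines may help. It is a rough sketch that assumes simple domain-suffix matching. Real HTTP clients differ in the details (ports, IP ranges, wildcards), so treat the result as an approximation. The function name and the storage account name are made up for the example.

```python
from urllib.parse import urlsplit


def bypasses_proxy(url_or_host: str, no_proxy: str) -> bool:
    """Return True if the host matches an entry of a comma-separated no_proxy value.

    Simplified suffix matching only; this is an approximation of what
    common HTTP clients do, not a copy of any specific implementation.
    """
    # Accept either a full URL or a bare hostname (without port).
    if "://" in url_or_host:
        host = urlsplit(url_or_host).hostname or ""
    else:
        host = url_or_host
    host = host.lower()

    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry == "*":
            # A lone "*" conventionally means "never use the proxy".
            return True
        suffix = entry.lstrip(".")
        if host == suffix or host.endswith("." + suffix):
            return True
    return False


# Hypothetical no_proxy value that is missing the blob endpoint.
no_proxy = "localhost,.dfs.core.windows.net"
print(bypasses_proxy("https://mystorageaccount.dfs.core.windows.net", no_proxy))   # True
print(bypasses_proxy("https://mystorageaccount.blob.core.windows.net", no_proxy))  # False
```

Running this inside the container with `os.environ.get("no_proxy", "")` as the second argument shows which Azure endpoints still go through the proxy.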

Upvotes: 0
