Reputation: 1
I have a pipeline in Azure Data Factory that moves data from Google BigQuery (GBQ) to Azure Data Lake (Gen1), with some cleaning in Azure Databricks in between.
The first copy activity copies data from GBQ to the Data Lake, then the data goes through Databricks, and finally the last activity copies the data to a blob container.
Out of the 4 initial copy activities, one randomly fails with the following error:
Failure happened on 'Sink' side. ErrorCode=UserErrorAdlsFileWriteFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Writing to 'AzureDataLakeStore' failed. Path: /.../.../PageTracking_06072021.csv. Message: The remote server returned an error: (403) Forbidden.. Response details: {"RemoteException":{"exception":"AccessControlException","message":" [......] Access Denied : /../../PageTracking_06072021.csv[.....]","javaClassName":"org.apache.hadoop.security.AccessControlException"}},Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.WebException,Message=The remote server returned an error: (403) Forbidden.,Source=System,'
When I run the pipeline again, the previously failed activity succeeds, and others fail with the same error.
What I have tried so far:
Tried deleting the files and running fresh, but everything succeeds only on the first run, and then the cycle repeats itself. Tried changing the sequence of the activities (as you can see in the image); I still get the same error randomly.
Access cannot be the issue, because the same integration runtime (IR) and configuration are used in all the activities.
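For reference, one way to double-check that is to inspect the effective ACLs on the path that failed. A minimal sketch using the azure-datalake-store Python SDK for Gen1; the store name, credentials, and path below are placeholders, not values from my setup:

```python
from azure.datalake.store import core, lib

# Authenticate with a service principal (placeholder credentials).
token = lib.auth(tenant_id="<tenant-id>",
                 client_id="<client-id>",
                 client_secret="<client-secret>")
adls = core.AzureDLFileSystem(token, store_name="<datalake-store-name>")

# Print the ACL entries and owner of the folder the sink writes into.
print(adls.get_acl_status("/path/to/sink/folder"))
```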
Update:
I have set up a trigger (once daily) for the pipeline and the pipeline runs fine. The problem happens only when I try to run the pipeline manually.
Upvotes: 0
Views: 327
Reputation: 3230
Check that the credentials provided in the linked service have the permissions required to write to the Azure Data Lake Storage folder to which you are writing the file.
The access must be granted starting from the root folder.
In Storage Explorer, set the permissions of the service principal: grant at least Execute permission on every folder starting from the sink file system, along with Write permission on the sink folder. Also try granting at least the Storage Blob Data Contributor role in Access control (IAM).
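The same grants can be applied programmatically instead of through Storage Explorer. A minimal sketch using the azure-datalake-store Python SDK for Gen1; the store name, tenant/client IDs, object ID of the Data Factory service principal, and paths are placeholders you would substitute:

```python
from azure.datalake.store import core, lib

# Authenticate as a principal that can administer ACLs (placeholder credentials).
token = lib.auth(tenant_id="<tenant-id>",
                 client_id="<admin-client-id>",
                 client_secret="<admin-client-secret>")
adls = core.AzureDLFileSystem(token, store_name="<datalake-store-name>")

SP_OBJECT_ID = "<adf-service-principal-object-id>"  # placeholder

# Execute permission is needed on every folder from the root down to the sink...
for folder in ["/", "/sink-filesystem"]:
    adls.modify_acl_entries(folder, acl_spec=f"user:{SP_OBJECT_ID}:--x")

# ...plus Write and Execute on the sink folder itself. The default ACL entry
# makes files created later (such as the copy activity's output) inherit it.
adls.modify_acl_entries("/sink-filesystem/sink-folder",
                        acl_spec=f"user:{SP_OBJECT_ID}:-wx")
adls.modify_acl_entries("/sink-filesystem/sink-folder",
                        acl_spec=f"default:user:{SP_OBJECT_ID}:rwx")
```

Granting the default ACL as well as the access ACL matters here: without it, files written into the folder after the grant do not inherit the service principal's permissions, which can produce exactly this kind of intermittent 403.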
Upvotes: 0