user2725090

Reputation: 1

Trigger Azure data factory pipeline - Blob upload ADLS Gen2 (programmatically)

We are uploading files into Azure Data Lake Storage using the Azure SDK for Java. After each file upload, an Azure Data Factory pipeline needs to be triggered, so a Blob Created event trigger is attached to the pipeline. The main problem is that the pipeline is triggered twice after each file upload.

To upload a file into ADLS Gen2, Azure provides a different SDK than the one for conventional Blob storage.

The SDK uses the package azure-storage-file-datalake.

DataLakeFileSystemClient - to get the container

DataLakeDirectoryClient.createFile - to create a file //this call may be raising a Blob Created event

DataLakeFileClient.uploadFromFile - to upload the file //this call may also be raising a Blob Created event
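For reference, the upload path described above looks roughly like this with the Java SDK (account name, key, container, and file paths are placeholder assumptions). Each of the two commented calls is a candidate source of a separate Blob Created event:

```java
import com.azure.storage.common.StorageSharedKeyCredential;
import com.azure.storage.file.datalake.DataLakeDirectoryClient;
import com.azure.storage.file.datalake.DataLakeFileClient;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
import com.azure.storage.file.datalake.DataLakeServiceClientBuilder;

public class AdlsUpload {
    public static void main(String[] args) {
        // Placeholder account name and key -- substitute your own.
        DataLakeServiceClient serviceClient = new DataLakeServiceClientBuilder()
                .endpoint("https://myaccount.dfs.core.windows.net")
                .credential(new StorageSharedKeyCredential("myaccount", "account-key"))
                .buildClient();

        // Get the container (file system) and the target directory.
        DataLakeFileSystemClient fileSystem = serviceClient.getFileSystemClient("my-container");
        DataLakeDirectoryClient directory = fileSystem.getDirectoryClient("incoming");

        // Creates a zero-length file -- may raise the first Blob Created event.
        DataLakeFileClient file = directory.createFile("data.csv");

        // Uploads and flushes the content -- may raise a second Blob Created event.
        file.uploadFromFile("/local/path/data.csv", true);
    }
}
```

This is a sketch, not a runnable sample, since it requires the azure-storage-file-datalake dependency and valid storage credentials.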

I suspect the ADF trigger has not been updated to capture Blob Created events from ADLS Gen2 appropriately.

Is there any option to achieve this? There are restrictions in my org against using Azure Functions; otherwise, an Azure Function could be triggered by a Storage Queue or Service Bus message and the ADF pipeline could be started via the Data Factory REST API.
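For context, the Data Factory REST call alluded to above is the pipeline "Create Run" endpoint. A minimal sketch, assuming placeholder subscription, resource group, factory, and pipeline names (the actual POST also needs an Azure AD bearer token, shown commented out):

```shell
# Hypothetical identifiers -- replace with your own.
SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP="my-rg"
FACTORY_NAME="my-adf"
PIPELINE_NAME="my-pipeline"

# Data Factory "Create Run" endpoint (api-version 2018-06-01).
URL="https://management.azure.com/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.DataFactory/factories/${FACTORY_NAME}/pipelines/${PIPELINE_NAME}/createRun?api-version=2018-06-01"
echo "$URL"

# With a valid bearer token in $TOKEN, the actual call would be:
# curl -X POST "$URL" -H "Authorization: Bearer $TOKEN" -H "Content-Length: 0"
```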

Upvotes: 0

Views: 1659

Answers (1)

Leon Yue

Reputation: 16401

You could try Azure Logic Apps with a Blob trigger and a Data Factory action:

Trigger: When a blob is added or modified (properties only):

  • This operation triggers a flow when one or more blobs are added or modified in a container. The trigger only fetches the file metadata; to get the file content, use the "Get file content" operation. The trigger does not fire if a file is added or updated in a subfolder; if triggering on subfolders is required, multiple triggers should be created.

Action: Get a pipeline run

  • Get a particular pipeline run execution

Hope this helps.

Upvotes: 1
