ericOnline

Reputation: 1988

Why does Azure Data Factory use AppendFile instead of PutBlob to write files to a blob storage container?

I've stood up a Log Analytics Workspace and created (blob) Diagnostic Settings for some Azure Storage Accounts. Now I'm analyzing the blob traffic.

It seems that the various ways of getting blobs into blob storage (Azure Data Factory (ADF), Azure Storage Explorer (ASE), the Python SDK, etc.) use different API methods out of the box.

Example:

[Screenshot: logged operations for the different upload methods]

Question: Why does ADF use AppendFile instead of PutBlob?

Also, I don't see AppendFile listed as a method in the Blob Service REST API docs.

Upvotes: 0

Views: 578

Answers (2)

Neeraj

Reputation: 168

If you are using an Azure Data Factory data flow and the primary id is the same, it usually performs the AppendFile operation.

Upvotes: 0

Joy Wang

Reputation: 42123

I can reproduce your issue on my side. I suppose your storage account is a Data Lake Storage Gen2 account, i.e. hierarchical namespace is enabled, like below.

[Screenshot: storage account with hierarchical namespace enabled]

When you use the copy activity in ADF to copy blobs between containers (also called filesystems in Data Lake Storage Gen2), it calls the Data Lake Storage Gen2 REST API (Path - Update) instead of the normal Storage REST API. If you look at the Uri parameter in the log, you will find its format is like below.

[Screenshot: Uri value from the log]

It matches the sample in the REST API documentation, because ADF is essentially calling this API.

[Screenshot: Path - Update sample request from the REST API documentation]
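If you want to check this across many log rows rather than by eye, here is a minimal sketch in Python. It assumes the public-cloud endpoints (`<account>.dfs.core.windows.net` for the Data Lake Storage Gen2 API, `<account>.blob.core.windows.net` for the normal Blob API) and that Path - Update calls carry an `action=append` or `action=flush` query parameter. The function name `describe_request` and the account, filesystem, and path names are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def describe_request(uri: str) -> str:
    """Guess which REST API family a logged request Uri belongs to."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    # parse_qs returns a list per key; take the first "action" value if any
    action = parse_qs(parts.query).get("action", [None])[0]

    if ".dfs." in host:
        # append and flush are actions of the Gen2 Path - Update operation
        if action in ("append", "flush"):
            return f"Data Lake Storage Gen2 API, Path - Update (action={action})"
        return "Data Lake Storage Gen2 API"
    if ".blob." in host:
        return "Blob Storage REST API"
    return "unknown"


# Hypothetical Uri values, shaped like the two endpoints discussed above
print(describe_request(
    "https://myaccount.dfs.core.windows.net/myfs/dir/file.csv?action=append&position=0"
))
print(describe_request(
    "https://myaccount.blob.core.windows.net/mycontainer/file.csv"
))
```

A Uri on the `dfs` endpoint with `action=append` is the kind of call that appears as AppendFile in the question's logs.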

Even though it is a Data Lake Storage Gen2 account, the normal Storage REST API also works against it, so a tool like Azure Storage Explorer calls the normal Storage REST API directly, i.e. Put Blob.
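To make the difference concrete, here is a rough sketch of what the two upload styles look like at the REST level. It only builds the request lines and sends nothing. It is not a trace of what ADF or Storage Explorer actually send; the helper names and the account/container/file names are hypothetical, and authentication and version headers are left out.

```python
def put_blob_requests(account: str, container: str, blob: str):
    """Blob REST API: a small block blob can be written in one Put Blob call."""
    url = f"https://{account}.blob.core.windows.net/{container}/{blob}"
    # Put Blob requires the x-ms-blob-type header
    return [("PUT", url, {"x-ms-blob-type": "BlockBlob"})]


def dfs_upload_requests(account: str, filesystem: str, path: str, size: int):
    """Gen2 REST API: create the file, append the bytes, then flush them."""
    base = f"https://{account}.dfs.core.windows.net/{filesystem}/{path}"
    return [
        ("PUT", f"{base}?resource=file", {}),                   # Path - Create
        ("PATCH", f"{base}?action=append&position=0", {}),      # Path - Update (append)
        ("PATCH", f"{base}?action=flush&position={size}", {}),  # Path - Update (flush)
    ]


for method, url, headers in put_blob_requests("myaccount", "mycontainer", "file.csv"):
    print(method, url, headers)

for method, url, headers in dfs_upload_requests("myaccount", "mycontainer", "file.csv", 1024):
    print(method, url, headers)
```

Both sequences target the same container and file, only through different endpoints, which is why the two tools show up under different operation names in the logs.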

Upvotes: 2
