Reputation: 270
I have a "Continuous Export" process in my Application Insights that create new files based in my new Insights.
Beside that I have a process, using Azure Data Factory, to load one SQL Table with the Blob Storage data.
The problem: I cannot read from A.D.F., only new files from Blob Storage and I'm always processing the same data. At this moment I'm ignoring repeated data after the load process in a SQL Stored Procedure but I want to make this process more efficient by reading only new data from blob storage, can I do this from A.D.F.? Can anyone help me? Which are the alternatives to achieve this?
Best Regards, Rui Fernandes
Upvotes: 0
Views: 2037
Reputation: 63
I recommend you archive old blobs programmatically (in a custom pipeline) by renaming them to "Archive/oldBlobName". After you do that, when you iterate over the segmented blob result (the list of blobs contained in the container you specified in the dataset) on the next run, just skip the ones whose name starts with "Archive".
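For context, the blobList used in the snippet below would come from a segmented listing of the container. This is only a minimal sketch, assuming the classic WindowsAzure.Storage SDK; the connection string and container name are placeholders:
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

// Hypothetical setup: connect to the storage account and the container used by the dataset.
CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
CloudBlobContainer container = account.CreateCloudBlobClient()
                                      .GetContainerReference("containername");

// Flat segmented listing (prefix null, flat listing true, no extra details, no max,
// no continuation token yet, default options/context). Pass the returned
// ContinuationToken back in to page through large containers.
BlobResultSegment blobList = container.ListBlobsSegmented(
    null, true, BlobListingDetails.None, null, null, null, null);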
foreach (IListBlobItem listBlobItem in blobList.Results)
{
    CloudBlockBlob inputBlob = listBlobItem as CloudBlockBlob;
    // Skip anything that is not a block blob, has no name,
    // or has already been moved under the "Archive/" prefix
    if ((inputBlob == null) || string.IsNullOrEmpty(inputBlob.Name)
        || inputBlob.Name.ToLower().StartsWith("archive"))
    {
        continue;
    }
    ...
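Azure Blob Storage has no rename operation, so the "move to Archive/" step can be sketched as a server-side copy followed by a delete. A minimal sketch, assuming the same classic WindowsAzure.Storage SDK; the helper name is mine, not from the original answer:
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob;

// Hypothetical helper: "rename" a processed blob by copying it under "Archive/"
// and deleting the original.
private static async Task ArchiveBlobAsync(CloudBlobContainer container, CloudBlockBlob inputBlob)
{
    CloudBlockBlob archived = container.GetBlockBlobReference("Archive/" + inputBlob.Name);

    // Server-side copy; in production, poll archived.CopyState until the copy has completed.
    await archived.StartCopyAsync(inputBlob);

    // Remove the original so the next run's listing no longer returns it at the top level.
    await inputBlob.DeleteIfExistsAsync();
}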
Upvotes: 0
Reputation: 24549
What are the alternatives to achieve this?
If a WebJob is acceptable, we could do this easily with a WebJob blob trigger, since the trigger only fires for blobs that have not been processed yet. We can get more info about WebJob triggers from the official Azure documentation.
The following is demo code:
public static void ProcessBlobTrigger([BlobTrigger("containername/{name}")] TextReader input, TextWriter log)
{
    // your logic to process data
}
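For completeness, here is a hedged sketch of what the trigger body could look like if it pushes each exported blob into SQL. The table name (dbo.InsightsRaw), connection string name ("TargetDb"), and raw-payload schema are assumptions for illustration, not part of the original answer:
using System.Configuration;
using System.Data.SqlClient;
using System.IO;
using Microsoft.Azure.WebJobs;

public static void LoadExportedBlobIntoSql(
    [BlobTrigger("containername/{name}")] TextReader input,
    string name,
    TextWriter log)
{
    // Read the whole exported blob and load it as a raw row; downstream SQL can parse it.
    string payload = input.ReadToEnd();
    string connectionString = ConfigurationManager.ConnectionStrings["TargetDb"].ConnectionString;

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(
        "INSERT INTO dbo.InsightsRaw (BlobName, Payload) VALUES (@name, @payload)", connection))
    {
        command.Parameters.AddWithValue("@name", name);
        command.Parameters.AddWithValue("@payload", payload);
        connection.Open();
        command.ExecuteNonQuery();
    }

    log.WriteLine("Loaded blob " + name + " into SQL.");
}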
Upvotes: 0