Reputation: 1412
I want to process a CSV file when it is uploaded to blob storage. For this requirement I am writing a WebJob with a blob trigger.
To make sure the CSV is always processed, I am writing one more WebJob with a blob trigger.
So if one WebJob fails, the other WebJob will process the CSV.
Now, my problem is that when both WebJobs are running, they process the same CSV file and end up creating duplicate data.
How can I lock the file so that only one WebJob will process the CSV file?
Or
How can I trigger the second WebJob if the first one is going to shut down?
Upvotes: 0
Views: 305
Reputation: 7686
I like Amor's solution, but have a few suggestions to add to it.
If you abandon the BlobTrigger approach and instead enqueue a Service Bus queue message indicating the blob that needs to be processed, you can trigger your processing with a ServiceBusTrigger. In the event that an exception occurs, abandon the message and it will become available for another processing attempt. This lets you run only one WebJob and still have redundancy.
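A minimal sketch of such a function, assuming the WebJobs SDK Service Bus extension and an illustrative queue name of csv-to-process:
public static void ProcessCsvMessage(
    [ServiceBusTrigger("csv-to-process")] string blobName,
    TextWriter log)
{
    log.WriteLine("Processing blob: " + blobName);
    // Download and process the CSV named by the message here.
    // If this method throws, the WebJobs SDK abandons the message, so it
    // becomes visible again for another processing attempt; after
    // MaxDeliveryCount failed attempts Service Bus dead-letters it.
}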
The other advantage of using a Service Bus queue is that you can choose between guaranteed at-least-once and at-most-once delivery, along with guaranteed message locking while a message is being processed. This is not the case with a standard Storage queue. It also gives you a scalability option for the future: you could add a second WebJob instance monitoring the same Service Bus queue.
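For completeness, enqueueing the message when a blob is uploaded could look like the following sketch (using the WindowsAzure.ServiceBus package; the connection string and queue name are placeholders):
// Namespace Microsoft.ServiceBus.Messaging, WindowsAzure.ServiceBus package.
var client = QueueClient.CreateFromConnectionString(
    "<service-bus-connection-string>", "csv-to-process");
// The message body carries the name of the blob to process.
client.Send(new BrokeredMessage("myfile.csv"));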
Upvotes: 0
Reputation: 8491
How can I trigger the second WebJob if the first one is going to shut down?
I suggest using a try-catch block to handle exceptions in your first WebJob. If an exception occurs, we can write the blob name to a queue, which will trigger the other WebJob.
public static void ProcessCSVFile(
    [BlobTrigger("input/{blobname}.csv")] TextReader input,
    [Queue("myqueue")] out string outputBlobName,
    string blobname)
{
    try
    {
        // Process the CSV file.
        // If no exception occurs, leave outputBlobName null so that
        // no queue message is written.
        outputBlobName = null;
    }
    catch
    {
        // Add the blob name to the queue; the RepeatProcessCSVFile
        // function in the other WebJob will then be triggered.
        outputBlobName = blobname;
    }
}
We can create a QueueTrigger function in the other WebJob. In this function, we read the blob name from the queue message and re-process the CSV. If another exception occurs, we re-add the blob name to the queue, so the function will be executed again until the CSV file has been processed successfully.
public static void RepeatProcessCSVFile(
    [QueueTrigger("myqueue")] string blobName,
    // Bind the blob named in the queue message so its content can be read.
    [Blob("input/{queueTrigger}.csv")] TextReader input,
    [Queue("myqueue")] out string outputBlobName)
{
    try
    {
        // Re-process the CSV file.
        // If no exception occurs, leave outputBlobName null so that
        // no new queue message is written.
        outputBlobName = null;
    }
    catch
    {
        // Re-add the blob name to the queue; this function will run
        // again until the CSV file has been handled successfully.
        outputBlobName = blobName;
    }
}
Upvotes: 2