Ren

Reputation: 1503

ASP.NET MVC 5 - Long Running Task - How to ensure that a worker thread won't be thrown away when IIS recycles the AppPool?

I have a data-processing MVC application that handles uploaded files ranging from 100 MB to 2 GB and performs a couple of long-running operations. Users upload the files, the data in them is processed, and finally some analysis of the data is sent to related users/clients.

Processing the data takes at least a couple of hours, so to keep the user from waiting that long, I've spun up a separate task for the long-running operation. Once the files are received by the server and stored on disk, the user gets a response with a ReferenceID and can close the browser.

So far it's been working as intended, but after reading up on the problems with the fire-and-forget pattern in MVC and worker threads being thrown away by IIS during recycling, I have concerns about this approach.

Is this approach still safe? If not, how can I ensure, in a relatively simple way, that the thread processing the data doesn't die before it finishes processing and sends the data to clients?

The app runs on .NET 4.5, so I don't think I can use HostingEnvironment.QueueBackgroundWorkItem at the moment.

Does using async/await in the controller help?

I've also thought of using a message queue on the app server: store a message once the files are saved to disk, make the DataProcessor a separate service/process, and have it listen to the queue. If the queue is recoverable, the messages will always be processed eventually, even if the server crashes or the thread is thrown away before it finishes processing the data. Is this a better approach?

My current setup looks something like this:

Controller

public ActionResult ProcessFiles()
{
    HttpFileCollectionBase uploadedFiles = Request.Files;

    var isValid = ValidateService.ValidateFiles(uploadedFiles);

    if (!isValid)
    {
        return View("Error");
    }

    // Saves the files to disk and starts the long-running work in the background.
    var referenceId = DataProcessor.ProcessFiles(uploadedFiles);

    return View(referenceId);
}

Business Logic

public class DataProcessor
{
    public int ProcessFiles(HttpFileCollectionBase uploadedFiles)
    {
        var referenceId = GetUniqueReferenceIdForCurrentSession();

        var location = SaveIncomingFilesToDisk(referenceId, uploadedFiles);

        // ProcessData makes a DB call and takes a few hours to complete.
        // Fire-and-forget: nothing awaits or observes this task.
        Task.Factory.StartNew(() => ProcessData(referenceId, location))
            .ContinueWith(prevTask =>
            {
                Log.Info("Completed Processing. Carrying on with other work");

                // The method below takes about 30 minutes to an hour.
                SendDataToRelatedClients(referenceId);
            });

        return referenceId;
    }
}

References

http://blog.stephencleary.com/2014/06/fire-and-forget-on-asp-net.html

Apppool recycle and Asp.net with threads?

Upvotes: 1

Views: 4411

Answers (2)

Aaron Hudon

Reputation: 5839

No, it is not safe. Create a service application on your server that handles these requests and publishes the result. If you are hosted on Azure, take advantage of their WebJob service.
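
As a rough illustration of what such a service application's main loop might look like, here is a minimal sketch. It assumes the web app drops one "<referenceId>.job" file per upload into a shared directory, with the upload location as the file's content. The class name, the file convention, and the processJob callback are all hypothetical and only stand in for the real processing code.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical stand-alone worker (the body of a console app or Windows service).
// Assumption: the web app writes "<referenceId>.job" files into jobDirectory.
public static class DataProcessorWorker
{
    // Processes every pending job file once and returns how many were handled.
    public static int ProcessPending(string jobDirectory, Action<string, string> processJob)
    {
        int handled = 0;
        foreach (string jobFile in Directory.GetFiles(jobDirectory, "*.job"))
        {
            string referenceId = Path.GetFileNameWithoutExtension(jobFile);
            string location = File.ReadAllText(jobFile);

            // Stand-in for the hours-long processing and the client notification.
            processJob(referenceId, location);

            // Delete only after success, so a crash leaves the job file on disk
            // and the next run picks it up again.
            File.Delete(jobFile);
            handled++;
        }
        return handled;
    }

    // Polls the directory until cancellation is requested (e.g. on service stop).
    public static void Run(string jobDirectory, Action<string, string> processJob, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            ProcessPending(jobDirectory, processJob);
            // Sleep for five seconds, waking early if cancellation is signalled.
            token.WaitHandle.WaitOne(TimeSpan.FromSeconds(5));
        }
    }
}
```

Because this process is not hosted by IIS, an AppPool recycle does not touch it. The sketch does no error handling: if processJob throws, the exception propagates and the job file stays on disk for a later retry.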

Upvotes: 1

Stephen Cleary

Reputation: 456407

Is this approach still safe?

It was never safe.

Does using async/await in the controller help?

No.

The app runs on .NET 4.5, so I don't think I can use HostingEnvironment.QueueBackgroundWorkItem at the moment.

I have an AspNetBackgroundTasks library that essentially does the same thing as QueueBackgroundWorkItem (with minor differences). However...

I've also thought of using a message queue on the app server: store a message once the files are saved to disk, make the DataProcessor a separate service/process, and have it listen to the queue. If the queue is recoverable, the messages will always be processed eventually, even if the server crashes or the thread is thrown away before it finishes processing the data. Is this a better approach?

Yes. This is the only reliable approach. It's what I call the "proper distributed architecture" in my blog post.
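
To make "recoverable" concrete, here is a minimal sketch of a file-backed queue. It is not taken from the blog post, and a real deployment would more likely use an existing durable queue product. FileJobQueue and its method names are hypothetical. The sketch assumes a single worker process and a queue directory on one volume, so that renames do not cross volumes.

```csharp
using System;
using System.IO;

// Hypothetical file-backed queue; all names are illustrative.
// A job's state is encoded in its file extension:
//   .job (pending) -> .processing (claimed) -> deleted (done)
public sealed class FileJobQueue
{
    private readonly string _dir;

    public FileJobQueue(string directory)
    {
        _dir = directory;
        Directory.CreateDirectory(_dir);
    }

    // Called by the web app after the uploaded files are saved to disk.
    public void Enqueue(string referenceId, string location)
    {
        string temp = Path.Combine(_dir, referenceId + ".tmp");
        File.WriteAllText(temp, location);
        // Rename last, so the worker never sees a half-written message.
        File.Move(temp, Path.Combine(_dir, referenceId + ".job"));
    }

    // Called by the worker: claims one pending job, or returns null if none.
    public string TryClaim()
    {
        foreach (string pending in Directory.GetFiles(_dir, "*.job"))
        {
            string claimed = Path.ChangeExtension(pending, ".processing");
            try
            {
                File.Move(pending, claimed);
                return Path.GetFileNameWithoutExtension(claimed);
            }
            catch (IOException)
            {
                // The file vanished or was claimed elsewhere; try the next one.
            }
        }
        return null;
    }

    // Reads the payload (the upload location) of a claimed job.
    public string ReadLocation(string referenceId)
    {
        return File.ReadAllText(Path.Combine(_dir, referenceId + ".processing"));
    }

    // Called only after processing and sending have fully completed.
    public void Complete(string referenceId)
    {
        File.Delete(Path.Combine(_dir, referenceId + ".processing"));
    }

    // Called at worker startup: anything still marked ".processing" was
    // interrupted by a crash or restart, so return it to the pending state.
    // Safe only under the single-worker assumption stated above.
    public int RecoverInterrupted()
    {
        int recovered = 0;
        foreach (string stale in Directory.GetFiles(_dir, "*.processing"))
        {
            File.Move(stale, Path.ChangeExtension(stale, ".job"));
            recovered++;
        }
        return recovered;
    }
}
```

The message is removed only in Complete, so a job interrupted halfway is retried after a restart. That means the processing step has to tolerate running more than once for the same reference ID.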

Upvotes: 4
