Rasik

Reputation: 2420

Concurrency Issue with DbContext in ASP.NET Core Application

In the application, HomeController triggers OCR (Optical Character Recognition) on book chapters through the BackgroundJobService.

The HomeController is defined as follows:

public class HomeController : Controller
{
    private readonly ILogger<HomeController> _logger;
    private readonly BookStoreContext _context;
    private readonly IBackgroundJobService _jobService;

    public HomeController(ILogger<HomeController> logger, BookStoreContext context, IBackgroundJobService jobService)
    {
        _context = context;
        _logger = logger;
        _jobService = jobService;
    }

    public async Task<IActionResult> RunOCR()
    {
        await _jobService.RunOCRJob();
        return View();
    }
}

The BackgroundJobService class performs the OCR and database operations:

public class BackgroundJobService : IBackgroundJobService
{
    private readonly ILogger<BackgroundJobService> _logger;
    private readonly BookStoreContext _context;
    private readonly IFileHandleService _fileService;
    private readonly IOcrService _ocrService;

    public BackgroundJobService(BookStoreContext context, IFileHandleService fileService, IOcrService ocrService, ILogger<BackgroundJobService> logger)
    {
        _logger = logger;
        _context = context;
        _fileService = fileService; // used by RunBackgroundJob below
        _ocrService = ocrService;
    }

    public async Task RunOCRJob()
    {
        _logger.LogInformation("Starting OCR job.");

        var toRunOCR = await _context.BookChapters
            .Include(b => b.Book)
            .Include(t => t.Content)
            .Where(c => !c.IsOcrRequested && c.Book.ChapterExtracted)
            .OrderBy(o => o.Id)
            .Take(20)
            .ToListAsync();

        var ocrTasks = new List<Task>();

        foreach (var chapter in toRunOCR)
        {
            var bookChapterVm = new ChapterOCRViewModel(chapter.Id, chapter.Path);
            ocrTasks.Add(RunBackgroundJob(bookChapterVm));
        }

        await Task.WhenAll(ocrTasks);

        await _context.SaveChangesAsync();

        _logger.LogInformation("OCR job completed.");
    }

    public async Task RunBackgroundJob(ChapterOCRViewModel chapter)
    {
        try
        {
            var chapterFilePath = _fileService.GetChapterFilePath(chapter.Path);
            var pageContent = await _ocrService.ReadTextFromPdfAsync(chapterFilePath);

            var chapterEntity = await _context.BookChapters
                .Include(cont => cont.Content)
                .FirstOrDefaultAsync(c => c.Id == chapter.Id);

            if (chapterEntity.Content == null)
                chapterEntity.Content = new ChapterContent();

            chapterEntity.Content.Text = pageContent;
            chapterEntity.OcrAt = DateTime.UtcNow;
            chapterEntity.Status = StatusEnum.ScheduledIndex;
            chapterEntity.IsOcrRequested = true;

            _logger.LogInformation($"OCR for Chapter ID: {chapter.Id} completed successfully.");
        }
        catch (Exception ex)
        {
            _logger.LogError($"Error while OCR for Chapter ID: {chapter.Id}. Error message: {ex.Message}");
        }
    }
}

The issue I am facing is that when I execute the RunOCR action in the HomeController, which calls the RunOCRJob method in the BackgroundJobService, I encounter the following error:

fail: Microsoft.EntityFrameworkCore.Query[10100] An exception occurred while iterating over the results of a query for context type 'OTTT.Db.BookStoreContext'. System.InvalidOperationException: A second operation was started on this context instance before a previous operation was completed. This is usually caused by different threads concurrently using the same instance of DbContext. For more information on how to avoid threading issues with DbContext, see https://go.microsoft.com/fwlink/?linkid=2097913.

The query uses Take, so it returns only 20 records, and while processing those 20 I get the error for only 2 or 3 of them.

I understand that the error is related to DbContext concurrency, and I suspect it might be caused by the multiple background tasks accessing the same instance of the BookStoreContext. I have been using Dependency Injection to manage the DbContext's lifetime, but the issue persists.

I am stuck on how to resolve this concurrency issue and safely use DbContext within my background job.

Edit: I switched to the DbContext factory but got the same error:

public class BackgroundJobService : IBackgroundJobService
{
    private readonly IServiceProvider _serviceProvider;
//...
    public BackgroundJobService(IDbContextFactory contextFactory, IFileHandleService fileService, BookStoreContext context, IOcrService ocrService,
    ILogger<BackgroundJobService> logger, IServiceProvider serviceProvider)
    {
        _logger = logger;
        _context = context;
        _fileService = fileService;
        _ocrService = ocrService;
        _contextFactory = contextFactory;
    }
    public async Task RunOCRJob()
    {
        _logger.LogInformation("Starting OCR job.");

        using (var context = _contextFactory.Create())
        {
            var toRunOCR = await context.BookChapters
                                .Include(b => b.Book)
                                .Include(t => t.Content)
                                .Where(c => !c.IsOcrRequested && c.Book.ChapterExtracted)
                                .OrderBy(o => o.Id)
                                .Take(20)
                                .ToListAsync();
    //...
            await context.SaveChangesAsync();
        }

        _logger.LogInformation("OCR job completed.");
    }

    public async Task RunBackgroundJob(BookStoreContext context, ChapterOCRViewModel chapter)
    {
        try
        {
    // removed code

            var chapterEntity = await context.BookChapters
                .Include(cont => cont.Content)
                .FirstOrDefaultAsync(c => c.Id == chapter.Id);
    
    // removed code


        }
        catch (Exception ex)
        {
            _logger.LogError($"Error while OCR for Chapter ID: {chapter.Id}. Error message: {ex.Message}");
        }
    }

Upvotes: 0

Views: 653

Answers (2)

Guru Stron

Reputation: 143098

There are several problems here:

RunBackgroundJob returns an already started task, so the following:

foreach (var chapter in toRunOCR)
{
    var bookChapterVm = new ChapterOCRViewModel(chapter.Id, chapter.Path);
    ocrTasks.Add(RunBackgroundJob(bookChapterVm));
}

just creates toRunOCR.Count tasks that may query the same context in parallel. As a quick fix, use the DbContext factory (or a scope) inside RunBackgroundJob as well, so that each task gets its own context. In general, though, I highly recommend implementing the job as a hosted service; see the "Queued background tasks" section of the docs. In that setup you put job descriptors on some kind of concurrent queue (ChapterOCRViewModel seems to be a good candidate) and a hosted service processes them in the background.

Such an implementation lets you control the concurrency of the processing. The current implementation, even if it worked, could issue an unbounded number of parallel requests to the database.
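As a rough illustration of the quick fix, here is a minimal sketch in which every task creates and disposes its own context. It assumes EF Core's generic IDbContextFactory<BookStoreContext> (registered with AddDbContextFactory, EF Core 5 or later) rather than the non-generic factory shown in the question's edit. The SemaphoreSlim limit of 4 is an arbitrary value added here to bound the number of parallel database calls, and the null check on the loaded chapter is also an addition. The entity and service types are taken from the question as-is.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

public class BackgroundJobService : IBackgroundJobService
{
    private const int MaxParallelJobs = 4; // arbitrary limit (assumption)

    private readonly IDbContextFactory<BookStoreContext> _contextFactory;
    private readonly IFileHandleService _fileService;
    private readonly IOcrService _ocrService;
    private readonly ILogger<BackgroundJobService> _logger;

    public BackgroundJobService(
        IDbContextFactory<BookStoreContext> contextFactory,
        IFileHandleService fileService,
        IOcrService ocrService,
        ILogger<BackgroundJobService> logger)
    {
        _contextFactory = contextFactory;
        _fileService = fileService;
        _ocrService = ocrService;
        _logger = logger;
    }

    public async Task RunOCRJob()
    {
        // This context is used only for the selection query, which
        // completes before any of the per-chapter tasks start.
        await using var context = _contextFactory.CreateDbContext();
        var pending = await context.BookChapters
            .Where(c => !c.IsOcrRequested && c.Book.ChapterExtracted)
            .OrderBy(c => c.Id)
            .Take(20)
            .Select(c => new { c.Id, c.Path })
            .ToListAsync();

        using var throttle = new SemaphoreSlim(MaxParallelJobs);
        var tasks = pending.Select(async p =>
        {
            await throttle.WaitAsync();
            try { await RunBackgroundJob(new ChapterOCRViewModel(p.Id, p.Path)); }
            finally { throttle.Release(); }
        }).ToList();

        await Task.WhenAll(tasks);
    }

    public async Task RunBackgroundJob(ChapterOCRViewModel chapter)
    {
        try
        {
            var chapterFilePath = _fileService.GetChapterFilePath(chapter.Path);
            var pageContent = await _ocrService.ReadTextFromPdfAsync(chapterFilePath);

            // A fresh context per task: no instance is shared between tasks.
            await using var context = _contextFactory.CreateDbContext();
            var chapterEntity = await context.BookChapters
                .Include(c => c.Content)
                .FirstOrDefaultAsync(c => c.Id == chapter.Id);
            if (chapterEntity == null) return;

            chapterEntity.Content ??= new ChapterContent();
            chapterEntity.Content.Text = pageContent;
            chapterEntity.OcrAt = DateTime.UtcNow;
            chapterEntity.Status = StatusEnum.ScheduledIndex;
            chapterEntity.IsOcrRequested = true;

            // Each task saves its own changes through its own context.
            await context.SaveChangesAsync();
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error while running OCR for chapter {ChapterId}", chapter.Id);
        }
    }
}
```

Note that the edited RunBackgroundJob in the question still takes a BookStoreContext parameter. If the single context created in RunOCRJob is what gets passed in, all tasks keep sharing one instance, which would explain why the error did not go away.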

A few notes on a potential hosted service implementation. Hosted services are registered as singletons, with all the consequences. Prefer not to inject non-singleton dependencies directly into a background job implemented as a hosted service (as far as I remember, scoped ones should not even be allowed). Since the hosted service is a singleton, such dependencies become captive ones, which in the case of a DbContext can lead to a lot of problems with correctness and performance. Instead, inject IServiceScopeFactory and resolve the required services on a per-iteration basis; check out this answer.
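A hosted service along those lines could look roughly like the sketch below. It assumes the BackgroundService base class from Microsoft.Extensions.Hosting and a System.Threading.Channels channel as the queue. OcrQueue and OcrWorker are hypothetical names, the capacity of 100 is arbitrary, and the processing body simply reuses the types from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

// Hypothetical singleton queue; controllers write job descriptors to it.
public sealed class OcrQueue
{
    private readonly Channel<ChapterOCRViewModel> _channel =
        Channel.CreateBounded<ChapterOCRViewModel>(100); // arbitrary capacity

    public ValueTask EnqueueAsync(ChapterOCRViewModel job, CancellationToken ct = default)
        => _channel.Writer.WriteAsync(job, ct);

    public IAsyncEnumerable<ChapterOCRViewModel> DequeueAllAsync(CancellationToken ct)
        => _channel.Reader.ReadAllAsync(ct);
}

// Hypothetical worker; it is a singleton, so it injects only singleton-safe services.
public sealed class OcrWorker : BackgroundService
{
    private readonly OcrQueue _queue;
    private readonly IServiceScopeFactory _scopeFactory;
    private readonly ILogger<OcrWorker> _logger;

    public OcrWorker(OcrQueue queue, IServiceScopeFactory scopeFactory, ILogger<OcrWorker> logger)
    {
        _queue = queue;
        _scopeFactory = scopeFactory;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Jobs are handled one at a time, so concurrency is bounded by design.
        await foreach (var job in _queue.DequeueAllAsync(stoppingToken))
        {
            try
            {
                // New scope per iteration: scoped services (including the
                // DbContext) are created here and disposed with the scope.
                using var scope = _scopeFactory.CreateScope();
                var services = scope.ServiceProvider;
                var context = services.GetRequiredService<BookStoreContext>();
                var fileService = services.GetRequiredService<IFileHandleService>();
                var ocrService = services.GetRequiredService<IOcrService>();

                var path = fileService.GetChapterFilePath(job.Path);
                var text = await ocrService.ReadTextFromPdfAsync(path);

                var chapter = await context.BookChapters
                    .Include(c => c.Content)
                    .FirstOrDefaultAsync(c => c.Id == job.Id, stoppingToken);
                if (chapter == null) continue;

                chapter.Content ??= new ChapterContent();
                chapter.Content.Text = text;
                chapter.OcrAt = DateTime.UtcNow;
                chapter.Status = StatusEnum.ScheduledIndex;
                chapter.IsOcrRequested = true;

                await context.SaveChangesAsync(stoppingToken);
            }
            catch (Exception ex) when (ex is not OperationCanceledException)
            {
                _logger.LogError(ex, "Error while running OCR for chapter {ChapterId}", job.Id);
            }
        }
    }
}

// Registration (for example in Program.cs):
//   builder.Services.AddSingleton<OcrQueue>();
//   builder.Services.AddHostedService<OcrWorker>();
```

With this in place the controller action would only enqueue descriptors through OcrQueue.EnqueueAsync and return, instead of awaiting the OCR work inside the request.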

Upvotes: 0

Ruikai Feng

Reputation: 11896

DbContext is not thread-safe; see the related documentation:

Using dependency injection, this can be achieved by either registering the context as scoped, and creating scopes (using IServiceScopeFactory) for each thread, or by registering the DbContext as transient (using the overload of AddDbContext which takes a ServiceLifetime parameter).
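The two options from that passage might be sketched as follows. This is only an illustration: UseSqlServer and connectionString are assumptions (the question does not say which provider is used), DbContextLifetimeExamples and its method names are hypothetical, and the chapter Id is assumed to be an int.

```csharp
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

public static class DbContextLifetimeExamples
{
    // Option 1: register the context as transient, so every resolution
    // from the container yields a new instance.
    public static void RegisterTransient(IServiceCollection services, string connectionString)
    {
        services.AddDbContext<BookStoreContext>(
            options => options.UseSqlServer(connectionString), // provider is an assumption
            contextLifetime: ServiceLifetime.Transient);
    }

    // Option 2: keep the default scoped registration and create one
    // scope (and therefore one context) per unit of parallel work.
    public static async Task MarkOcrRequestedAsync(IServiceScopeFactory scopeFactory, int chapterId)
    {
        using var scope = scopeFactory.CreateScope();
        var context = scope.ServiceProvider.GetRequiredService<BookStoreContext>();

        var chapter = await context.BookChapters
            .FirstOrDefaultAsync(c => c.Id == chapterId);
        if (chapter == null) return;

        chapter.IsOcrRequested = true;
        await context.SaveChangesAsync();
    }
}
```

A transient registration helps only if each task resolves its own instance. A service that receives one context through its constructor and stores it in a field still shares that single instance across all the tasks it starts.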

Upvotes: 0
