Chibako
Chibako

Reputation: 49

using DbContext in Task does not work properly and only works once and crash if used twice

i am working on a Winui/c# app with EF core and sqlite db. i am searching in a db with multiple tables every tables contains 9M records (9,000,000).

for this heavy search i used Task.Run and async/await. now there is an issue, when i search for a data i can see that app is working fine and result is shown. but if i do another search i get some exception:

Exception thrown: 'System.InvalidOperationException' in WinRT.Runtime.dll Exception thrown: 'System.ArgumentOutOfRangeException' in WinRT.Runtime.dll

i puted all codes into Dispatcher.TryEnqueue and issue fixed, however Task.Run does not have any effect and ui is blocking.

so how i can fix this? it seems that db can not work properly in Task.Run

await Task.Run(async () =>
 {
     await PerformSearchAsync(query, progress, cancellationTokenSource.Token);
 });

private Dictionary<TableType, Func<DbContext, IQueryable<object>>> tableMappings = new Dictionary<TableType, Func<DbContext, IQueryable<object>>>
{
    { TableType.T935, db => db.Set<Person935>() },
    { TableType.T936, db => db.Set<Person936>() },
    { TableType.T937, db => db.Set<Person937>() },
    { TableType.T938, db => db.Set<Person938>() },
    { TableType.T939, db => db.Set<Person939>() },
    { TableType.T93033, db => db.Set<Person93033>() }
};

public async Task PerformSearchAsync(string query, IProgress<int> progress, CancellationToken cancellationToken)
{
    var db = new IFDBDbContext();

    var tableQueries = new List<IAsyncEnumerable<object>>();
    
    var tableType = GetTableType(SearchQuery);

    if (tableMappings.TryGetValue(tableType, out var dbQuery))
    {
        var dbSet = dbQuery(db);
        tableQueries.Add(GetSearchResultsAsync(dbSet, query, cancellationToken));
    }

    // Track progress
    int totalTables = tableQueries.Count;
    int completedTables = 0;
    int completedItems = 0;

    foreach (var tableQuery in tableQueries)
    {
        await foreach (var item in tableQuery.WithCancellation(cancellationToken))
        {
            completedItems++;

            dispatcherQueue.TryEnqueue(() =>
            {
                DataList.Add(item);
                _tmpCompletedItems = completedItems;
            });
        }

        completedTables++;
        progress.Report((completedTables * 100) / totalTables);
    }

    dispatcherQueue.TryEnqueue(() =>
    {
        ShowStatus(DataList.Count);
    });
}

private async IAsyncEnumerable<object> GetSearchResultsAsync(IQueryable<object> dbSet, string query, [EnumeratorCancellation] CancellationToken cancellationToken)
{
    await foreach (var item in dbSet
        .Where(x => EF.Property<string>(x, nameof(BasePerson.Mobile)) != null && EF.Property<string>(x, nameof(BasePerson.Mobile)).Contains(query))
        .AsAsyncEnumerable()
        .WithCancellation(cancellationToken))
    {
        yield return item;
    }
}

Upvotes: 0

Views: 98

Answers (1)

Srinivasan Jayakumar
Srinivasan Jayakumar

Reputation: 72

I have written many such code like this for optimization. I don't have the whole code to analyze, I would suggest to make the following changes.

  1. First and foremost, split individual table to return batches of daat like 50K records per fetch. To do this by wirting a method that will return Task Tuple<string, List<Tuple<batch query, List>>. This was you can split the 9M records in to batches and get the result in list. For this you need a Identifier colum to split the data in the table to batches.

This is just a sample of how I see the code might be bit off but I hope you can understnd the logic

Classes

public class Records
{
}
public class QueryResult
{
    public QueryResult(string query, List<Records> results)
    {
        this.QueryToExecute = query;
        this.Results = results;
    }
    public string QueryToExecute { get; set; }
    public List<Records> Results { get; set; } = new List<Records>();

    public string FailedMessage { get; set; } = string.Empty;
}

Function

public async void Hello()
{
    var queryDict = new ConcurrentDictionary<string, List<QueryResult>>();
    queryDict.TryAdd("Table1", new List<QueryResult>()
    {
        new QueryResult("Select XXX FROM Table1 WHERE ID>1 AND ID<50000", new List<Records>()),
        new QueryResult("Select XXX FROM Table1 WHERE ID>50001 AND ID<100000", new List<Records>()),
        new QueryResult("Select XXX FROM Table1 WHERE ID>100000 AND ID<150000", new List<Records>()),
        new QueryResult("Select XXX FROM Table1 WHERE ID>150001 AND ID<200000", new List<Records>()),
    });
    queryDict.TryAdd("Table2", new List<QueryResult>()
    {
        new QueryResult("Select XXX FROM Table2 WHERE ID>1 AND ID<50000", new List<Records>()),
        new QueryResult("Select XXX FROM Table2 WHERE ID>50001 AND ID<100000", new List<Records>()),
        new QueryResult("Select XXX FROM Table2 WHERE ID>100000 AND ID<150000", new List<Records>()),
        new QueryResult("Select XXX FROM Table2 WHERE ID>150001 AND ID<200000", new List<Records>()),
    });

    foreach (var record in queryDict.Values) 
    {
        foreach (var item in record)
        {
            await Task.WhenAll(
            Task.Run(() => RunQuery(item)),
            Task.Run(() => RunQuery(item)));
        }
    }
}

private void RunQuery(QueryResult queryToExecute)
{
    // Execute queryusing EF
}
  1. Once the data is split have a loop to run the batches per table using Task.WhenAll()
  2. Always create the DB Context one for each of the Table and dispose it when all the data is read.

Upvotes: 0

Related Questions