How to improve throughput by parallelizing database access?

Question

I'm collecting data from MongoDB using the following method:

// JobInfoRecord is an assumed name for the record type.
public IEnumerable<JobInfoRecord> GetMatchingJobs(JobInfoFilterParameters filterParameters, FetchOptions fetchOptions)
{
    var filter = CreateFilterDefinition(filterParameters);
    var options = CreateFindOptions(fetchOptions, false);

    return MongoDatabaseStorageService.WithRecords<JobInfoRecord, IEnumerable<JobInfoRecord>>
    (collection => collection.FindAsync(filter, options).Result.ToListAsync().Result);
}

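/// <summary>
/// Performs an operation on the database, automatically disconnecting and retrying
/// if the database connection drops.
/// </summary>
/// <param name="operation">The operation to perform on the collection.</param>
public TResult WithRecords<T, TResult>(Func<IMongoCollection<T>, TResult> operation)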
{
    try { }
    finally { _lock.EnterUpgradeableReadLock(); }
    try
    {
        return WithRecordsInternal(operation);
    }
    finally
    {
        _lock.ExitUpgradeableReadLock();
    }
}

private TResult WithRecordsInternal<T, TResult>(Func<IMongoCollection<T>, TResult> operation)
{
    try
    {
        return operation(GetCollection<T>());
    }
    catch (IOException)
    {
        // The Mongo driver sometimes throws an IOException instead of attempting to reconnect.
        // Repeating the operation once forces the driver to try to reconnect.
        return operation(GetCollection<T>());
    }
}

I am unsure about calling async operations such as FindAsync() and ToListAsync() and then blocking on them with .Result.

How can I improve performance (or throughput, by parallelizing database access) with async-await, and what is the correct pattern for using async here if my current code is wrong?

usr · Accepted Answer

You cannot improve database access throughput with async IO. All this does is change the way the call is initiated and completed. The exact same data is transmitted over the network.

You might be able to improve throughput by parallelizing database access but that is independent of async IO.
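As a rough illustration of what parallelizing could look like, here is a minimal sketch that runs several independent queries with a cap on how many are in flight at once. The names ParallelQueries, RunAsync and FakeQuery are hypothetical and not part of the question's code; FakeQuery only simulates latency, and a real query would call the driver instead. Whether this helps at all depends on whether the server and network have spare capacity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelQueries
{
    // Runs the query delegates with at most maxConcurrency in flight.
    // Results come back in the same order as the input delegates.
    public static async Task<TResult[]> RunAsync<TResult>(
        IEnumerable<Func<Task<TResult>>> queries, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = queries.Select(async query =>
            {
                await gate.WaitAsync();      // wait for a free slot
                try { return await query(); }
                finally { gate.Release(); }  // free the slot even on failure
            }).ToList();                     // ToList starts each task exactly once

            // WhenAll completes only after every task has finished,
            // so the semaphore is not disposed while still in use.
            return await Task.WhenAll(tasks);
        }
    }
}

public static class Demo
{
    // Hypothetical stand-in for one database query.
    private static async Task<int> FakeQuery(int id)
    {
        await Task.Delay(100);
        return id * 10;
    }

    public static async Task Main()
    {
        var queries = Enumerable.Range(1, 8)
            .Select(id => new Func<Task<int>>(() => FakeQuery(id)));

        int[] results = await ParallelQueries.RunAsync(queries, maxConcurrency: 4);
        Console.WriteLine(string.Join(", ", results)); // prints "10, 20, 30, 40, 50, 60, 70, 80"
    }
}
```

The same fan-out could be done with plain threads and synchronous calls; the parallelism, not the async machinery, is what may raise throughput.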

collection.FindAsync(filter, options).Result.ToListAsync().Result

Here you get the worst possible performance: higher call overhead from the async machinery, followed by blocking, which is costly as well. If this were a good idea, the library would apply this pattern internally for you.
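To make the alternative concrete, here is a hedged sketch of the two consistent options: fully synchronous, or asynchronous all the way up. It assumes a MongoDB.Driver version that ships the synchronous API (FindSync and the ToList cursor extension). JobQueries, FetchSync and FetchAsync are hypothetical names, not the asker's code.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Driver;

public static class JobQueries
{
    // Option 1: fully synchronous. No Task is created only to be blocked on.
    public static List<T> FetchSync<T>(
        IMongoCollection<T> collection,
        FilterDefinition<T> filter,
        FindOptions<T, T> options)
    {
        using (var cursor = collection.FindSync(filter, options))
        {
            return cursor.ToList();
        }
    }

    // Option 2: asynchronous all the way. The caller awaits the returned task
    // instead of reading .Result, so no thread is blocked while waiting.
    public static async Task<List<T>> FetchAsync<T>(
        IMongoCollection<T> collection,
        FilterDefinition<T> filter,
        FindOptions<T, T> options)
    {
        using (var cursor = await collection.FindAsync(filter, options))
        {
            return await cursor.ToListAsync();
        }
    }
}
```

Note that option 2 would not drop into WithRecords unchanged: ReaderWriterLockSlim has thread affinity, so it cannot safely be held across an await. An async version would need a different lock, for example one built on SemaphoreSlim.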
