JED
JED

Reputation: 1694

async calls in foreach loops

More of a concept question... Is there any reason it would be a bad idea to make async calls to my DataContext within a foreach loop such as this?

private async Task ProcessItems(List<Item> items)
{
    var modifiedItems = new List<modifiedItem>();

    foreach (var item in items)
    {
        // **edited to reflect link between items and properties**
        // var properties = await _context.Properties
        //   .Where(p => p.Condition == true).ToListAsync();
        var properties = await _context.Properties
          .Where(p => p.Condition == item.Condition).ToListAsync();

        foreach (var property in properties)
        {
            // do something to modify 'item'
            // based on the value of 'property'
            // save to variable 'modifiedItem'

            modifiedItems.Add(modifiedItem)
        }
    }

    await _context.ModifiedItems.AddRangeAsync(modifiedItems);
    await _context.SaveChangesAsync();
}

Since the inner foreach loop is dependent upon the properties variable, does it not start until the properties variable has been fully instantiated?

Being that the modifiedItems variable is declared outside of the parent foreach loop, is it a bad idea to asynchronously add each modifiedItem to the modifiedItems list?

Are there any properties/methods in Entity Framework, Linq, whatever that would be better suited for this kind of task? Better idea than doing embedded foreach loop?

(In case anyone wants some context... IRL, items is a list of readings from sensors. And properties are mathematical equations to convert the raw readings to meaningful data such as volume and weight in different units... then those calculated data points are being stored in a database.)

Upvotes: 2

Views: 2919

Answers (4)

Harald Coppoolse
Harald Coppoolse

Reputation: 30454

Yes, this would be a bad idea.

The method you propose, performs one database query per item in your sequence of items. Apart from that this is less efficient, it also might result in an incoherent result set. Someone might have changed the database between your first and your last query, so the first returned object belongs to a different database state than the result from the last query.

A simplified example: Let's query persons, with the name of the company they work for:

Table of Companies:
Id    Name
 1    "Better Bakery"
 2    "Google Inc"

Table of Persons
Id    Name    CompanyId
 1    "John"     1
 2    "Pete"     1

Suppose I want the Id / Name / CompnanyName of Persons with Id {1, 2}

Between your first and your second query, someone changes the name of the bakery into Best Bakery Your result would be:

PersonId  Name     CompanyName
    1    "John"   "Better Bakery"
    2    "Pete"   "Best Bakery"

Suddenly John and Pete work for different companies?

And what would be the result if you'd ask for Persons with Id {1, 1}? The same person would work for two different companies?

Surely this is not what you want!

Beside that the result would be incorrect, your database can improve it's queries if it knows that you want the result of several items. The database creates temporary tables to calculate the result. These tables would have to be generated only once.

So if you want one consistent set of data that represents the state of the data at the moment of your query, create your data using one query:

The same result in one database query:

var itemConditions = items.Select(item => item.Condition);

IQueryable<Property> queryProperties = dbContext.Properties
    .Where(property => itemConditions.Contains(property.Condition);

In words: From every item in your input sequence of Items (in your case a list), take the propertyCondition`.

Then query the database table Properties. Keep only those rows in this table that have a Condition that is one of the Condidions in itemConditions.

Note that no query or enumeration has been performed yet. The database has not been accessed. Only the Expression in the IQueryable has been modified.

To perform the query:

List<Property> fetchedProperties = await queryProperties.ToListAsync();

If you want, you could concatenate these statements into one big LINQ statement. I doubt whether this would improve performance, it surely would deteriorate readability, and thus testability / maintainability.

Once you've fetched all desired properties in one database query, you can change them:

foreach (Property fetchedProperty in fetchedProperties)
{
      // change some values
}
await dbContext.SaveChangesAsync();

As long as you keep your dbContext alive, it will remember the state of all fetched rows from all tables in dbContext.ChangeTracker.Entries. This is a sequence of DbEntityEntries. One object per fetched row. Every DbEntityEntry holds the original fetched value and the modified value. They are used to identify which items are changed and thus need to be saved in one database transaction during Savechanges()

Hence it is seldom needed to actively notify your DbContext that you changed a fetched object

The only use case I see, is when you want to change an object without first adding or fetching it, which might be dangerous, as some of the properties of the object might be changed by someone else.

You can see this in

Upvotes: 0

Damien_The_Unbeliever
Damien_The_Unbeliever

Reputation: 239646

In this specific case, I'd instead write it to load a single list of all properties for all sensors of interest (considering all items and based on you macAddress/sensorKey properties you've mentioned), and store that in a list. We'll call that allProperties. We await that once and avoid making repeated database calls.

Then, use LINQ To Objects to join your items to the objects in allProperties that they match. Iterate the results of that join and there's no await to do inside the loop.

Upvotes: 1

Barr J
Barr J

Reputation: 10929

No there is none, however you miss some concept here.

Since you use an async method, the ProcessItems method should be called ProcessItemsAsync and return a task.

this would be of use to you: Async/Await - Best Practices in Asynchronous Programming

depends on your needs, it is recommended to add CancellationToken and consider exception handling as well, just becareful to not swallow exceptions.

Upvotes: 4

Richard Hubley
Richard Hubley

Reputation: 2320

There is no issue here using async, once you await an async call the returned object is the same.

You may want to rethink executing a DB call in a foreach if you can run it once outside the loop, and filter the data in memory to act on it. Each use case is different and you have to make sure the larger return set can be handled in memory.

Usually getting a 1000 rows from a DB once is faster than 100 rows 10 times.

Upvotes: 1

Related Questions