Reputation: 1694
More of a concept question... Is there any reason it would be a bad idea to make async calls to my DataContext within a foreach loop such as this?
private async Task ProcessItems(List<Item> items)
{
var modifiedItems = new List<modifiedItem>();
foreach (var item in items)
{
// **edited to reflect link between items and properties**
// var properties = await _context.Properties
// .Where(p => p.Condition == true).ToListAsync();
var properties = await _context.Properties
.Where(p => p.Condition == item.Condition).ToListAsync();
foreach (var property in properties)
{
// do something to modify 'item'
// based on the value of 'property'
// save to variable 'modifiedItem'
modifiedItems.Add(modifiedItem)
}
}
await _context.ModifiedItems.AddRangeAsync(modifiedItems);
await _context.SaveChangesAsync();
}
Since the inner foreach loop is dependent upon the properties
variable, does it not start until the properties
variable has been fully instantiated?
Being that the modifiedItems
variable is declared outside of the parent foreach loop, is it a bad idea to asynchronously add each modifiedItem
to the modifiedItems
list?
Are there any properties/methods in Entity Framework, Linq, whatever that would be better suited for this kind of task? Better idea than doing embedded foreach loop?
(In case anyone wants some context... IRL, items
is a list of readings from sensors. And properties
are mathematical equations to convert the raw readings to meaningful data such as volume and weight in different units... then those calculated data points are being stored in a database.)
Upvotes: 2
Views: 2919
Reputation: 30454
Yes, this would be a bad idea.
The method you propose, performs one database query per item
in your sequence of items
. Apart from that this is less efficient, it also might result in an incoherent result set. Someone might have changed the database between your first and your last query, so the first returned object belongs to a different database state than the result from the last query.
A simplified example: Let's query persons, with the name of the company they work for:
Table of Companies:
Id Name
1 "Better Bakery"
2 "Google Inc"
Table of Persons
Id Name CompanyId
1 "John" 1
2 "Pete" 1
Suppose I want the Id
/ Name
/ CompnanyName
of Persons
with Id {1, 2}
Between your first and your second query, someone changes the name of the bakery into Best Bakery Your result would be:
PersonId Name CompanyName
1 "John" "Better Bakery"
2 "Pete" "Best Bakery"
Suddenly John and Pete work for different companies?
And what would be the result if you'd ask for Persons
with Id {1, 1}? The same person would work for two different companies?
Surely this is not what you want!
Beside that the result would be incorrect, your database can improve it's queries if it knows that you want the result of several items. The database creates temporary tables to calculate the result. These tables would have to be generated only once.
So if you want one consistent set of data that represents the state of the data at the moment of your query, create your data using one query:
The same result in one database query:
var itemConditions = items.Select(item => item.Condition);
IQueryable<Property> queryProperties = dbContext.Properties
.Where(property => itemConditions.Contains(property.Condition);
In words: From every item
in your input sequence of Items (in your case a list), take the property
Condition`.
Then query the database table Properties
. Keep only those rows in this table that have a Condition
that is one of the Condidions
in itemConditions
.
Note that no query or enumeration has been performed yet. The database has not been accessed. Only the Expression
in the IQueryable
has been modified.
To perform the query:
List<Property> fetchedProperties = await queryProperties.ToListAsync();
If you want, you could concatenate these statements into one big LINQ statement. I doubt whether this would improve performance, it surely would deteriorate readability, and thus testability / maintainability.
Once you've fetched all desired properties in one database query, you can change them:
foreach (Property fetchedProperty in fetchedProperties)
{
// change some values
}
await dbContext.SaveChangesAsync();
As long as you keep your dbContext
alive, it will remember the state of all fetched rows from all tables in dbContext.ChangeTracker.Entries
. This is a sequence of DbEntityEntries
. One object per fetched row. Every DbEntityEntry
holds the original fetched value and the modified value. They are used to identify which items are changed and thus need to be saved in one database transaction during Savechanges()
Hence it is seldom needed to actively notify your DbContext
that you changed a fetched object
The only use case I see, is when you want to change an object without first adding or fetching it, which might be dangerous, as some of the properties of the object might be changed by someone else.
You can see this in
Upvotes: 0
Reputation: 239646
In this specific case, I'd instead write it to load a single list of all properties for all sensors of interest (considering all items
and based on you macAddress
/sensorKey
properties you've mentioned), and store that in a list. We'll call that allProperties
. We await
that once and avoid making repeated database calls.
Then, use LINQ To Objects to join your items
to the objects in allProperties
that they match. Iterate the results of that join and there's no await
to do inside the loop.
Upvotes: 1
Reputation: 10929
No there is none, however you miss some concept here.
Since you use an async method, the ProcessItems
method should be called ProcessItemsAsync
and return a task
.
this would be of use to you: Async/Await - Best Practices in Asynchronous Programming
depends on your needs, it is recommended to add CancellationToken
and consider exception handling as well, just becareful to not swallow exceptions.
Upvotes: 4
Reputation: 2320
There is no issue here using async, once you await an async call the returned object is the same.
You may want to rethink executing a DB call in a foreach if you can run it once outside the loop, and filter the data in memory to act on it. Each use case is different and you have to make sure the larger return set can be handled in memory.
Usually getting a 1000 rows from a DB once is faster than 100 rows 10 times.
Upvotes: 1