Reputation: 1522
I have two list objects,Products and Prices ,
var products = GetProdusts(predicate);
var prices = GetAllProductPrices();
For fill Product.Prices with prices object :
Parallel.ForEach(products ,prod=> {
prod.Prices= prices?.Where(pr =>pr.ProductId == prod.Id)?.ToList();
});
This loop take a long time. Can anyone help me how to improve this.
Update:
Base on @Holger comment:
var priceDics = prices.GroupBy(p=>p.ProductId).ToDictionary(p=>p.Key ,p=>p.ToList());
Parallel.ForEach(products ,prod=> {
prod.Prices= priceDics?.Where(pr =>pr.Key== prod.Id)?.SelectMany(x=>x.Value).ToList();
});
and now performance is very improved.
Upvotes: 0
Views: 360
Reputation: 43399
Here is a simple solution that uses a Lookup
. It should be so fast that parallelism can't make it any faster.
var lookup = prices.ToLookup(p => p.ProductId);
foreach (var product in products)
{
product.Prices = lookup[product.Id]?.ToList();
}
There is no much calculation happening inside the loop. The lookup[...]
getter is almost instantaneous, and what is left is allocating memory for the lists, and copying the contents of each Grouping
to a new List
, which is an highly efficient, one-CPU-instruction, operation.
Update: Performance test with 500,000 products and 11,000,000 prices:
class Price { public int ProductId; }
class Product { public int Id; public List<Price> Prices; }
var products = Enumerable.Range(1, 500_000)
.Select(n => new Product() { Id = n }).ToList();
var prices = Enumerable.Range(1, 11_000_000)
.Select(n => new Price { ProductId = n % products.Count }).ToList();
var stopwatch = Stopwatch.StartNew();
var lookup = prices.ToLookup(p => p.ProductId);
Console.WriteLine($"Duration Lookup: {stopwatch.ElapsedMilliseconds:#,0} msec");
foreach (var product in products)
{
product.Prices = lookup[product.Id]?.ToList();
}
Console.WriteLine($"Duration Total: {stopwatch.ElapsedMilliseconds:#,0} msec");
Output:
Duration Lookup: 4,051 msec
Duration Total: 4,695 msec
The slowest part of the process is populating the Lookup
, which is not parallelizable. The final loop of assigning prices to products can be parallelized, but it takes less than a second.
Upvotes: 2