user1477388

Reputation: 21430

Regarding LINQ Usage in Large Loops

I am wondering what is recommended in the following scenario:

I have a large collection that I loop over to look up an ID, which I then store in the database like so:

foreach (var rate in rates)
{
    // get ID from rate name
    Guid Id = dbContext.DifferentEntity
        .Where(x => x.Name == rate.Name)
        .Select(x => x.Id) // select the Id so we get a Guid back
        .FirstOrDefault();

    // create new object with the newly discovered 
    // ID to insert into the database
    dbContext.YetAnotherEntity.Add(new YetAnotherEntity
    {
        Id = Guid.NewGuid(),
        DiffId = Id,
    });
}

Would it be better/faster to do this instead (first fetch all the DifferentEntity records into memory, rather than querying for each one separately)?

List<DifferentEntity> differentEntities = dbContext.DifferentEntity.ToList();

foreach (var rate in rates)
{
    // get ID from rate name
    Guid Id = differentEntities
        .Where(x => x.Name == rate.Name)
        .Select(x => x.Id)
        .FirstOrDefault();

    // create new object with the newly discovered 
    // ID to insert into the database
    dbContext.YetAnotherEntity.Add(new YetAnotherEntity
    {
        Id = Guid.NewGuid(),
        DiffId = Id,
    });
}

Is the difference negligible or is this something I should consider? Thanks for your advice.

Upvotes: 0

Views: 129

Answers (3)

flindeberg

Reputation: 5007

Run them horses! There is really a lot we do not know. Is it possible to keep all the entities in memory? How many of them are duplicates with respect to Name?

A simple solution with a single fetch from the database, plus some parallelism:

// Fetch the entities once and build a name -> Id lookup.
// (Materialize with AsEnumerable first: a custom equality comparer
// will not translate to SQL, so de-duplicate by Name in memory.)
var entitiesDict = dbContext.DifferentEntity
    .AsEnumerable()
    .GroupBy(e => e.Name)
    .ToDictionary(g => g.Key, g => g.First().Id);

// Create the new ones real quick and divide into groups of 500
// (cause that horse wins in my environment with complex entities;
// maybe 5 000 or 50 000 fits your scenario better since yours are not that complex?)
var newEnts = rates.AsParallel().AsOrdered() // keep order so grouping by index works
    .Select((rate, index) => new
    {
        Value = new YetAnotherEntity
            { Id = Guid.NewGuid(), DiffId = entitiesDict[rate.Name] },
        Index = index
    })
    .GroupAdjacent(anon => anon.Index / 500) // integer division, and note GroupAdjacent from MoreLINQ! (not GroupBy)
    .Select(group => group.Select(anon => anon.Value)); // unwrap so we get batches of entities

// Now we have to add them to the database, one context per batch
Parallel.ForEach(newEnts, ents =>
{
    using (var db = new DBCONTEXT()) // your dbcontext type here
    {
        foreach (var ent in ents)
            db.YetAnotherEntity.Add(ent);

        db.SaveChanges();
    }
});

In general in database scenarios, the expensive operations are the fetches and the commits, so try to keep both to a minimum.
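For example, a minimal sketch reusing the names from the question (newEntities here stands in for a hypothetical list of already-built entities): calling SaveChanges once for the batch means one commit instead of N.

// One commit for the whole batch:
foreach (var ent in newEntities)
    dbContext.YetAnotherEntity.Add(ent);

dbContext.SaveChanges(); // single round-trip commit

// ...instead of committing per row:
// foreach (var ent in newEntities)
// {
//     dbContext.YetAnotherEntity.Add(ent);
//     dbContext.SaveChanges(); // one commit per entity, N round-trips
// }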

Upvotes: 1

dotNET

Reputation: 35380

Store your rate names in a sorted string array (string[]) instead of a List or Collection, then use Array.BinarySearch() to make each lookup much faster. The rest of what I was going to write has already been written by @Felipe above.
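A minimal sketch of that idea, reusing the entity and property names from the question (the parallel-array sort is an illustrative assumption, not prescribed):

// Materialize name/Id pairs once, sort by name, then binary-search per rate
var pairs = dbContext.DifferentEntity
    .Select(x => new { x.Name, x.Id })
    .ToList();

string[] names = pairs.Select(p => p.Name).ToArray();
Guid[] ids = pairs.Select(p => p.Id).ToArray();
Array.Sort(names, ids); // keeps the two arrays aligned while sorting by name

foreach (var rate in rates)
{
    int i = Array.BinarySearch(names, rate.Name); // O(log n) per lookup
    if (i < 0) continue; // name not found

    dbContext.YetAnotherEntity.Add(new YetAnotherEntity
    {
        Id = Guid.NewGuid(),
        DiffId = ids[i],
    });
}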

Upvotes: 2

Felipe Oriani

Reputation: 38598

You can decrease the number of queries you run against the database. For example, take all the names and run a single query that finds the Ids whose names are contained in that list.

Try something like this.

// get all names you have in rates list...
var rateNames = rates.Select(x => x.Name).ToList();

// query all the Ids you need where the name is contained in the list... 1 query, 1 column (Id, I imagine)
var Ids = dbContext.DifferentEntity
    .Where(x => rateNames.Contains(x.Name))
    .Select(x => x.Id)
    .ToList();

// loop over the Ids result and add them one by one
foreach(var id in Ids)
    dbContext.YetAnotherEntity.Add(new YetAnotherEntity
    {
        Id = Guid.NewGuid(),
        DiffId = id,
    });

Upvotes: 0
