jocull
jocull

Reputation: 21095

Simplifying complex include joins in Entity Framework

Right now I have a nasty complex query that I need to load in a bulk of data for processing. The data nesting runs pretty deep...

public List<agZone> agZone_GetZonesWithProjectionInformation(IEnumerable<int> ids, bool includeHidden)
{
    var zones = this.BaseDB.agZones.Where(x => (includeHidden || includeHidden == x.agField.Hidden)
                                && x.agField.WeatherSourceID.HasValue
                                && x.agField.WeatherProjectionID.HasValue
                                && x.agField.SeedID.HasValue
                                && ids.Contains(x.ZoneID))
                        .Include(x => x.agField)
                        .Include(x => x.agField.agSeed)
                        .Include(x => x.agField.agSeed.agSeedCompany)
                        .Include(x => x.agField.agSeed.agSeedConfigs)
                        .Include(x => x.agField.agSeed.agSeedGrowthStages)
                        .Include(x => x.agField.agSeed.agSeedGrowthStages.Select(y => y.agGrowthStage))
                        .Include(x => x.agField.agFieldGrowthStages)
                        .Include(x => x.agField.agFarm)
                        .Include(x => x.agField.agFarm.agGrower)
                        .Include(x => x.agField.agFarm.agGrower.agUsers)
                        .Include(x => x.agZoneNitrogenApplications)
                        .Include(x => x.agZoneWaterApplications)
                        .Include(x => x.agSoilSymbol)
                        .Include(x => x.agSoilSymbol.agSoilConfigs)
                        .AsNoTracking();

    return zones.ToList();
}

So far I'm aware of two bad things going on here:

  1. The query uses .Contains(...) - Entity Framework 5 supposedly isn't able to compile or cache these queries - so they have to be regenerated every time. That sucks, but I don't have a way around it right now. Even worse, while the query is generated EF seems to block other threads waiting to query. This is really terrible.
  2. The includes greatly increase the query complexity and the amount of data returned because of the repeating master details. The generated SQL command is 2100+ lines long and the execution plan is straight from hell.

There's too much data to lazy load (besides that being just generally terrible) and I'm not aware of a way to make multiple eager loading trips on my zones variable.

Do I need to make multiple queries and then squish the data together myself with foreach loops and/or LINQ? I really do need to have all of this data loaded into memory or my calculations will run terribly. So far this is actually the fastest I can make it. Thanks!

Upvotes: 1

Views: 872

Answers (2)

Dennis Traub
Dennis Traub

Reputation: 51634

An alternative might be to create a SQL view on the database and do a simple EF mapping against that. This alone can largely reduce the performance impact.

You might even want to consider creating a denormalized projection in a separate table or its own data store. Either on the fly whenever related data has been modified, or as a batch job every now and then, depending on how often and how timely you need this data. CQRS could be a good fit for this specific issue.

Upvotes: 1

noviKorisnik
noviKorisnik

Reputation: 28

Do I need to make multiple queries and then squish the data together myself with foreach loops and/or LINQ?

Yes. It could help a lot. Many collections produce large joins which affect execution time. I had the same problem and it is solved with filling collections with additional calls.

Upvotes: 0

Related Questions