C1X
C1X

Reputation: 805

Entity Framework: retrieve row that is not in a Date range

I have this query:

var query = _repository.GetAllIncluding(x => x.ContractRow, x => x.ContractRow.Contract)
                       .Where(x => (int)x.ContractRow.PeriodicityType == (int)CommonConsts.PeriodicityType.Yearly && 
                                   x.ContractRow.Contract.Date.AddYears(-(x.ContractRow.Period.Value / 2)) > x.DueDate
                                   || x.ContractRow.Contract.Date.AddYears(x.ContractRow.Period.Value / 2) < x.DueDate);

Where:

These types can't be changed.

The problem is in the AddYears() function.

If I use .AddYears(-(2 / 2)) it returns the value I expect, but if I use .AddYears(-(x.ContractRow.Period.Value / 2)), where ContractRow.Period is 2 it shows a different result. Why?

Upvotes: 0

Views: 304

Answers (1)

Steve Py
Steve Py

Reputation: 34793

First off, given that you are using DateTime.AddYears in your expression, that shouts that your repository method is returning IEnumerable<Entity> not IQueryable<Entity> and is executing the EF Linq query with .ToList(). To save you a lot of pain in the future when your database gets big, or you try a similar pattern in a larger project, you really want to avoid this. The problem with this approach is that EF is retrieving all entities with their associated ContractRow and Contract records before you even touch on a Where clause. Against a data table of any significant size with any significant number of concurrent requests, this will absolutely kill your system.

For repository patterns I recommend returning IQueryable<Entity> and avoiding calls like ToList until they are absolutely needed. So a GetAll method would look something like:

public IQueryable<Row> GetAll()
{
   var query = _context.Rows.AsQueryable();
   return query;
}

Note that we do not need to bother with Include() statements etc. Linq queries can happily reference related entities as part of the expression and EF will resolve these automatically. Projecting results using Select() will also resolve related entities. The only time you need Include() is where you specifically want to load and work with the entire entity structure. Typically this would just be Update scenarios. In that case you can add .Include() statements in the query after calling the repository method, you don't need to pass expressions to the method. It also gives you the flexibility to perform .Count(), .Any(), and pagination with .OrderBy().Skip(n).Take(m), etc. (quite simple and flexible)

As for the repository method, the above is a simple example with no base criteria. Repositories provide a good separation point for testing, but also a good base point for common, global rules such as with Soft Delete (IsActive) restrictions, and Authentication/Authorization checks. For instance if you have a soft-delete system and default to active records:

public IQueryable<Row> GetAll(bool includeInactive = false)
{
   var query = includeInactive 
      ? _context.Rows.AsQueryable()
      : _context.Rows.Where(x => x.IsActive);
   return query;
}

Where most entities don't need an includeInactive option they just return the Where(x => x.IsActive)

This will help address future performance issues, but now raises a problem you may have seen, AddYears can't be used inside an EF Linq expression. This is because EF is trying to translate your expression down into SQL, and SQL doesn't understand .AddYears. Fortunately EF has support for dealing with this: EntityFunctions

With an IQueryable<T> repository method and EntityFunctions.AddYears you would have:

var query = _repository.GetAll()
    .Where(x => (int)x.ContractRow.PeriodicityType == (int)CommonConsts.PeriodicityType.Yearly 
        && (EntityFunctions.AddYears(x.ContractRow.Contract.Date, (x.ContractRow.Period.Value/-2)) > x.DueDate
            || EntityFunctions.AddYears(x.ContractRow.Contract.Date, x.ContractRow.Period.Value / 2)) < x.DueDate));

And lastly, the probable cause of your grief: Operations, mixing AND and ORs... (You might have spotted it in the example above)

criteria = A AND B OR C
vs.
criteria = A AND (B OR C)

will produce different results. You need parenthesis around your date range check because without them, you get:

WHERE PeriodicType = Yearly AND Date > 2 years ago
   OR Date < 2 years from future (and PeriodicType can be anything it wants)

A AND B OR C == (A AND B) OR C
you want 
A AND (B OR C)

I might have started with that as the solution, but I really wanted to hopefully get across the potential performance pain point first. ;)

Upvotes: 1

Related Questions