xartal
xartal

Reputation: 457

bad performance when contains all query in entity framework

I'm currently learning Entity Framework and am trying to do a small research project. It works but the performance is really bad and i'm hoping that someone can show me some improvements i could do to speed things a little up.

There are the following Classes:

public class Tag
{
    public string Id{get;set}
    public string Tag{
    get{ return Id;}
}

public class AnEntity
{
    [...]
    public virtual ICollection<Tag> Tags{ get; set;}
}

I query the database this way:

var query = from entity in db.AnEntities
    where [...]
    select new {
        [...]
        Tags = entity.Tags.Select(tag => tag.Id).ToList()
    }

If I execute the query now with await query.ToListAsync() the performance is quite reasonable. However if i want to filter the search and return only the AnEntities that cointains all of a given set of tags, the performance is really bad. I do the "contains all query" this way:

var filters = new List<string>(){"Filter1", "Filter2"};
from filtent in query
    where filters.Intersect(filtent.Tags).Count() == filters.Count
    select filtent;

Is there a cleverer way to do this 'contains all' query?

Edit: I also thought about replace the Tags with a comma-seperated string but I'm not sure how i can run a 'contains all' query directly on the database. This is important, because after the filterquery there are some more queries and i dont want to receive entities from the database that dont contains all the given filters. Is there a way i could do it with a comma seperated string or something like that?

Many thanks for hints!

Upvotes: 1

Views: 1153

Answers (1)

Gert Arnold
Gert Arnold

Reputation: 109255

This is because EF creates a monstrous query with Intersect. It somehow has to translate the list filters into SQL and it does that by UNION-ing single-value SELECTs for each element in the list. And that happens a number of times in one query.

I've experienced that statements like Intersect, Except and sometimes All and Any are often better avoided and replaced by a solution using Contains, because Contains is always translated into an IN statement. In your case that would be

from filtent in query
where filtent.Tags.All(f => filters.Contains(f) )
select filtent

I'm sure this will produce a much better query that also performs much better. In this case, the All statement is translated into a harmless NOT EXISTS.

Upvotes: 3

Related Questions