Sebastián A
Sebastián A

Reputation: 880

Linq performance between

I am using Entity Framework and I want to know if there are any difference between the following code. The first and second code seems to get all the items an them filter by Entidad.Nombre == "Empresa", the last code seems to get only the items where Entidad.Nombre == "Empresa". Am I wrong? What is better?

var listFields = from b in unitOfWork.PropiedadRepository.Get()
                         where b.Entidad.Nombre == "Empresa"
                         select b;
var listFields2 = unitOfWork.PropiedadRepository.Get().Where(x => x.Entidad.Nombre == "Empresa");
var listFields3 = unitOfWork.PropiedadRepository.Get(x => x.Entidad.Nombre == "Empresa");

Here is the GenericRepository class. All the repositories inherit of that.

public sealed class GenericRepository<TEntity> where TEntity : class
{
    private readonly ConfigurationDbDataContext _context;
    private readonly DbSet<TEntity> _dbSet;

    public GenericRepository(ConfigurationDbDataContext context)
    {
        _context = context;
        _dbSet = context.Set<TEntity>();
    }

    public IEnumerable<TEntity> Get(
        Expression<Func<TEntity, bool>> filter = null,
        Func<IQueryable<TEntity>, IOrderedQueryable<TEntity>> orderBy = null,
        string includeProperties = "")
    {
        IQueryable<TEntity> query = _dbSet;

        if (filter != null)
        {
            query = query.Where(filter);
        }

        query = includeProperties.Split(new[] {','}, StringSplitOptions.RemoveEmptyEntries).Aggregate(query, (current, includeProperty) => current.Include(includeProperty));

        // ReSharper disable once ConvertIfStatementToReturnStatement
        if (orderBy != null)
        {
            return orderBy(query).ToList();
        }
        else
        {
            return query.ToList();
        }
    }

    public TEntity GetById(object id)
    {
        return _dbSet.Find(id);
    }

    public void Insert(TEntity entity)
    {
        _dbSet.Add(entity);
    }

    public void Delete(object id)
    {
        var entityToDelete = _dbSet.Find(id);
        Delete(entityToDelete);
    }

    public void Delete(TEntity entityToDelete)
    {
        if (_context.Entry(entityToDelete).State == EntityState.Detached)
        {
            _dbSet.Attach(entityToDelete);
        }
        _dbSet.Remove(entityToDelete);
    }

    public void Update(TEntity entityToUpdate)
    {
        _dbSet.Attach(entityToUpdate);
        _context.Entry(entityToUpdate).State = EntityState.Modified;
    }
}

Upvotes: 3

Views: 660

Answers (3)

DVK
DVK

Reputation: 2792

var listFields3 = unitOfWork.PropiedadRepository
                            .Get(x => x.Entidad.Nombre == "Empresa");

The last case is better. Entity Framework defers execution of the SQL query until you do something to materialize the result set somehow. It looks at the entire expression, and then formulates the query based upon that.

In your Get() method you call ToList(), which does materialize the result set. Therefore if you call .Get().Where(...), then first Get() is called with no filter, and because Get() calls ToList(), it is pulling back all of the data into memory and then executing your Where() against that.

In the last case, Where() is called within your Get() method PRIOR to ToList() being called. This means that the where clause will be used to construct a SQL query that filters the data on the SQL Server instead of in-memory on your application server, and will be much more efficient.

Something to think about here would be whether you wanted to have your repository return an IQueryable<T> instead of IEnumerable<T>. If you return an IQueryable, then again, the statement will not be executed until your result set needs to be materialized. If Get() was returning IQueryable instead of calling ToList(), then both statements would execute the same.

Upvotes: 3

tschmit007
tschmit007

Reputation: 7800

You are right.

In the last case the evaluation of the where clause is given to the dataserver (linq to entities) and the client receive only the filtered data.

In the other cases the client receive ALL the data and then filter it (linq to object).

That said: of what performance are we talking ? CPU, network, client side, server side ?

Usually, one prefer to let the server filter so linq to entities, but it really depends on some parameters and goal.

Be aware of the, imho, most surprising difference between linq to entities and linq to object: linq to object is case sensitive. For linq to entities it depends on database/table/column collation.

Also, as reminded by Kaspars, the handling of trailing whitespaces is different.

Upvotes: 5

Kaspars Ozols
Kaspars Ozols

Reputation: 7017

As others already said, third query will be the fastest as filtering is done on DB level. Only relevant records are transferred to client.

Although main question is already answered, I would like to propose a change in base repository to make it more elastic. Currently you pass all the possible filtering options to the repository Get function and apply them there. This seems to be very limiting and unnecessary.

If at some point you would like to perform additional operations (for example, GroupBy) at DB level, you would be unable to do so unless you add additional parameter to Get function.

I propose to change Get function into GetAll and return IQueryable instead. You could add default filtering and ordering rules there.

public IQueryable<TEntity> GetAll()
{
    return _dbSet
        .Where(x => !x.IsDeleted)
        .OrderBy(x => x.Date);
}

By doing that all your queries would always perform filtering/ordering/aggregating at DB level. Also, this would be more elastic as you could use all the functions that Linq to Entities offers.

Upvotes: 3

Related Questions