Mihai Socaciu
Mihai Socaciu

Reputation: 715

Reusable Get Query using Entity Framework

You have a table Person with 50 columns. Sometimes you want to query 2 columns, other times 10 columns, etc. Then you use Entity Framework selection to fetch from DB only the required columns for query improvement.

PersonRepository:

public PersonDto1 GetForSmallValidation()
{
    return context.Persons.Select(x => 
        new PersonDto1
        {
            Id = x.Id,
            LastName = x.LastName,
        }
    );
}

public PersonDto2 GetActive()
{
    return context.Persons
        .Filter(x => x.Active)
        .Select(x => 
            new PersonDto2
            {
                Id = x.Id,
                LastName = x.LastName,
                FirstName = x.FirstName,
                ... // all fields
            }
        );
}

If some use cases need: an additional column, an .Include(), a filter, then I need a separate Get method and Dto for everyone of those. These methods are hard to reuse and manage. I feel like this is a bad practice and I need some generics or reusable methods. When I query database I should be able to easily specify Includes, specific columns, filters from the calling service.

Question: how do you handle querying only specific columns with EF? Be it with repository pattern or without by calling DbContext in service layer.

My tentative solutions and research findings:

  1. I found in Efficient Querying that I need anonymous C# type:

     return new { column1, column2 }
    

    Problem: do I return dynamic type or how do I change my methods?

     public dynamic GetForSmallValidation() {..}
    
  2. I found in Repository Pattern a generic Get method which will help me write only one Get.
    Problem: how does this integrate with my multiple Dtos (or C# anonymous type)? This Get will have a generic as the database entity Person, and the methods return PersonDto or an anonymous type.

  3. One Dto with nullable on every field is not even considered. I cannot check HasValue for every field.

Upvotes: 0

Views: 79

Answers (2)

Steve Py
Steve Py

Reputation: 34908

Firstly consider why you want to introduce a repository pattern around EF. EF already provides a Repository in the form of DbSet<TEntity>. In most cases there is no need to abstract around that.

Three examples of valid cases to abstract EF's DbSet would include:

  1. Simplifying unit testing. Mocking a DbContext/DbSet can be a pain so introducing a thin repository wrapper around them can make mocking data access simpler using List<TEntity> or MockQueryable if you want to use async operations.

  2. Enforcing core filtering. Repositories can make implementing things like multi-tenant or soft-delete systems cleaner by ensuring that core default filters and things like authorization checks happen consistently rather than scattered in every query expression.

  3. Distributed data storage. If you have a system that can draw data down from relational databases, and/or NoSQL data stores, and/or 3rd party services/APIs etc. and actively switch between these data stores, then you can leverage a repository pattern that works with a lowest-common-denominator DTO structure where EF and its entities are completely abstracted to work interchangeably with other data stores.

If you are not unit testing, do not need to worry about multi-tenancy or soft-delete / auth checks on data access, and do not have a requirement for distributed data storage then I would strongly recommend not adding a repository abstraction over EF. Using the DbContext & DbSets you can easily eager load, project, filter, etc.

A common excuse to introduce an abstraction over EF is if "one day" you want to leave the option open to replace EF with some other ORM etc. Such as if EF is "too slow". This leads to a self-fulfilling prophecy in that adding an abstraction layer such as validly justified by point #3 above, you are introducing pretty significant limitations and costs over just using what EF natively provides. Abstracting "just in case" leads to inefficient, slow, and complex code, justifying worries about EF not being fast enough.

If you have a legitimate requirement like outlined in point #3, then you need to treat your repository like you would an API. The methods and DTOs need to be arranged specifically towards the needs of your consumers. You want to avoid conditional code and form a more disciplined contract between your data domain and consumers. This means extra code/methods to suit, or compromises where consumers have to settle for getting standard data.

If instead you want to introduce a repository to satisfy points #1 or #2 then I recommend a thin repository layer that returns IQueryable<TEntity> rather that IEnumerable<TEntity> or certainly not IEnumerable<TDTO>. To clarify, a DTO is not an entity. DTOs are Data Transfer Objects which would be cases where you want to transfer domain information from one layer to another. Entities serve as your data domain. View Models are a form of DTO, serving as the model for a view.

The benefit of using IQueryable<TEntity> is that your repository layer can stay quite thin and easy to mock, while still centralizing core rules. For example if we want a repository around Customers and are using a soft-delete system with an IsActive flag that most of the time we only want active data, but there are valid use cases to get inactive rows:

public IQueryable<Customer> GetCustomers(bool includeInactive = false)
{
    IQueryable<Customer> query = _context.Customers;
    if (!includeInactve)
        query = query.Where(x => x.IsActive);

    return query;
}

Note: This is an example for core rules. EF does support global query filters which can be used for rules like this, however this applies also to rules that are not possible/practical with global filters, or situations where you may want to query without specific global filters.

The benefit of this approach is that it is easy to mock. Tests can expect the GetCustomers call and pass back a stub set of expected data. Consumers of this method have full control to apply sorting, eager load extra data, project to DTOs/ViewModels using .Select(), paginate, or simply do a .Any() check or .Count() etc. The consumer can also choose between running the query synchronously or asynchronously without us needing to write sync or async flavours of the Get method.

Lastly, Avoid the Generic Repository pattern, whether using an IQueryable thin repository wrapper or a more concrete DTO abstraction. Generics apply to situations where the implementation is identical between types. While many repositories may behave similar to one another, they are not identical. Generic repositories either need to cater to a very bare-bones lowest-common-denominator (offering no flexibility and the worst-case performance) or they introduce a lot of complexity and conditional logic making them a trap for buggy behaviour. 9 times out of 10 they end up with both problems.

Upvotes: 3

David Browne - Microsoft
David Browne - Microsoft

Reputation: 89361

You have just discovered why your repository should have methods that return IQueryable<Person> instead of DTOs. You can write any query you want and project the return into an Entity, Anonymous type or a DTO.

Luckily your DbContext is already such a repository, so you don't need an extra one.

Upvotes: 1

Related Questions