How to return individual columns from EF queries while using repository pattern

My app’s data access layer is built on Entity Framework and the repository pattern*. Using Entity Framework logging I’ve discovered a handful of queries that bring back every column of an entity (in retrospect, that is obvious from looking at the EF queries). The result is a lot of inefficiency and performance issues.

Trivial to solve this by modifying the query but I don’t know how to pass the results from the repository back to the application.

The options I’ve considered, and my objection to each:
Passing an IQueryable breaks everything I believe about separations.
If I want to pass a strongly typed object (containing just the columns I want) from the repository layer, it requires my repository layer to “know” about some other project (via project reference) that contains DTO classes, and this feels really yucky to me. It creates an assembly dependency that shouldn’t exist.
You can’t return anonymous types.

It’s possible that my answer will be to compromise on one of the above. But I wanted to ask the community first.

*I know that using the repository pattern with an ORM is a very contentious topic but I really want to stay away from that debate. I’ve inherited this application so the ship has sailed with respect to application architecture. I just need a practical solution that I can implement alongside my existing architecture with whatever compromises I have to accept.

Upvotes: 3

Answers (2)

Steve Py

Reputation: 34908

Passing an IQueryable breaks everything I believe about separations

I believe this is the root of your issue. :)

The typical argument I see against using IQueryable is that people feel that consumers of the repository should need no knowledge of the schema and returning IQueryable "leaks" EF and schema. I've seen more than enough attempts to abstract away EF and the schema that invariably result in either complex methods accepting expressions/Funcs (which leak EF-isms because the caller still has to pass an expression tree that EF can accept) or extremely inefficient repositories littered with hundreds of methods to cater for every individual consumer.

My counter-argument to that assumption is that the role of a repository is merely to abstract the implementation of the data access so that I can unit test the consumers efficiently. By using a repository that returns IQueryable (and only IQueryable), I can mock the repository to return whatever entity or entities suit the test and cover the business logic, which should not be tied to the repository. The advantage of using IQueryable is that you end up with extremely lean repositories. Mine generally have only a Create method, Read methods for the entity(ies) that are applicable to the consumer of that repository, and a Delete method in cases where I am using a soft-delete or historical schema. I go so far as to have my repository provide a "ById" type method, since this is commonly used for updates etc., but this method too returns IQueryable. If I'm doing an update, the consumer calls .Single() after including any related entities that might be updated as well. Other code may simply do an .Any() check, or just want to select a few values from that entity for some purpose. The pattern is adaptable.
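A minimal sketch of that shape, under some assumptions: the Customer entity, the ICustomerRepository interface and the query helpers are hypothetical names made up for illustration, and an in-memory list stands in for the EF context so the sample runs on its own. Against a real DbSet, EF would translate the Select into SQL that reads only the projected columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; property names are illustrative only.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Email { get; set; } = "";
    public string Notes { get; set; } = ""; // a wide column we would rather not load
}

// Lean repository: every read returns IQueryable so the caller shapes the query.
public interface ICustomerRepository
{
    IQueryable<Customer> GetCustomers();
    IQueryable<Customer> GetCustomerById(int id);
}

// In-memory stand-in (also usable as a test double).
// Assumption: a real implementation would return the context's DbSet<Customer> here.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly List<Customer> _customers;

    public InMemoryCustomerRepository(IEnumerable<Customer> customers) =>
        _customers = customers.ToList();

    public IQueryable<Customer> GetCustomers() => _customers.AsQueryable();

    public IQueryable<Customer> GetCustomerById(int id) =>
        _customers.AsQueryable().Where(c => c.Id == id);
}

// Consumer-side code: the projection happens here, not in the repository.
public static class CustomerQueries
{
    public static List<string> ListNameLabels(ICustomerRepository repo)
    {
        // Anonymous types are fine because they never leave this method.
        var rows = repo.GetCustomers()
            .OrderBy(c => c.Name)
            .Select(c => new { c.Id, c.Name })
            .ToList();

        return rows.Select(r => $"{r.Id}: {r.Name}").ToList();
    }

    // "ById" still returns IQueryable, so the caller picks Any, Single, or a projection.
    public static bool Exists(ICustomerRepository repo, int id) =>
        repo.GetCustomerById(id).Any();

    public static string GetEmail(ICustomerRepository repo, int id) =>
        repo.GetCustomerById(id).Select(c => c.Email).Single();
}
```

Because the anonymous type is created and consumed inside the calling method, the "you can't return anonymous types" problem from the question does not arise.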

By leveraging IQueryable you have no issue with wanting to efficiently select a few fields from an entity or its hierarchy. It is simple to use, simple to understand, and leads to efficient, fast querying. Trusted internal callers can use LINQ methods for .Any, .Count, paging, sorting, filtering etc. while selecting just the data they need to suit their specific requirements. The repository handles core-level filtering rules such as authorization, soft-delete is-active checks, etc. It can be abused and result in ugly, expensive hits to the DB, but so too can any code you write. Complex solutions are harder to understand, and when they don't seem to fit future requirements, they lead to hacks or modifications that break something. Repositories that have dozens of purpose-built methods result in considerable duplication, as developers don't bother sifting through the growing number of methods when they need one extra column or one less column. What's worse, in my opinion, is where these methods return an entity as a container that is only partially filled out. An entity should always represent a true, complete data state, because any method that accepts an entity should not have to doubt how complete that entity actually is.
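As a rough illustration of that split of responsibilities, here is a sketch in which the repository applies one core rule (an is-active check) and the callers compose the rest. The Product entity, ProductRepository and the helper names are assumptions for the example, and the IQueryable source is supplied from memory rather than from a real EF context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with a soft-delete flag.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public decimal Price { get; set; }
    public bool IsActive { get; set; }
}

public class ProductRepository
{
    private readonly IQueryable<Product> _source;

    // Assumption: in a real app this would be the context's DbSet<Product>.
    public ProductRepository(IQueryable<Product> source) => _source = source;

    // Core-level rule applied once, for every caller: soft-deleted rows are hidden.
    public IQueryable<Product> GetProducts() => _source.Where(p => p.IsActive);
}

// Callers compose on top of the filtered query and take only what they need.
public static class ProductQueries
{
    public static bool AnyPricedAbove(ProductRepository repo, decimal price) =>
        repo.GetProducts().Any(p => p.Price > price);

    public static int CountActive(ProductRepository repo) =>
        repo.GetProducts().Count();

    // Sorting, paging and projection to a single column (pageNumber is 1-based).
    public static List<string> GetNamePage(ProductRepository repo, int pageNumber, int pageSize) =>
        repo.GetProducts()
            .OrderBy(p => p.Name)
            .Skip((pageNumber - 1) * pageSize)
            .Take(pageSize)
            .Select(p => p.Name)
            .ToList();
}
```

None of the callers can forget the is-active check, because they never see the unfiltered set.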

When faced with writing code that will be touched by others I focus on making it easy to understand, and easy to find abuses and correct them. I believe that good code & architecture should make mistakes easy to spot and easy to fix, rather than trying to make mistakes hard to make.

Upvotes: 2

David Browne - Microsoft

Reputation: 89386

Passing an IQueryable breaks everything I believe about separations

Passing an IQueryable is the right answer. A repository shouldn't have to know about every possible query the application might need. Putting all query logic inside the repository is what actually breaks the separation of duties between the repository and the other parts of the application.

to pass a strongly typed object ... from the repository layer it requires my repository layer to “know” about some other project

The DTOs are part of the "contract" between the application components. It is necessary and appropriate for all application components to have a shared reference to the service contracts: both the .NET interface definitions and the data types passed to and from the interface methods.
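A sketch of how that could be laid out, assuming a hypothetical MyApp.Contracts assembly referenced by both the application and the data layer (shown here as two namespaces in one file so the sample is self-contained). All type names are invented for illustration, and the IQueryable source stands in for an EF DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shared contracts assembly: interfaces plus the data types they exchange.
namespace MyApp.Contracts
{
    public class OrderSummaryDto
    {
        public int OrderId { get; set; }
        public decimal Total { get; set; }
    }

    public interface IOrderRepository
    {
        List<OrderSummaryDto> GetOrderSummaries(int customerId);
    }
}

// Hypothetical data layer: references Contracts, owns the entity and the query.
namespace MyApp.Data
{
    using MyApp.Contracts;

    public class Order
    {
        public int Id { get; set; }
        public int CustomerId { get; set; }
        public decimal Total { get; set; }
        public string ShippingNotes { get; set; } = ""; // not needed by the summary
    }

    public class OrderRepository : IOrderRepository
    {
        private readonly IQueryable<Order> _orders;

        // Assumption: in a real app this would be the context's DbSet<Order>.
        public OrderRepository(IQueryable<Order> orders) => _orders = orders;

        // Projects straight into the DTO, so only the listed columns are requested.
        public List<OrderSummaryDto> GetOrderSummaries(int customerId) =>
            _orders
                .Where(o => o.CustomerId == customerId)
                .Select(o => new OrderSummaryDto { OrderId = o.Id, Total = o.Total })
                .ToList();
    }
}
```

The dependency points from the data layer to the contracts, not to the application project itself, which is the distinction this answer is drawing.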

Upvotes: 2
