bzmind

Reputation: 458

Is this approach good for managing the DbContext?

I have a repository that takes a DbContext in its constructor, and I use Ninject to resolve this dependency. I set the object scope to InRequestScope, which means one instance is created per HTTP request, but I'm not sure when an HTTP request actually happens. Is it when the app is loaded, or when we call SaveChanges()? My approach to managing the DbContext is as follows: the repository asks for a context, as I said, and the controller asks for this repository in its constructor:

public class PageGroupsController : Controller
{
    IGenericRepository<PageGroup> _repository;
    public PageGroupsController(IGenericRepository<PageGroup> repository)
    {
        _repository = repository;
    }

    // GET: Admin/PageGroups
    public ActionResult Index()
    {
        return View(_repository.Get());
    }
}

And the repository:

public class GenericRepository<TEntity> : IGenericRepository<TEntity> where TEntity : class
{
    private DbContext _context;
    public GenericRepository(DbContext context)
    {
        _context = context;
    }

    public IEnumerable<TEntity> Get()
    {
        return _context.Set<TEntity>().ToList();
    }
}

And the NinjectWebCommon.cs which is where I solve the dependencies:

private static void RegisterServices(IKernel kernel)
{
    kernel.Bind<DbContext>().To<MyCmsContext>().InRequestScope();
    kernel.Bind<IGenericRepository<PageGroup>>().To<GenericRepository<PageGroup>>();
}

Is this approach any good? I didn't want to write using (var db = new DbContext()) { ... } all over my controllers, and I didn't want a single context for the whole app either. Is this approach equivalent to the using approach (querying what we need inside a using block), but with less coupling?

Upvotes: 0

Views: 125

Answers (1)

Steve Py

Reputation: 35063

Each time a controller action is called from any web client, that is a request. When someone visits your site and navigates to /Pagegroups/Index, resolved through routing, that is a request. When the client submits a form, that is a request. When it makes an Ajax call, that is a request.

Do you want the DbContext scoped so that one is constructed for each request? Absolutely, and for no longer than a request. For simple applications, using using() blocks within actions is perfectly fine, but repeating them everywhere adds a bit of boilerplate. In more complex, long-lived applications that you might want to unit test, or whose logic benefits from being broken down into smaller shared components, using blocks make it messy to share the DbContext, so an injected DbContext scoped to the request serves that purpose just fine. Every class instance serving a request is given the exact same DbContext instance.

You don't want a DbContext scoped longer than a request (i.e. a singleton) because while requests from one client may be sequential, requests from multiple users are not. Web servers respond to many user requests at the same time on different threads, and EF's DbContext is not thread safe. This catches out new developers: everything seems to work on their machine when testing, only for errors to start popping up once the app is deployed to a server and handling concurrent requests.

Also, as DbContexts age, they get bigger and slower because they track more entity instances. This leads to gradual performance loss, as well as issues where a DbContext serves up cached instances that don't reflect data changes made by other sources. A new development team might get caught out by the cross-thread issue but introduce locking or similar because they want to use EF's caching rather than a shorter lifespan, assuming DbContexts are "expensive" to create all the time (they're not!). This is often the cause of teams calling to abandon EF because it's "slow", without realizing that their design decisions prevented them from taking advantage of most of EF's capabilities.

As a general tip I would strongly recommend avoiding the Generic Repository pattern when working with EF. It will give you no benefit other than pigeon-holing your data logic. The power of EF is in the ability to handle the translation of operations against Objects and their relationships down to SQL. It is not merely a wrapper to get down to data. Methods like this:

public IEnumerable<TEntity> Get()
{
    return _context.Set<TEntity>().ToList();
}

are entirely counter-productive. If you have tens of thousands of records that you want to order and paginate, and you do something like:

var items = repository.Get()
    .OrderBy(x => x.CreatedAt)
    .Skip(pageNumber * pageSize)
    .Take(pageSize)
    .ToList();

The problem is that your repository tells EF to load, track, and materialize the entire table before any sorting or pagination takes place. What's worse, if there were any filtering to be done (Where clauses based on search criteria, etc.), it wouldn't be applied until the repository had already returned all of the records.
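The effect of calling ToList() too early can be illustrated without a database. The following is only an in-memory analogy, not EF itself: ReadRows is a hypothetical stand-in for a table that counts how many rows it hands out, and there is no ordering step because in-memory sorting would have to read every row anyway. With EF the difference is larger still, since paging composed on an IQueryable is translated into the SQL query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializationDemo
{
    // Counts how many rows the "table" has handed out since the last reset.
    public static int RowsRead;

    // Hypothetical stand-in for a large table: rows are produced lazily, one at a time.
    public static IEnumerable<int> ReadRows(int total)
    {
        for (var i = 0; i < total; i++)
        {
            RowsRead++;
            yield return i;
        }
    }

    // Mirrors repository.Get(): ToList() pulls every row before paging happens.
    public static List<int> PageAfterToList(int total, int skip, int take)
    {
        RowsRead = 0;
        return ReadRows(total).ToList().Skip(skip).Take(take).ToList();
    }

    // Paging is composed first, so enumeration stops once the page is full.
    public static List<int> PageBeforeToList(int total, int skip, int take)
    {
        RowsRead = 0;
        return ReadRows(total).Skip(skip).Take(take).ToList();
    }
}
```

Both methods return the same page, but after PageAfterToList the RowsRead counter equals the full row count, while after PageBeforeToList it stays far below that.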

Instead, if you just had your controller method do this:

var items = _context.PageGroups
    .OrderBy(x => x.CreatedAt)
    .Skip(pageNumber * pageSize)
    .Take(pageSize)
    .ToList();

then EF would compose an SQL query that performed the ordering and fetched just that single page of entities. The same goes for taking advantage of Projection with Select to fetch back just the details you need, or eager loading related entities. Trying to do that with a generic repository gets either very complex (trying to pass expressions around, or lots of arguments to try and handle sorting, pagination, etc.) or very inefficient, often both.

There are two reasons I recommend considering a repository: unit testing, and handling low-level common filtering such as soft-delete (IsActive) and/or multi-tenancy (OwnerId) type data. Basically, any time the data has to conform to standard rules that a repository can enforce in one place. In these cases I recommend non-generic repositories that serve their respective controllers. For instance, if I have a ManagePageGroupsController, I'd have a ManagePageGroupsRepository to serve it.

The key difference in this pattern is that the repository returns IQueryable<TEntity> rather than IEnumerable<TEntity> or even TEntity (unless it is the result of a "Create" method). This allows consumers to still handle sorting, pagination, projection, etc. as if they were working with the DbContext, while the repository ensures Where clauses are in place for low-level rules and asserts access rights. The repository can also be mocked out easily as a substitute for unit tests: it is easier to mock a repository method that serves an IQueryable than to mock a DbContext/DbSet.

Unless your application is going to use unit tests, or has a few low-level common considerations like soft-deletes, I'd recommend not bothering with the complexity of abstracting the DbContext, and instead fully leveraging everything EF has to offer.

Edit: Expanding on IQueryable

Once you determine that a Repository serves a use for testing or base filtering like IsActive, you can avoid a lot of complexity by returning IQueryable rather than IEnumerable.

Consumers of a repository will often want to do things like filter results, sort results, paginate results, project results to DTOs / ViewModels, or otherwise use the results to perform checks like getting a count or checking if any items exist.

As covered above, a method like:

public IEnumerable<PageGroup> Get()
{
    return _context.PageGroups
        .Where(x => x.IsActive)
        .ToList();
}

would return ALL items from the database to be stored in memory by the application server before any of these considerations were taken. If we want to support filtering:

public IEnumerable<PageGroup> Get(PageGroupFilters filters)
{
    var query = _context.PageGroups
        .Where(x => x.IsActive);

    if (!string.IsNullOrEmpty(filters.Name))
        query = query.Where(x => x.Name.StartsWith(filters.Name));
    // Repeat for any other supported filters.

    return query.ToList();
}

Then adding order by conditions:

public IEnumerable<PageGroup> Get(PageGroupFilters filters, IEnumerable<OrderByCondition> orderBy)
{
    var query = _context.PageGroups
        .Where(x => x.IsActive);

    if (!string.IsNullOrEmpty(filters.Name))
        query = query.Where(x => x.Name.StartsWith(filters.Name));
    // Repeat for any other supported filters.

    foreach (var condition in orderBy)
    {
        if (condition.Direction == Directions.Ascending)
            query = query.OrderBy(condition.Expression);
        else
            query = query.OrderByDescending(condition.Expression);
    }

    return query.ToList();
}

Then pagination:

// pageNumber is treated as a zero-based page index, matching Skip(pageNumber * pageSize).
public IEnumerable<PageGroup> Get(PageGroupFilters filters, IEnumerable<OrderByCondition> orderBy, int pageNumber = 0, int pageSize = 0)
{
    var query = _context.PageGroups
        .Where(x => x.IsActive);

    if (!string.IsNullOrEmpty(filters.Name))
        query = query.Where(x => x.Name.StartsWith(filters.Name));
    // Repeat for any other supported filters.

    foreach (var condition in orderBy)
    {
        if (condition.Direction == Directions.Ascending)
            query = query.OrderBy(condition.Expression);
        else
            query = query.OrderByDescending(condition.Expression);
    }

    // A pageSize of 0 means "no pagination".
    if (pageSize != 0)
        query = query.Skip(pageNumber * pageSize).Take(pageSize);

    return query.ToList();
}

You can hopefully see where this is going. You may just want a count of applicable entities, or a check that at least one exists, yet a method like the one above will still always return the full list of entities. If there are related entities that need to be eager loaded, or results that should be projected down to a DTO/ViewModel, there is still much more work to be done, or a memory/performance hit to accept.

Alternatively, you can add multiple methods to handle filtering scenarios (GetAll vs. GetBySource, etc.) and pass Expression<Func<T, bool>> parameters to try and generalize the implementation. This adds considerable complexity or leaves gaps in what is available to consumers. Often the justification for the Repository pattern is to abstract the data logic (ORM) from the business logic. However, this either cripples the performance and/or capability of your system, or it becomes a lie the minute you introduce expressions through the abstraction. Any expression passed to the repository and fed to EF must conform to EF's rules (no custom functions or system methods that EF cannot translate to SQL, etc.), or you must add considerable complexity to parse and translate expressions within your repository to ensure everything will work. And then, on top of that, there is supporting synchronous vs. asynchronous calls. It adds up fast.

The alternative is IQueryable:

public IQueryable<PageGroup> Get()
{
    return _context.PageGroups
        .Where(x => x.IsActive);
}

Now when a consumer wants to add filtering, sorting, and pagination:

var pageGroups = Repository.Get()
    .Where(x => x.Name.StartsWith(searchText))
    .OrderBy(x => x.Name)
    .Skip(pageNumber * pageSize).Take(pageSize)
    .ToList();

If they simply want a count:

var pageGroupCount = Repository.Get()
    .Where(x => x.Name.StartsWith(searchText))
    .Count();

If we are dealing with a more complex entity like a Customer with Orders and OrderLines, we can eager load or project:

// Top 50 customers by order count.
var topCustomers = ManageCustomerRepository.Get()
    .Select(x => new CustomerSummaryViewModel
    {
        CustomerId = x.Id,
        Name = x.Name,
        OrderCount = x.Orders.Count()
    })
    // After the projection, x is the view model, so order by its OrderCount.
    .OrderByDescending(x => x.OrderCount)
    .Take(50)
    .ToList();

Even if I commonly fetch items by ID and want a repository method like "GetById" I will return IQueryable<T> rather than T:

public IQueryable<PageGroup> GetById(int pageGroupId)
{
    return _context.PageGroups
        .Where(x => x.PageGroupId == pageGroupId);
    // rather than returning a PageGroup and using
    // return _context.PageGroups.SingleOrDefault(x => x.PageGroupId == pageGroupId);
}

Why? Because my caller can still take advantage of projecting the item down to a view model, decide if anything needs to be eager loaded, or do an action like an exists check using Any().
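As a rough sketch of what that buys the caller, the example below runs an exists check and a projection against the same GetById method. It makes several assumptions: the repository wraps an in-memory list via AsQueryable() so the sample is self-contained (a real one would wrap _context.PageGroups), and PageGroupSummary and GetByIdUsage are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PageGroup
{
    public int PageGroupId { get; set; }
    public string Name { get; set; }
}

// Hypothetical view model used only for this sketch.
public class PageGroupSummary
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class PageGroupRepository
{
    private readonly IQueryable<PageGroup> _pageGroups;

    // Assumption: an in-memory source stands in for _context.PageGroups.
    public PageGroupRepository(IEnumerable<PageGroup> source)
    {
        _pageGroups = source.AsQueryable();
    }

    // Returns a query, not an entity, so callers decide how to finish it.
    public IQueryable<PageGroup> GetById(int pageGroupId)
    {
        return _pageGroups.Where(x => x.PageGroupId == pageGroupId);
    }
}

public static class GetByIdUsage
{
    // Exists check: no entity needs to be returned to the caller.
    public static bool Exists(PageGroupRepository repository, int id)
    {
        return repository.GetById(id).Any();
    }

    // Projection: only the fields the view needs are selected.
    public static PageGroupSummary GetSummary(PageGroupRepository repository, int id)
    {
        return repository.GetById(id)
            .Select(x => new PageGroupSummary { Id = x.PageGroupId, Name = x.Name })
            .SingleOrDefault();
    }
}
```

Against a real DbContext, the Any() call and the Select() projection would be folded into the generated SQL instead of loading the whole entity first.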

The Repository does not abstract the DbContext to hide EF from the business logic, but rather to enable a base set of rules like the check for IsActive so we don't have to worry about adding .Where(x => x.IsActive) everywhere and the consequences of forgetting it. It's also easy to mock out. For instance to create a mock of our repository's Get method:

var mockRepository = new Mock<PageGroupRepository>();
mockRepository.Setup(x => x.Get())
    .Returns(buildSamplePageGroups().AsQueryable());

where buildSamplePageGroups is a method that builds a List<PageGroup> of test data suitable for the test, exposed to the repository's consumers through AsQueryable(). With Moq this assumes Get() is virtual, or that the mock targets an interface the repository implements. This only gets a bit more complex from a testing perspective if you need to support async operations against the repository, which requires a suitable container for the test data rather than List<T>.
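The same idea can be sketched without a mocking library by using a hand-written test double. Everything named here beyond PageGroup is an assumption made for illustration: IPageGroupRepository, FakePageGroupRepository, and the PageGroupSearch consumer are hypothetical and not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PageGroup
{
    public int PageGroupId { get; set; }
    public string Name { get; set; }
}

// Hypothetical interface; the mock above targets the concrete repository instead.
public interface IPageGroupRepository
{
    IQueryable<PageGroup> Get();
}

// Hand-written test double that serves an in-memory list as an IQueryable.
public class FakePageGroupRepository : IPageGroupRepository
{
    private readonly List<PageGroup> _data;

    public FakePageGroupRepository(List<PageGroup> data)
    {
        _data = data;
    }

    public IQueryable<PageGroup> Get()
    {
        return _data.AsQueryable();
    }
}

// Hypothetical consumer under test: filters, sorts, pages, and projects.
public class PageGroupSearch
{
    private readonly IPageGroupRepository _repository;

    public PageGroupSearch(IPageGroupRepository repository)
    {
        _repository = repository;
    }

    // pageNumber is a zero-based page index.
    public List<string> FindNames(string searchText, int pageNumber, int pageSize)
    {
        return _repository.Get()
            .Where(x => x.Name.StartsWith(searchText))
            .OrderBy(x => x.Name)
            .Skip(pageNumber * pageSize)
            .Take(pageSize)
            .Select(x => x.Name)
            .ToList();
    }
}
```

One caveat with any in-memory stand-in: LINQ to Objects will happily run expressions that EF cannot translate to SQL, so a test passing against the fake does not prove the query will translate against a real database.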

Edit 2: Generic Repositories.

The issue with Generic repositories is that you end up compartmentalizing your entities where through details like navigation properties, they are related. In creating an order you deal with customers, addresses, orders, products etc. where the act of creating an order generally only needs a subset of information about these entities. If I have a ManageOrdersController to handle editing and creating orders and generic repositories, I end up with dependencies on several repositories for Order, Customer, Product, etc. etc.

The typical argument for generic repositories is the Single Responsibility Principle (SRP) and Don't Repeat Yourself (DRY). An OrderRepository is responsible only for orders, and a CustomerRepository only for customers. However, you could equally argue that organizing repositories this way breaks SRP, because the principle behind SRP is that the code within should have one, and only one, reason to change. Especially without an IQueryable implementation, a repository whose methods are used by several different controllers and related services has many potential reasons to change, as each controller has different concerns for the actions and output of the repository.

DRY is a different argument and comes down to preference. The key to DRY is that it should be considered where code is identical, not merely similar. With an IQueryable implementation there is a valid argument that you could easily end up with identical methods in multiple repositories, i.e. GetProducts in both a ManageOrderRepository and a ManageProductsRepository, vs. centralizing it in a ProductsRepository referenced by both ManageOrderController and ManageProductController. However, the implementation of GetProducts is dead simple, amounting to nearly a one-liner. A GetProducts method for a product-related controller may be interested in getting products that are active or inactive, whereas getting products to complete an order would likely only ever look at active products. It boils down to deciding whether satisfying DRY is worth having to manage references to a handful (or more) of repository dependencies instead of a single repository, considering things like mock setups for tests. Generic repositories specifically expect all methods across every entity type to conform to one pattern. Generics are great where the implementation is identical, but they fail at that goal the minute the code could benefit from being allowed to be "similar" while serving a unique variation.

Instead, I opt to pair my repository to the controller, having a ManageOrdersRepository. This repository and the methods within have only one reason to ever change, and that is to serve the ManageOrdersController. While other repositories may have similar needs from some of the entities this repository does, they are free to change to serve the needs of their controller without impacting the Manage Orders process flow. This keeps constructor dependencies compact and easy to mock.

Upvotes: 2
