plasmy
plasmy

Reputation: 149

LINQ: How do I find records in a table where the ID is in a List I queried before?

Suppose I have table A

ID1 Text
1 aaa
2 bbb

And I have table B

ID2 ID1 Text
1 1 aaaa
2 1 bbbb
3 2 dddd
4 2 eeee

And I have Table C

ID3 ID2 Text
1 1 aaaa
2 1 bbbb
3 2 cccc
4 2 dddd
5 3 eeee
6 3 ffff
7 4 gggg
8 4 hhhh

I want to delete a record from Table A but I need to delete the dependencies from Table B and C first. As you can see, Table C only has a key from Table B. I haven't done much Linq before but here's what I have so far:

public async Task DeleteRecord(Record record)
{
  await using var transaction = await _context.Database.BeginTransactionAsync();
  try
  {
    var aRecord = await _context.B
      .Where(e => e.ID1 == record.ID1)
      .ToListAsync();
    if(aRecord != null)
    {
      // In here I need to delete the records in C that contain all of the ID2's in aRecord
      _context.B.RemoveRange(aRecord);
    }
    _context.A.Remove(record);
    await transaction.CommitAsync();
  }
  catch (Exception e)
  {
    throw new Exception(e.Message, e);
  }
}

Upvotes: 0

Views: 748

Answers (2)

Harald Coppoolse
Harald Coppoolse

Reputation: 30512

Unlike in SQL, in Entity Framework you can only Remove objects that you have fetched before. In SQL you can say: "Please remove all New York Customers". In Entity Framework you first have to fetch all New York Customers before you can remove them.

Direct answer to your question

Not optimal! See later

So you have three tables, with one-to-many relations between them. Your tables are a bit abstract, so to make it easier to understand, I'll change it to tables Customers where every Customer has zero or more Orders, and every Order has zero or more OrderLines. Every OrderLine is an OrderLine in an Order, using foreign key, and every Order is an Order of a Customer, also using a foreign key.

Apparently you want to remove a Customer (one item from Table A). You state that before you can remove this Customer, you'll have to remove all his Orders and OrderLines.

Later I'll show you that if you use cascading, that this is not necessary. But first I'll show you the answer to your question.

Before you can remove an Item, you have to fetch it completely. So to remove a Customer, you'll have to fetch the Customer, all his Orders, and all OrderLines from his Orders.

In your case: the item from Table A, all items that refer to this Item from Table B and all Table C items that refer to the items from Table B (luckily the Customer - Orders database is easier to describe).

int customerId = 54;
using (var dbContext = new OrderDbContext())
{
    var customerToRemove = dbContext.Customers.Find(customerId);
    // or use a Where(...).FirstOrDefault();

    // fetch all Orders that have a foreign key referring to the Customer:
    var ordersToRemove = dbContext.Orders
        .Where(order => order.CustomerId == customerId)
        .ToList();

    // fetch all OrderLines (table C) that refer to one of the Ids of the Orders that
    // must be removed
    // Orders are already local, so this is not a database action:
    var orderIdsToRemove = ordersToRemove.Select(order => order.Id).ToList();

    var orderLinesToRemove = dbContext.OrderLines
        .Where(orderLine => orderIdsToRemove.Contains(orderLine.Id))
        .ToList();

In words:

  • Fetch the Customer that has a primary key equal to customerId;
  • Fetch all Orders that have a foreign key equal to customerId;
  • Optimization: extract all primary keys from all the fetched Order.
  • Fetch all OrderLines that have a foreign key that is in the extracted primary keys.

Now to remove all items:

dbContext.OrderLines.RemoveRange(orderLinesToRemove);
dbContext.Orders.RemoveRange(ordersToRemove);
dbContext.Customer.Remove(customerToRemove);

SaveChanges and the ChangeTracker

Note: until now, the database has not been changed. The DbContext has a ChangeTracker. Whenever you fetch a complete item from the database (= use Find, or fetch without Select), the item is put in the ChangeTracker, as well as a copy. You get a reference to the original item in the ChangeTracker. So whenever you change the value of a property of one of the fetched item, it is changed in the ChangeTracker.

Items that are Added, are only added to the ChangeTracker, with a status: Added. Items that are Removed are already in the ChangeTracker. They get a status: Removed.

So all property changes, all Add / Removes are done without the database. They are all done in the ChangeTracker.

Saving all changes is done via:

dbContext.SaveChanges();

This will check the ChangeTracker to see which items are Added / Removed. From all other items, it will check which properties are changed by comparing the original and the copy. All these changes are translated into one big SQL command, which is executed.

So now you know how to remove the items the way you proposed.

There is room for improvement

Transaction: Because all changes are executed in SaveChanges, you don't need to define a transaction while fetching the items. I'm pretty sure that because the changes are executed in one DbCommand, you won't even need it at all, unless you will be doing several SaveChanges.

Side remark: Always use Select if you don't plan to change the fetched data, and select only the properties that you actually plan to use. By using Select the data won't be put in the ChangeTracker, nor a copy. When saving changes, fewer items need to be compared.

This will be a huge optimization if you would fetch 100 Orders from a Customer, each with several OrderLines: all 100 Orders, and all several 100 OrderLines would have to be put in the ChangeTracker as well as a copy. Besides: every Order of Customer[42] has a foreign key with a value of 42. It is a waste to fetch this value 100 times from the database.

Optimization: Use the virtual ICollection to fetch all data at once

If you've followed the entity framework conventions, you'll have classes similar to the following:

class Customer
{
    public int Id {get; set;}
    public string Name {get; set;}
    ...

    // Every Customer has zero or more Orders (one-to-many)
    public virtual ICollection<Order> Orders {get; set;}
}

class Order
{
    public int Id {get; set;}
    public int OrderNumber {get; set;}
    public DateTime OrderDate {get; set;}
    ...

    // Every Order is the Order of exactly one Customer, using foreign key:
    public int CustomerId {get; set;}
    public virtual Customer Customer {get; set;}

    // Every Customer has zero or more Orders (one-to-many)
    public virtual ICollection<Order> Orders {get; set;}
}

And the OrderLine:

class OrderLine
{
    public int Id {get; set;}
    ...

    // Every OrderLine is an OrderLine in exactly one Order, using foreign key:
    public int OrderId {get; set;}
    public virtual Order Order {get; set;}
}

In entity framework the columns in the tables are represented by non-virtual properties. The virtual properties represent the relations between the tables (one-to-many, many-to-many)

Now to fetch everything in one go, use Include

int customerId = 42;
var customerToRemove = dbContext.Customers
    .Where(customer => customer.Id == customerId)
    .Include(customer => customer.Order.OrderLines)
    .ToList();

Now if you want, you can remove all OrderLines, then all Orders, and finally the Customer

Best optimization: use CascadeOnDelete

Most databases have the possibility to automatically delete dependent items: if you remove the Customer, all its Orders are removed, and because OrderLines are dependent on Orders, all their OrderLines are also removed. Code will be like:

int customerId = 54;
using (var dbContext = new OrderDbContext())
{
    Customer customerToRemove = dbContext.Customers.Find(customerId);
    dbContext.Customers.Remove(customerToRemove);
    dbContext.SaveChanges();
}

Because CascadeOnDelete is used, you don't have to fetch all Orders and OrderLines to your local process. The database will automatically remove them when you remove the Customer.

If you've followed entity framework conventions, then this behaviour will be standard. You can also force it using fluent API in DbContext.OnModelCreating

// Every Customer has zero or more Orders (one-to-many)
modelBuilder.Entity<Customer>()
    .HasMany(customer => customer.Orders)
    .WithRequired(order => order.Customer)
    .HasForeignKey(order => order.CustomerId)
    .WillCascadeOnDelete(true);

In words: you can find the Orders of a Customer in property Orders. Every Order is the Order of exactly one Customer. The Customer of this Order can be found in property Customer. In the database this is done using foreign key CustomerId. And finally: if you remove the Customer, all Orders of this Customer must also be removed.

My advice would be: whenever you have one-to-many relations, switch cascade on delete on. It will makes removes so much easier!

Upvotes: 0

Steve Py
Steve Py

Reputation: 35083

Given a list of A's, provided I need to manually delete all B's and C's from a selection of A's that I want to delete without cascade delete capabilities, and ideally without loading all of that data.. I'll want to ensure that my A B and C entities are set up with navigation properties, so A will have an ICollection<B> and B will have an ICollection<C>. If the current DbContext and entity definitions do not cater for this, you can create a new bounded DbContext for this operation with new, bare-bones entity definitions. Since this would just be used for issuing Delete operations you would just need to declare the entities with their PK columns and the virtual collections. It can work with full entities if you have those set up with the navigation properties already, it's just a matter of overhead to load/store all of the columns in each rather than just the IDs.

For example:

public class A
{
    [Key]
    public int Id { get; set; }

    public virtual ICollection<B> Bs { get; set; } = new List<B>();
}

public class B
{
    [Key]
    public int Id { get; set; }

    public virtual ICollection<C> Cs { get; set; } = new List<C>();
}

public class C
{
    [Key]
    public int Id { get; set; }
}

// EF6 example
public class DeleteBoundedDbContext : DbContext
{
    public DbSet<A> As { get; set; }
    public DbSet<B> Bs { get; set; }
    public DbSet<C> Cs { get; set; }

    public DeleteBoundedDbContext()
        : base ("AppDbContext")  // Use the name of your main application DbContext to use the same connection string...
    { }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<A>()
            .HasMany(a => a.Bs)
            .WithRequired()
            .Map(a => a.MapKey("AId")); // FK on B pointing to it's A
        modelBuilder.Entity<B>()
            .HasMany(b => b.Cs)
            .WithRequired()
            .Map(b => b.MapKey("BId")); // FK on C pointing to it's B
    }
}

Then to get our data:

var dataToDelete = context.As
    .Where(/* insert criteria */)
    .Select(a => new 
    {
        A = a,
        Bs = a.Bs,
        Cs = a.Bs.SelectMany(b => b.Cs).ToList()
    ).ToList();

foreach(var a in dataToDelete)
{
   context.Cs.RemoveRange(a.Cs);
   context.Bs.RemoveRange(a.Bs);
   context.As.Remove(a.A);
}
context.SaveChanges();

If the # of entities is expected to be potentially large (over 1000) then you would want to consider taking it in stages using Take to process and commit a few A's at a time.

Using a separate DbContext with bare minimal entity definitions for the related items in question means you can load these entity references quickly and with minimal memory use. We select the "A"s, along with each of their sets of "B"s, and then with that, all of the "C"s associated through all of those B's. From there, with DbSets defined for each we can use RemoveRange to tell EF to delete them. All of the C's followed by B's, followed by each associated A.

If using the separate DbContext approach and are still querying data from the main DbContext you will need to drop and recreate the main application DbContext. Ideally this would be an operation in and of itself on a given request.

Upvotes: 1

Related Questions