Chris Halcrow
Chris Halcrow

Reputation: 31950

LINQ - Entity Framework Core - SELECT most recent record, only if it matches another condition

I have a dataset that looks like this. I want to select all rows where the most recent record for a node has a status of 'Pending'

Nodeid  Status     Type          Utctimestamp
------  -----      ------        --------   

1       Pending    Transaction   2020-03-13 03:31:00.00000+00
1       Overridden Transaction   2020-03-13 03:32:00.00000+00
2       Unknown    Other         2020-03-13 03:34:00.00000+00
2       Pending    Other         2020-03-13 03:35:00.00000+00

A record for each node should be returned only if the status of the most recent record is Pending (otherwise don't select anything for that node). So for the dataset above, the query will only return the following:

2       Pending    Other         2020-03-13 03:35:00.00000+00

I want to use a LINQ query with Entity Framework, using fluent syntax ('Method syntax'). How do I do that?

Upvotes: 1

Views: 2971

Answers (2)

Chris Halcrow
Chris Halcrow

Reputation: 31950

I achieved it like this:

var pending = myContext.Records
                         .GroupBy(t => t.Nodeid)
                         .Select(grp => grp.OrderByDescending(t => t.Utctimestamp).First())
                         .Where(t => t.Status == "Pending");

Upvotes: 1

Steve Py
Steve Py

Reputation: 34673

If you want the latest node record having the Pending status, and assuming you could have a node that switched to "Pending" more than once: I.e.

node ID  Status      Time
1        Pending     xxxx
1        Submitted   xxxy
1        Pending     xxxz

where you'd want the row for time "xxxz":

With EF6

string pendingStatus = Status.Pending.ToString();
var pending = myContext.TransactionMsg
    .Where(x => x.Status == pendingStatus)
    .GroupBy(x => x.NodeId)
    .Select(g => g.OrderByDescending(x => x.UtcTimestamp).FirstOrDefault())
    .ToList();

Now I wasn't able to get this working with EF Core (Surprise, surprise) so if you're working with Core you'll need to do some digging around to find out how to work around why it cannot seem to do the OrderByDescending inside the Select from the Group expression. One more reason to stick with EF 6 :)

Update: Ok, based on the requirement to get only the Nodes where the latest status is pending.. Getting this in EF6 was relatively simple by moving the Where clause after the Select.

Doing it in EF Core was a whole lot more difficult, but does look to be possible. The big problem is how GroupBy is implemented in EF Core. It's a lot more limited as to how you can interact with the IGrouping results, where EF6 would successfully translate/interact with them as a set, EF Core takes exception to attempting to use OrderBy and such expressions with grouped results. So you need to get a bit more verbose:

var results = context.Nodes
    .GroupBy(x => x.NodeId, (x,y) => new { Timestamp = y.Max(z => z.Timestamp), NodeId = x })
    .Join(context.Nodes, a => new { a.NodeId, a.Timestamp }, b => new { b.NodeId, b.Timestamp }, (a,b) => b)
    .Where(g => g.Status == "Pending")
    .ToList();

This really looks whacked, but a quick walk through: We first group the nodes by Node ID, then from that we select the latest Timestamp using Max against the grouped results to get the biggest Timestamp, and also select the Node ID using the grouped Key. This gives us a structure of { Timestamp (max row), NodeId }. Using this, we Join back onto the Nodes set, using Key comparisons of Node ID and Timestamp to pair up our selected results against the Nodes collection so that we can translate these grouped items back to Node entities. This will return us the collection of nodes that match the Node ID and Max Timestamp, so "Latest Nodes". From that we apply the Where clause to select only the Latest Nodes that are "Pending" status.

Upvotes: 3

Related Questions