Reputation: 1637
I have a linq query that returns the last page a user looked at based on a table of page hits. The fields are simply TimeStamp
, UserID
and URL
which are logged from user activity. The query looks like this:
public static IQueryable GetUserStatus()
{
var ctx = new AppEntities();
var currentPageHits = ctx.Pagehits
.GroupBy(x => x.UserID)
.Select(x => x.Where(y => y.TimeStamp == x.Max(z => z.TimeStamp)))
.SelectMany(x => x);
return currentPageHits.OrderByDescending(o => o.TimeStamp);
}
The query works perfectly but runs slowly. Our DBA assures us that the table has indexes in all the right places and that the trouble must be with the query.
Is there anything inherently wrong or BAD with this, or is there a more efficient way of getting the same results?
Upvotes: 0
Views: 70
Reputation: 460048
So you try to implement DENSE_RANK() OVER (PARTITION BY UserID ORDER BY TimeStamp DESC)
with LINQ? So all latest records per user-group according to the Timestamp
. You could try:
public static IQueryable GetUserStatus()
{
var ctx = new AppEntities();
var currentPageHits = ctx.Pagehits
.GroupBy(x => x.UserID)
.SelectMany(x => x.GroupBy(y => y.TimeStamp).OrderByDescending(g=> g.Key).FirstOrDefault())
.OrderByDescending(x => x.TimeStamp);
return currentPageHits;
}
So it's grouping the user-group by TimeStamp
, then it takes the latest group(one or more records in case of ties). The SelectMany
flattens the goups to records. I think this is more efficient than your query.
Upvotes: 2
Reputation: 111830
You could try:
var currentPageHits2 = ctx.Pagehits
.GroupBy(x => x.UserID)
.Select(x => x.OrderByDescending(y => y.TimeStamp).First())
.OrderByDescending(x => x.TimeStamp);
But the speed should be the same.
Note that there is a subtle difference between this query and yours... With yours, if a UserId has two "max TimeStamp" PageHits with the same TimeStamp, two "rows" will be returned, with this one only one will be returned.
Upvotes: 3