RebelScum
RebelScum

Reputation: 719

How do I use grouping, distinct, and count together in linq?

I have an ActivityLog table with a row for each hit on each page in a web app. The table has the following pertinent fields: PageTitle, UserName, ActivityDate. I'd like to add a Usage History page with a GridView that has the following columns: Page Title, # Hits, # Unique Users. So for each page in the app, we would show the total number of hits, and the number of unique users hitting that page.

I have tried the following linq, which, from what I could gather in my searches, should work:

var ual = (from activityLog in linqMetaData.UserActivityLog
           group activityLog by activityLog.PageTitle into pageGroup
           select new PageUsageStatistics()
           {
               PageTitle = pageGroup.Key,
               NumHits = pageGroup.Count(),
               NumUniqueUsers = pageGroup.Select(x => x.UserName).Distinct().Count()
           });

The NumHits comes back with the expected number; however, NumUniqueUsers is coming back with the number of unique total users that have hits, not the count for each page. So if I have 3 users, each having 1 hit on their own distinct page (User1 hits Page1, User2 hits Page2, and User3 hits Page3), all three rows in my table are showing 3 for the NumUniqueUsers column, even though they should show 1.

Any suggestions?

Thanks, Chris

EDIT - Adding generated SQL:

SELECT [LPA_L1].[PageName], 
       [LPA_L1].[NumHits], 
       [LPA_L1].[NumUniqueUsers] 
FROM 
    (SELECT [LPA_L2].[PageTitle] AS [PageName], 
            [LPA_L2].[LPAV_] AS [NumHits], 
            (SELECT COUNT(*) AS [LPAV_] 
             FROM 
                 (SELECT DISTINCT [LPA_L2].[UserPrincipleName] 
                  FROM [USIC].[dbo].[UserActivityLog]  [LPA_L2]  
                 ) [LPA_L3]) AS [NumUniqueUsers] 
     FROM 
         (SELECT [LPLA_1].[PageTitle], 
                 COUNT(*) AS [LPAV_] 
          FROM [USIC].[dbo].[UserActivityLog]  [LPLA_1]   
          GROUP BY [LPLA_1].[PageTitle]
         ) [LPA_L2]
    ) [LPA_L1] 
ORDER BY [LPA_L1].[PageName] ASC

Upvotes: 2

Views: 737

Answers (3)

James Johnson
James Johnson

Reputation: 46047

Try adding this extension method:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> knownKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        if (knownKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

And use it like this:

NumUniqueUsers = pageGroup.DistinctBy(x => x.UserName).Count();

Upvotes: 0

Amy B
Amy B

Reputation: 110111

Hard to say where that DISTINCT is getting lost. Perhaps LinqToSql is dropping it in query translation. Looking at the generated sql will confirm.

If the Distinct is getting (unexpectedly) dropped by LinqToSql, here's another way to write that part of the query.

NumUniqueUsers = pageGroup.GroupBy(x => x.UserName).Count()

Upvotes: 0

Robert Levy
Robert Levy

Reputation: 29073

"3 users, each having 1 hit on each of 3 pages"

I interpret that as meaning your log looks like:

  • User1 - Page1
  • User1 - Page2
  • User1 - Page3
  • User2 - Page1
  • User2 - Page2
  • User2 - Page3
  • User3 - Page1
  • User3 - Page2
  • User3 - Page3

With that scenario, each page does indeed have 3 unique users so your code is correct

Upvotes: 1

Related Questions