Get the groups items after a GROUP BY in SQL Server

Question

I have a table that has these columns:

UserID1, UserID2, ProductID, PurchaseDate

The following query runs against a Purchases table and returns pairs of users that have had more than a given number of interactions between them in the last 31 days, regardless of which user appears in which column:

DECLARE @threshold AS INT
DECLARE @days AS INT

SET @threshold = 10
SET @days = 31

SELECT 
    UserID1, UserID2, COUNT(*) AS Counter
FROM 
    (SELECT
        -- order each pair so that (Col1, Col2) and (Col2, Col1) count as the same pair
        CASE 
           WHEN UserID1 < UserID2 
              THEN UserID1 
              ELSE UserID2 
        END AS UserID1,
        CASE 
           WHEN UserID1 < UserID2 
              THEN UserID2 
              ELSE UserID1 
        END AS UserID2
    FROM
        Purchases WITH(NOLOCK)
    WHERE 
        PurchaseDate BETWEEN DATEADD(day, -@days, GETDATE()) AND GETDATE()) t
GROUP BY 
    UserID1, UserID2
HAVING 
    COUNT(*) > @threshold

It yields:

UserID1  UserID2  Counter
1        2        10
3        2        5
4        1        8

However, what I want is a result with the ProductID and PurchaseDate of each purchase in separate rows, like this:

UserID1  UserID2  ProductID  PurchaseDate
1        2        12345      2017-01-18 00:13:52
1        2        5425       2017-01-12 15:10:02
1        2        64362      2017-01-05 10:10:02
..... for the 10 interactions
3        2        25235      2017-01-18 00:13:52
3        2        436346     2017-01-14 00:13:52
..... for the 5 interactions
4        1        23523      2017-01-14 00:13:52
4        1        135135     2017-01-09 00:13:52
..... for the 8 interactions

Is there a way to do this without putting the results of the first query into a temp table and then joining it back to the Purchases table to find all the purchases?

Vladimir Baranov · Accepted Answer

If I understood you correctly, a simple windowed COUNT should help here.

The optimiser should be smart enough to do it in one scan of the table.

DECLARE @threshold AS INT;
DECLARE @days AS INT;

SET @threshold = 10;
SET @days = 31;

WITH
CTE_Purchases
AS
(
    SELECT
        -- order each pair so that (Col1, Col2) and (Col2, Col1) count as the same pair
        CASE 
            WHEN UserID1 < UserID2 
            THEN UserID1 
            ELSE UserID2 
        END AS UserID1
        ,CASE 
            WHEN UserID1 < UserID2 
            THEN UserID2 
            ELSE UserID1 
        END AS UserID2
        ,ProductID
        ,PurchaseDate
    FROM
        Purchases
    WHERE 
        PurchaseDate BETWEEN DATEADD(day, -@days, GETDATE()) AND GETDATE()
)
,CTE_Counts
AS
(
    SELECT
        UserID1
        ,UserID2
        ,ProductID
        ,PurchaseDate
        ,COUNT(*) OVER (PARTITION BY UserID1, UserID2) AS Counter
        -- calc COUNT for groups without explicit GROUP BY
    FROM CTE_Purchases
)
SELECT
    UserID1
    ,UserID2
    ,ProductID
    ,PurchaseDate
    ,Counter
FROM CTE_Counts
WHERE Counter > @threshold
-- this filter replaces your HAVING
;
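
To see the pattern in isolation, here is a minimal sketch on made-up data. The table variable @Purchases, its rows, and the lowered @threshold of 2 are assumptions for illustration only. It filters on PurchaseDate, on the assumption that this is the date column meant in the question's column list.

```sql
-- Hypothetical sample data: the table variable stands in for the real Purchases table.
DECLARE @Purchases TABLE
(
    UserID1      INT      NOT NULL,
    UserID2      INT      NOT NULL,
    ProductID    INT      NOT NULL,
    PurchaseDate DATETIME NOT NULL
);

INSERT INTO @Purchases (UserID1, UserID2, ProductID, PurchaseDate)
VALUES
    (1, 2, 101, DATEADD(day, -1,  GETDATE())),
    (2, 1, 102, DATEADD(day, -2,  GETDATE())),  -- same pair, columns swapped
    (1, 2, 103, DATEADD(day, -3,  GETDATE())),
    (3, 2, 104, DATEADD(day, -1,  GETDATE())),  -- a pair with a single purchase
    (1, 2, 105, DATEADD(day, -60, GETDATE()));  -- outside the 31-day window

DECLARE @threshold INT = 2;   -- lowered so the small sample produces output
DECLARE @days      INT = 31;

WITH
CTE_Purchases
AS
(
    SELECT
        -- normalise the pair so the smaller ID is always UserID1
        CASE WHEN UserID1 < UserID2 THEN UserID1 ELSE UserID2 END AS UserID1
        ,CASE WHEN UserID1 < UserID2 THEN UserID2 ELSE UserID1 END AS UserID2
        ,ProductID
        ,PurchaseDate
    FROM @Purchases
    WHERE PurchaseDate BETWEEN DATEADD(day, -@days, GETDATE()) AND GETDATE()
)
,CTE_Counts
AS
(
    SELECT
        UserID1
        ,UserID2
        ,ProductID
        ,PurchaseDate
        -- every row keeps its detail columns and also carries its pair's total
        ,COUNT(*) OVER (PARTITION BY UserID1, UserID2) AS Counter
    FROM CTE_Purchases
)
SELECT
    UserID1
    ,UserID2
    ,ProductID
    ,PurchaseDate
    ,Counter
FROM CTE_Counts
WHERE Counter > @threshold
ORDER BY UserID1, UserID2, PurchaseDate DESC;
```

With this data only the three recent purchases of the (1, 2) pair come back, each with Counter = 3. The single (2, 3) purchase and the 60-day-old row are filtered out. To check the one-scan expectation on your real table, run the query with SET STATISTICS IO ON and look at the scan count reported for Purchases.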
