Reputation: 654
I have a table called activities which has a memberId and a timestamp. I would like to find out in a given month how many members performed an activity (ie - had a record in the activities table) who did not have an activity in the previous 12 months. I believe lead/lag would be helpful here, but I'm having trouble wrapping my brain around it.
(I've tagged both Apache Hadoop and MS SQL Server here because I could potentially do this in either, and I think I could translate a solution for one to the other pretty easily).
Any help appreciated!
Thank you!
Upvotes: 0
Views: 2874
Reputation: 4100
When using the LAG function, we need to create a single record per member and month first, use the LAG function to get the preious activity month and finally use a where clause to only get what we want:
DECLARE
@year int = 2018,
@month int = 7;
WITH
monthwise (MemberID, FirstOfMonth) AS (
SELECT DISTINCT MemberID, DATEADD(month, DATEDIFF(month, 0, ActivityDate), 0)
FROM Activities
),
prevActivity (MemberID, FirstOfMonth, prevFirstOfMonth) AS (
SELECT MemberID, FirstOfMonth
, LAG(FirstOfMonth) OVER (PARTITION BY MemberID ORDER BY FirstOfMonth)
FROM monthwise
)
SELECT MemberID
FROM prevActivity
WHERE MONTH(FirstOfMonth) = @month
AND YEAR(FirstOfMonth) = @year
AND (prevFirstOfMonth IS NULL OR DATEDIFF(month, prevFirstOfMonth, FirstOfMonth) > 12)
You can also do this without the LAG function: Use two queries, one for the members that have an activity in this month, one for the members that have an activity in the previous twelve months. Then use inner and left joins to find the members with activities in this month without activities in the previous months:
DECLARE
@year int = 2018,
@month int = 7;
WITH
this (MemberID) AS (
SELECT DISTINCT MemberID
FROM Activities
WHERE YEAR(ActivityDate) = @year
AND MONTH(ActivityDate) = @month
),
prev (MemberID) AS (
SELECT DISTINCT MemberID
FROM Activities
WHERE ActivityDate < DATEADD(month, @month-1 +12*(@year-1900), 0)
AND ActivityDate >= DATEADD(month, @month-1 +12*(@year-1901), 0)
)
SELECT m.MemberID
FROM Members m
INNER JOIN this ON m.MemberID = this.MemberID
LEFT JOIN prev ON m.MemberID = prev.MemberID
WHERE prev.MemberID IS NULL
Upvotes: 2
Reputation: 1271151
You can do this using lag()
:
select year(ts), month(ts),
(count(distinct memberid) -
count(distinct case when prev_ts > dateadd(year, -1, ts) then memberid)
) as
from (select memberid,
lag(ts) over (partition by memberid order by ts) as prev_ts
from activities a
) a
group by year(ts), month(ts);
Upvotes: -1