Reputation: 457
I am trying to pull records for UserIDs that meet a certain sequence of events. If a user has a JOIN, then a subsequent CANCEL, and then a subsequent JOIN, I want to return them in the result set. I need to run this query for one day at a time, or several days at a time, as needed.
The table below shows examples of UserIDs that meet and do not meet the sequence.
+--------+--------+---------------------+------------+------------------+
| rownum | UserID | Timestamp | ActionType | Return in query? |
+--------+--------+---------------------+------------+------------------+
| 1 | 12345 | 2016-11-01 08:25:39 | JOIN | yes |
| 2 | 12345 | 2016-11-01 08:27:00 | NULL | yes |
| 3 | 12345 | 2016-11-01 08:28:20 | DOWNGRADE | yes |
| 4 | 12345 | 2016-11-01 08:31:34 | NULL | yes |
| 5 | 12345 | 2016-11-01 08:32:44 | CANCEL | yes |
| 6 | 12345 | 2016-11-01 08:45:51 | NULL | yes |
| 7 | 12345 | 2016-11-01 08:50:57 | JOIN | yes |
| 1 | 9876 | 2016-11-01 16:05:42 | JOIN | yes |
| 2 | 9876 | 2016-11-01 16:07:33 | CANCEL | yes |
| 3 | 9876 | 2016-11-01 16:09:09 | JOIN | yes |
| 1 | 56565 | 2016-11-01 18:15:16 | JOIN | no |
| 2 | 56565 | 2016-11-01 19:22:25 | CANCEL | no |
| 3 | 56565 | 2016-11-01 20:05:05 | CANCEL | no |
| 1 | 34343 | 2016-11-01 05:32:56 | JOIN | no |
+--------+--------+---------------------+------------+------------------+
I have read up on gaps and islands, and looked at all sorts of complicated forum posts that dance around what I'm trying to achieve.
Currently, all I'm able to do is look at one day's worth of records, with no constraint on the sequence logic that I need:
SELECT
ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY tmsmp) rownum
,UserID
,tmstmp
,ActionType
FROM
t
INNER JOIN (
SELECT UserID
FROM t
WHERE tmstmp BETWEEN '2016-11-20 00:00:01' AND '2016-11-20 11:59:59'
GROUP BY UserID
HAVING COUNT(*) >= 2
) AS sub ON t1.UserID = sub.UserID
Thank you for your input!
Upvotes: 3
Views: 857
Reputation: 40491
You can use LEAD()
:
SELECT * FROM (
SELECT t.* ,
LAG(t.ActionType,1) OVER(PARTITION BY t.userid ORDER BY t.timestamp) AS LAST_ACTION,
LAG(t.ActionType,2) OVER(PARTITION BY t.userid ORDER BY t.timestamp) AS LAST_ACTION2,
LEAD(t.ActionType,1) OVER(PARTITION BY t.userid ORDER BY t.timestamp) AS NEXT_Action,
LEAD(t.ActionType,2) OVER(PARTITION BY t.userid ORDER BY t.timestamp) AS NEXT_Action2
FROM YourTable t
WHERE tmstmp BETWEEN <Start> AND <End>)
WHERE (t.actionType = 'JOIN' AND
t.NEXT_Action = 'Cancel' AND
t.NEXT_Action2 = 'JOIN')
OR (t.LAST_ACTION= 'JOIN' AND
t.actionType= 'Cancel' AND
t.NEXT_Action = 'JOIN')
OR (t.LAST_ACTION2= 'JOIN' AND
t.LAST_ACTION = 'Cancel' AND
t.actionType= 'JOIN')
Upvotes: 4
Reputation: 45819
In my sample queries I'll do the best I can with the information you've given, but you're a little unclear about what the source table(s) look like. You show one table above (with no name), but then reference two different tables in your sample query... a little hard to see what's going on.
So I'll assume a single table, named t
, and you can adjust as needed...
Then how I would handle this, is first identify the users
select distinct userid
from t first_join
inner join t cancel
on first_join.tmstmp < cancel.tmstp
and first_join.userid = cancel.userid
inner join t.second_join
on second_join.tmstmp > cancel.tmstp
and second_join.userid = cancel.userid
where first_join.actiontype = 'JOIN'
and cancel.actiontype = 'CANCEL'
and second_join.actiontype = 'JOIN'
So now you can get all records for those users
SELECT *
FROM T
WHERE USERID IN (
select distinct userid
from t first_join
inner join t cancel
on first_join.tmstmp < cancel.tmstp
and first_join.userid = cancel.userid
inner join t.second_join
on second_join.tmstmp > cancel.tmstp
and second_join.userid = cancel.userid
where first_join.actiontype = 'JOIN'
and cancel.actiontype = 'CANCEL'
and second_join.actiontype = 'JOIN'
)
Upvotes: 1
Reputation: 1270993
Assuming you mean that the records are in order with no gaps, just use lag()
, lead()
or a combination:
select distinct userId
from (select t.*,
lag(ActionType) over (partition by userId order by tmstamp) as prev_at,
lead(ActionType) over (partition by userId order by tmstamp) as next_at,
from t
) t
where ActionType = 'Cancel' and prev_at = 'Join' and next_at = 'Join';
If gaps are allowed, then you can do this differently:
select distint userid
from t
where ActionType = 'Cancel' and
exists (select 1
from t t2
where t2.userId = t.userId and
t2.at = 'Join' and
t2.tmstamp < t.tmstamp
) and
exists (select 1
from t t2
where t2.userId = t.userId and
t2.at = 'Join' and
t2.tmstamp > t.tmstamp
);
Upvotes: 1