Reputation: 457
I have a query (shown below) that returns all rows for UserID that have :
But: I need to return UserIDs that meet this criteria of having a JOIN,CANCEL, then JOIN in sequence ON THE SAME DAY, but for a date range: for example BETWEEN 2016-11-01 and 2016-11-30. So in the example table below, UserIDs 12345, 9876, and 33445 would be returned.
I'm not sure how this is achieved - would be involve some sort of grouping on the timestamp date? Would a stored procedure that iterates over conditional tests for UserID and ActionType be a viable solution?
+--------+--------+----------------------+------------+------------------+
| rownum | UserID | Timestamp | ActionType | Return in query? |
+--------+--------+----------------------+------------+------------------+
| 1 | 12345 | 2016-11-01 08:25:39 | JOIN | yes |
| 2 | 12345 | 2016-11-01 08:27:00 | NULL | yes |
| 3 | 12345 | 2016-11-01 08:28:20 | DOWNGRADE | yes |
| 4 | 12345 | 2016-11-01 08:31:34 | NULL | yes |
| 5 | 12345 | 2016-11-01 08:32:44 | CANCEL | yes |
| 6 | 12345 | 2016-11-01 08:45:51 | NULL | yes |
| 7 | 12345 | 2016-11-01 08:50:57 | JOIN | yes |
| 1 | 9876 | 2016-11-01 16:05:42 | JOIN | yes |
| 2 | 9876 | 2016-11-01 16:07:33 | CANCEL | yes |
| 3 | 9876 | 2016-11-01 16:09:09 | JOIN | yes |
| 1 | 56565 | 2016-11-01 18:15:16 | JOIN | no |
| 2 | 56565 | 2016-11-01 19:22:25 | CANCEL | no |
| 3 | 56565 | 2016-11-01 20:05:05 | CANCEL | no |
| 1 | 34343 | 2016-11-01 05:32:56 | JOIN | no |
NEXT DAY
| 1 | 7878 | 2016-11-02 10:05:04 | JOIN | no |
| 2 | 7878 | 2016-11-02 10:06:06 | JOIN | no |
| 1 | 33445 | 2016-11-02 02:33:34 | JOIN | yes |
| 2 | 33445 | 2016-11-02 02:33:34 | NULL | yes |
| 3 | 33445 | 2016-11-02 02:37:56 | CANCEL | yes |
| 4 | 33445 | 2016-11-02 02:38:01 | JOIN | yes |
+--------+--------+----------------------+------------+------------------+
Here is a link to the question which led me to the query that pulls data for exactly one day (not a range): How can I return rows that meet a specific sequence of events?
Here is the query:
SELECT *
FROM T
WHERE USERID IN (
select distinct userid
from t first_join
inner join t cancel
on first_join.tmstmp < cancel.tmstp
and first_join.userid = cancel.userid
inner join t.second_join
on second_join.tmstmp > cancel.tmstp
and second_join.userid = cancel.userid
where first_join.actiontype = 'JOIN'
and cancel.actiontype = 'CANCEL'
and second_join.actiontype = 'JOIN'
)
Clarification of comments/questions: vkp:
THANK YOU!
Upvotes: 0
Views: 79
Reputation: 49260
To get all the users and the days when they have the specified sequence of events happen, use
select distinct userid,tmstamp::date
from t
where ActionType = 'CANCEL' and tmstamp::date between date '2016-11-01' and date '2016-11-30' and
exists (select 1
from t t2
where t2.userId = t.userId and
t2.actiontype = 'JOIN' and
t2.tmstamp < t.tmstamp and
t2.tmstamp::date = t.tmstamp::date
) and
exists (select 1
from t t3
where t3.userId = t.userId and
t3.actiontype = 'JOIN' and
t3.tmstamp > t.tmstamp and
t3.tmstamp::date = t.tmstamp::date
)
To get all the rows for such users on those days, wrap the previous query as a subquery against the original table.
select * from t where (userid,tmstamp::date) in (
select distinct userid,tmstamp::date
from t
where ActionType = 'CANCEL'
and tmstamp::date between date '2016-11-01' and date '2016-11-30' and
exists (select 1
from t t2
where t2.userId = t.userId and
t2.actiontype = 'JOIN' and
t2.tmstamp < t.tmstamp and
t2.tmstamp::date = t.tmstamp::date
) and
exists (select 1
from t t3
where t3.userId = t.userId and
t3.actiontype = 'JOIN' and
t3.tmstamp > t.tmstamp and
t3.tmstamp::date = t.tmstamp::date
)
)
Note that this is a minor tweak to @Gordon's query (to check for these sequence of events on a particular day) in your previous question which i felt was the best.
Edit: An alternate approach with window functions
select * from t where (userid,tmstamp::date) in (
select distinct userid,tmstamp::date from (
select t.*
,min(case when actiontype = 'JOIN' then 1 else 2 end) over(partition by t.userid,t.tmstamp::date order by t.tmstamp rows between unbounded preceding and 1 preceding) min_before
,min(case when actiontype = 'JOIN' then 1 else 2 end) over(partition by t.userid,t.tmstamp::date order by t.tmstamp rows between 1 following and unbounded following) min_after
from (select userid,tmstamp from t where actiontype='CANCEL') tc
join t on t.userid=tc.userid and t.tmstamp::date=tc.tmstamp::date
) x
where min_before=1 and min_after=1
)
1) Using a case
expression we designate all actiontype JOIN
rows as 1 and 2 for all other actiontypes.
2) We join
it with the actiontype CANCEL
rows.
3) Then we check for the minimum value before CANCEL
and minimum value after CANCEL
for each date and userid combination. Per the case
expression defined, it should be 1.
4) Get all such dates and userid's and fetch the corresponding rows.
Upvotes: 1