psrpsrpsr
psrpsrpsr

Reputation: 457

How can I return rows that meet criteria for occurring in one day, but over a date range?

I have a query (shown below) that returns all rows for UserID that have :

  1. a JOIN,
  2. a subsequent CANCEL, and then
  3. a subsequent JOIN

But: I need to return UserIDs that meet this criteria of having a JOIN,CANCEL, then JOIN in sequence ON THE SAME DAY, but for a date range: for example BETWEEN 2016-11-01 and 2016-11-30. So in the example table below, UserIDs 12345, 9876, and 33445 would be returned.

I'm not sure how this is achieved - would be involve some sort of grouping on the timestamp date? Would a stored procedure that iterates over conditional tests for UserID and ActionType be a viable solution?

+--------+--------+----------------------+------------+------------------+
| rownum | UserID |      Timestamp       | ActionType | Return in query? |
+--------+--------+----------------------+------------+------------------+
|      1 |  12345 | 2016-11-01 08:25:39  | JOIN       | yes              |
|      2 |  12345 | 2016-11-01 08:27:00  | NULL       | yes              |
|      3 |  12345 | 2016-11-01 08:28:20  | DOWNGRADE  | yes              |
|      4 |  12345 | 2016-11-01 08:31:34  | NULL       | yes              |
|      5 |  12345 | 2016-11-01 08:32:44  | CANCEL     | yes              |
|      6 |  12345 | 2016-11-01 08:45:51  | NULL       | yes              |
|      7 |  12345 | 2016-11-01 08:50:57  | JOIN       | yes              |

|      1 |   9876 | 2016-11-01 16:05:42  | JOIN       | yes              |
|      2 |   9876 | 2016-11-01 16:07:33  | CANCEL     | yes              |
|      3 |   9876 | 2016-11-01 16:09:09  | JOIN       | yes              |

|      1 |  56565 | 2016-11-01 18:15:16  | JOIN       | no               |
|      2 |  56565 | 2016-11-01 19:22:25  | CANCEL     | no               |
|      3 |  56565 | 2016-11-01 20:05:05  | CANCEL     | no               |

|      1 |  34343 | 2016-11-01 05:32:56  | JOIN       | no               |

                        NEXT DAY

|      1 |   7878 | 2016-11-02 10:05:04  | JOIN       | no               |
|      2 |   7878 | 2016-11-02 10:06:06  | JOIN       | no               |

|      1 |  33445 | 2016-11-02 02:33:34  | JOIN       | yes              |
|      2 |  33445 | 2016-11-02 02:33:34  | NULL       | yes              |
|      3 |  33445 | 2016-11-02 02:37:56  | CANCEL     | yes              |
|      4 |  33445 | 2016-11-02 02:38:01  | JOIN       | yes              |
+--------+--------+----------------------+------------+------------------+

Here is a link to the question which led me to the query that pulls data for exactly one day (not a range): How can I return rows that meet a specific sequence of events?

Here is the query:

SELECT *
  FROM T
 WHERE USERID IN (
    select distinct userid
      from            t first_join
           inner join t cancel
                   on first_join.tmstmp < cancel.tmstp
                  and first_join.userid = cancel.userid
           inner join t.second_join
                   on second_join.tmstmp > cancel.tmstp
                  and second_join.userid = cancel.userid
     where first_join.actiontype = 'JOIN'
       and cancel.actiontype = 'CANCEL'
       and second_join.actiontype = 'JOIN'
     )

Clarification of comments/questions: vkp:

  1. QUESTION: can join,cancel,join be on different days with other values in between? ANSWER: No, I need to find the join>cancel>join that occur in one day only. If there is a join on 11/1 and a cancel on 11/2, that UserID does not need to be returned.
  2. QUESTION: if a particular date satisfies cancel,join,cancel in the date range, will that be enough for a user to be included in the results? ANSWER: No, I am specifically looking at rows that meet the ActionType sequence in one day, not over a range of days.

THANK YOU!

Upvotes: 0

Views: 79

Answers (1)

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

To get all the users and the days when they have the specified sequence of events happen, use

select distinct userid,tmstamp::date
from t
where ActionType = 'CANCEL' and tmstamp::date between date '2016-11-01' and date '2016-11-30' and
      exists (select 1
              from t t2
              where t2.userId = t.userId and
                    t2.actiontype = 'JOIN' and
                    t2.tmstamp < t.tmstamp and 
                    t2.tmstamp::date = t.tmstamp::date
             ) and
      exists (select 1
              from t t3
              where t3.userId = t.userId and
                    t3.actiontype = 'JOIN' and
                    t3.tmstamp > t.tmstamp and
                    t3.tmstamp::date = t.tmstamp::date
             )

To get all the rows for such users on those days, wrap the previous query as a subquery against the original table.

select * from t where (userid,tmstamp::date) in (
select distinct userid,tmstamp::date
from t
where ActionType = 'CANCEL' 
and tmstamp::date between date '2016-11-01' and date '2016-11-30' and
      exists (select 1
              from t t2
              where t2.userId = t.userId and
                    t2.actiontype = 'JOIN' and
                    t2.tmstamp < t.tmstamp and 
                    t2.tmstamp::date = t.tmstamp::date
             ) and
      exists (select 1
              from t t3
              where t3.userId = t.userId and
                    t3.actiontype = 'JOIN' and
                    t3.tmstamp > t.tmstamp and
                    t3.tmstamp::date = t.tmstamp::date
             )
    )

Sample Demo

Note that this is a minor tweak to @Gordon's query (to check for these sequence of events on a particular day) in your previous question which i felt was the best.

Edit: An alternate approach with window functions

select * from t where (userid,tmstamp::date) in (
    select distinct userid,tmstamp::date from (
    select t.*
    ,min(case when actiontype = 'JOIN' then 1 else 2 end) over(partition by t.userid,t.tmstamp::date order by t.tmstamp rows between unbounded preceding and 1 preceding) min_before
    ,min(case when actiontype = 'JOIN' then 1 else 2 end) over(partition by t.userid,t.tmstamp::date order by t.tmstamp rows between 1 following and unbounded following) min_after
    from (select userid,tmstamp from t where actiontype='CANCEL') tc
    join t on t.userid=tc.userid and t.tmstamp::date=tc.tmstamp::date
    ) x 
    where min_before=1 and min_after=1
        ) 

1) Using a case expression we designate all actiontype JOIN rows as 1 and 2 for all other actiontypes.

2) We join it with the actiontype CANCEL rows.

3) Then we check for the minimum value before CANCEL and minimum value after CANCEL for each date and userid combination. Per the case expression defined, it should be 1.

4) Get all such dates and userid's and fetch the corresponding rows.

Upvotes: 1

Related Questions