siva
siva

Reputation: 629

Athena/SQL query to get the desired result

sample_input_table

user  name  action      date
 1    aaa    view      2020-09-03
 2    bbb    view      2020-09-02
 3    ccc    view      2020-08-28
 4    ddd    view      2020-08-25
 1    aaa    purchase  2020-09-09

I have a table with huge number of rows, the table looks like above.

question

  1. i want to print the rows which have purchase action and
  2. at the same time, the user who did purchase must have row with view action
  3. and at the same time, that view action will be in the date range of purchase_date(2020-09-09) and purchase_date - 7days(2020-09-02).

I want to achieve these 3 point in one sql query

sample_output

user  name  action      date
1    aaa    purchase  2020-09-09

if we see sample output from the sample input

  1. our end result have only purchase_events
  2. purchased_user had a row with view action
  3. and that view was there in the timeframe of 2020-09-09 and 2020-09-02(purchased_date, purchased_date - 7 days)

Can anyone suggest some solution for this?

Upvotes: 2

Views: 385

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269773

You can use window functions. Assuming "purchase" is the last state:

select t.*
from (select t.*,
             max(case when action = 'purchase' then date end) over (partition by user) as purchase_date,
             max(case when action = 'view' then date end) over (partition by user) as max_view_date             
      from t
     ) t
where action = 'purchase' and
      max_view_date >= purchase_date - interval '7 day';

Upvotes: 2

GMB
GMB

Reputation: 222462

You can use exists:

select t.*
from mytable t
where t.action = 'purchase' and exists (
    select 1
    from mytable t1
    where 
        t1.user = t.user 
        and t1.action = 'view'
        and t1.date >= t.date - interval '7' day
        and t1.date < t.date
    )

Upvotes: 2

Related Questions