sql_learner
sql_learner

Reputation: 547

Get next purchase date for specific set of users

I have subscriptions table shown as below

+---------+-------------+------------+------------+
| user_id | purchase_id | start_date |  end_date  |
+---------+-------------+------------+------------+
|       1 | 1A          | 2019-01-12 | 2019-01-22 |
|       1 | 1B          | 2019-01-24 | 2019-02-03 |
|       1 | 1C          | 2019-02-06 | 2019-02-16 |
|       1 | 1D          | 2019-03-12 | 2019-03-22 |
|       1 | 1E          | 2019-04-01 | 2019-04-11 |
|       2 | 2A          | 2019-02-12 | 2019-02-22 |
|       2 | 2B          | 2019-02-23 | 2019-03-05 |
|       2 | 2C          | 2019-03-06 | 2019-03-16 |
|       2 | 2D          | 2019-03-18 | 2019-03-28 |
|       3 | 3A          | 2019-02-12 | 2019-02-22 |
|       3 | 3B          | 2019-02-23 | 2019-03-05 |
|       3 | 3C          | 2019-03-06 | 2019-03-16 |
|       3 | 3D          | 2019-03-18 | 2019-03-28 |
|       4 | 4A          | 2019-04-10 | 2019-04-20 |
|       4 | 4B          | 2019-04-25 | 2019-05-05 |
|       4 | 4C          | 2019-05-13 | 2019-05-23 |
+---------+-------------+------------+------------+

It has information about when a user(user_id) bought particular subscription(start_date). Each purchase has a purchase_id associated with it.

I want to know what is the start_date of the next subscription that user has bought, only for certain set of purchase_id.These purchase_id are in the below table

+---------+-------------+------------+------------+
| user_id | purchase_id | start_date |  end_date  |
+---------+-------------+------------+------------+
|       1 | 1B          | 2019-01-24 | 2019-02-03 |
|       2 | 2C          | 2019-03-06 | 2019-03-16 |
|       3 | 3D          | 2019-03-18 | 2019-03-28 |
+---------+-------------+------------+------------+

So for user_id = 1, I want to know what is the next start_date. But that start_date should be of the next purchase made after purchase_id = 1B (It is 1C in this case)

Similarly for user_id = 2, I want to know what is the next start_date. But that start_date should be of the next purchase made after purchase_id = 2C (It is 2D in this case)

My desired result is as shown below

+---------+-------------+------------+------------+--------------------+------------------+
| user_id | purchase_id | start_date |  end_date  | Next_Purchase_date | Next_purchase_id |
+---------+-------------+------------+------------+--------------------+------------------+
|       1 | 1B          | 2019-01-24 | 2019-02-03 | 2019-02-06         | 1C               |
|       2 | 2C          | 2019-03-06 | 2019-03-16 | 2019-03-18         | 2D               |
|       3 | 3D          | 2019-03-18 | 2019-03-28 | NULL               | NULL             |
+---------+-------------+------------+------------+--------------------+------------------+

I was able to get the next purchase dates with below code

SELECT A.*, 
LEAD(start_date) OVER(PARTITION BY user_id ORDER BY start_date) as Next_Purchase_date,
LEAD(purchase_id) OVER(PARTITION BY user_id ORDER BY start_date) as Next_Purchase_id
FROM Table1 A

But how can I get next_purchase_date only for certain purchase_ids ?

SQL Fiddle

Upvotes: 1

Views: 546

Answers (2)

GMB
GMB

Reputation: 222582

Assuming that you do have a second table, say table2 that stores the records for which you want to see the next purchases, you could compute the leads in a subquery, and then join it with the second table:

    select t1.*
    from (
        select
            t1.*, 
            lead(start_date) over(partition by user_id order by start_date) next_purchase_date,
            lead(purchase_id) over(partition by user_id order by start_date) next_purchase_id
        from table1 t1
    ) t1
    inner join table2 t2
        on t1.user_id = t2.user_id
        and t1.purchase_id = t2.purchase_id
        and t1.start_date = t2.start_date
        and t1.end_date = t2.end_date

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270523

If I understand correctly, you want to filter the LEAD(). But if you add a WHERE clause, it will impact the window function.

So, use a subquery and filter after calculating the LEAD():

SELECT A.*
FROM (SELECT A.*, 
             LEAD(start_date) OVER(PARTITION BY user_id ORDER BY start_date) as Next_Purchase_date,
             LEAD(purchase_id) OVER(PARTITION BY user_id ORDER BY start_date) as Next_Purchase_id
      FROM Table1 A
     ) a JOIN
     B b
     ON b.user_id = a.user_id AND
        b.purchase_id = a.purchase_id;

Note: you might actually want b left join a in this case, if there is a possibility that some rows in b don't have corresponding rows in a.

Upvotes: 1

Related Questions