Reputation: 267
I have a DB table:
User_id | User Name | Join_date | Activity_date |
1 abc 01/01/2021 02-01-2021
2 jay 01/01/2021 03-01-2021
2 jay 01/01/2021 04-01-2021
1 abc 01/01/2021 09-01-2021
1 abc 01/01/2021 16-01-2021
2 jay 01/01/2021 05-01-2021
3 xyz 03/03/2021 12-03-2021
3 xyz 03/03/2021 30-03-2021
2 jay 01/01/2021 26-01-2021
I want to bucketize the user based on their Activity_date
, as in users who do the activity every next day into table1, users who do activity weekly(gap of 6-7 days) into table2, all the other users into table3.
The output
something like this:
temporary_table1
: (people who do an activity every next continues day)
User_id | User Name | Join_date | Activity_date |
2 jay 01/01/2021 03-01-2021
2 jay 01/01/2021 04-01-2021
2 jay 01/01/2021 05-01-2021
temporary_table_2
: (people who do activist every 6-7 days from their first Activity_date)
User_id | User Name | Join_date | Activity_date |
1 abc 01/01/2021 02-01-2021
1 abc 01/01/2021 09-01-2021
1 abc 01/01/2021 15-01-2021
temporary_table_3
: (Adding user_id =2 here because from the last activity done by this user to the recent activity done there is more than seven days of gap)
User_id | User Name | Join_date | Activity_date |
3 xyz 03/03/2021 12-03-2021
3 xyz 03/03/2021 30-03-2021
2 jay 01/01/2021 26-01-2021
How to achieve this in SQL (Redshift)?
Upvotes: 0
Views: 59
Reputation: 26
You should be able to do something like this and then you can make your grouping on the date_diff:
SELECT
user_id,
user_name,
join_date
activity_date,
#window function
LAG(activity_date) OVER (PARTITION BY user_id ORDER BY activity_date ASC) as day_before,
#difference in days between activity_date and day_before
DATEDIFF(
day,
LAG(activity_date) OVER (PARTITION BY user_id ORDER BY activity_date ASC),
activity_date
) as date_diff
FROM your_dataset
Upvotes: 1