Reputation: 444
I have a leads
table that looks like below, showing which user (user_uuid
) dropped which lead(lead_id
) along with their subscription start and end date.
lead_id lead_date subs_start subs_end user_uuid subs_id
-------- ----------- ----------- ----------- ------------------ ------
25914112 20-Mar-2019 05-Mar-2019 28-Mar-2019 5447ade6-1401-44bd 79926
25811730 06-Mar-2019 05-Mar-2019 28-Mar-2019 5447ade6-1401-44bd 79926
28597464 08-Mar-2019 05-Mar-2019 28-Mar-2019 5447ade6-1401-44bd 79926
28591844 08-Mar-2019 05-Mar-2019 28-Mar-2019 5447ade6-1401-44bd 79926
25914076 15-Jun-2019 15-Jun-2019 28-Jun-2019 00013889-00ce-4eb3 99927
26006221 17-Jun-2019 15-Jun-2019 28-Jun-2019 00013889-00ce-4eb3 99927
26095282 17-Jun-2019 15-Jun-2019 28-Jun-2019 00013889-00ce-4eb3 99927
26093874 19-Jun-2019 15-Jun-2019 28-Jun-2019 00013889-00ce-4eb3 99927
26091313 27-Jun-2019 15-Jun-2019 28-Jun-2019 00013889-00ce-4eb3 99927
25112857 27-Jun-2019 15-Jun-2019 28-Jun-2019 00013889-00ce-4eb3 99927
subs_start
and subs_end
are the subscription start and end date respectively.
I am trying to find out how many leads are dropped by each user on a daily basis, in the subscription period (between subscription start date and end date),in below format.
user_uuid date leads_dropped
------------------ ----------- ------------
5447ade6-1401-44bd 05-Mar-2019 0
5447ade6-1401-44bd 06-Mar-2019 1
5447ade6-1401-44bd 07-Mar-2019 0
5447ade6-1401-44bd 08-Mar-2019 2
5447ade6-1401-44bd 09-Mar-2019 0
00013889-00ce-4eb3 15-Jun-2019 0
00013889-00ce-4eb3 16-Jun-2019 0
00013889-00ce-4eb3 17-Jun-2019 2
00013889-00ce-4eb3 18-Jun-2019 0
00013889-00ce-4eb3 19-Jun-2019 0
I've tried using the calendar table, but reached no where. Please help me find me the number of leads for the dates only with in the subscription period of every user.
WITH grouped_daily AS (
SELECT user_uuid,lead_creation_date, COUNT(lead_id) cnt
FROM leads
GROUP BY 1,2)
SELECT *
FROM (SELECT dt::date FROM generate_series(date '2019-01-01', date '2019-09-14', INTERVAL '1 day') dt) dt_tbl
LEFT JOIN grouped_daily gd
ON gd.lead_creation_date = dt_tbl.dt
Upvotes: 1
Views: 37
Reputation: 1269503
From what I understand, you need to generate separate dates for each user. So, aggregate at the user level to get the bounds. Then cross join to the dates and bring back in the leads information:
WITH u AS (
SELECT user_uuid, MIN(subs_start) as min_ss, MAX(subs_end) as max_ss
FROM leads
GROUP BY user_uuid
)
SELECT u.user_uuid, d.dt, COUNT(l.user_uuid)
FROM u CROSS JOIN LATERAL
generate_series(min_ss, max_ss, INTERVAL '1 day') d(dt) LEFT JOIN
leads l
ON l.user_uuid = u.user_uuid AND
l.lead_date = d.dt
GROUP BY u.user_uuid, d.dt;
EDIT:
A lateral join just makes the syntax more consistent with SQL -- I am not a big fan of calling table-generating functions in the SELECT
:
WITH u AS (
SELECT user_uuid, MIN(subs_start) as min_ss, MAX(subs_end) as max_ss
FROM leads
GROUP BY user_uuid
)
SELECT u.user_uuid, u.dt, COUNT(l.user_uuid)
FROM (SELECT u.*,
generate_series(min_ss, max_ss, INTERVAL '1 day') as dt
FROM u
) LEFT JOIN
leads l
ON l.user_uuid = u.user_uuid AND
l.lead_date = u.dt
GROUP BY u.user_uuid, u.dt;
Upvotes: 1