Nikhil S
Nikhil S

Reputation: 65

Find if a condition stays true between a range of dates

I have a "action" table:: id, type, status, created, live_at, expired_at

+-----------+-----------+------------+-------------------------------+
| id        | type      | status    | created | live_at | expired_at |
| (uuid)    | (string)  | (string)  | (date)  | (date)  |  (date)    |
+-----------+-----------+------------+-------------------------------+

Example rows are:

  10f1dc79-61b7-46a4-ad66-55e2a68b7148 | FACEBOOK_SOCIAL_SHARE | EXPIRED | 2019-06-21 11:28:31 | 2019-07-21 11:28:36 | 2019-10-02 11:40:27 
  3e59ccd4-a795-4e74-b841-4da1e57fb51f | FACEBOOK_SOCIAL_SHARE | LIVE    | 2019-10-04 18:25:57 | 2019-10-04 18:25:57 | NULL                

I have to run a query where I get all live actions grouped by type and month.

Example Result:

TYPE                    MONTH LIVE
FACEBOOK_SOCIAL_SHARE     7     1
FACEBOOK_SOCIAL_SHARE     8     1
FACEBOOK_SOCIAL_SHARE     9     5
FACEBOOK_SOCIAL_SHARE     10    9

The problem is if an action went live in month 8 and expired some day in month of 10, then the query should count that action as live in the month 8, 9 and 10 also.

I have a query but it would only count this action as live in the month of 8!

SELECT TYPE, EXTRACT(MONTH FROM action.live_at) AS month, count(distinct(action.id)) AS live
FROM "action" AS action
WHERE action.live_at IS NOT NULL
GROUP BY TYPE, EXTRACT(MONTH FROM action.live_at)

Any help would be really appreciated.

Upvotes: 0

Views: 225

Answers (1)

user330315
user330315

Reputation:

As far as I understand your question, I think the following will do what you want:

with actions as (
  select id, type, 
         array(select extract(month from x.dt)::int
               from generate_series(date_trunc('month', live_at), 
                                    date_trunc('month', coalesce(expired_at, current_timestamp)) + interval '1 month' - interval '1 day', 
                                    interval '1 month') as x(dt)) as months_live
  from  action
)
select m.month, type, count(distinct a.id)
from generate_series(1,12) as m(month)
  left join actions a on m.month = any(a.months_live)
group by m.month, type;

The CTE generates an array of all months for each row in the action table. So for your two sample rows this would return

id                                   | months_live
-------------------------------------+------------
10f1dc79-61b7-46a4-ad66-55e2a68b7148 | {7,8,9,10} 
3e59ccd4-a795-4e74-b841-4da1e57fb51f | {10,11}    

The expression date_trunc('month', coalesce(expired_at, current_timestamp)) + interval '1 month' - interval '1 day' yields the last day of the month that expired_at contains. That's necessary so that generate_series() includes that month as well.

I don't now how a null value in the expired_at column should be treated - the above expressions simply uses "today" then.

The outer query then does an outer join between a list of 12 months and the actions - as the join condition is based on the array, a row from the actions table is repeated several times because the join condition matches multiple times.

The outer join (without grouping) returns the following rows (based on your two sample rows and today being a day in November):

month | type                  | id                                  
------+-----------------------+-------------------------------------
    1 |                       |                                     
    2 |                       |                                     
    3 |                       |                                     
    4 |                       |                                     
    5 |                       |                                     
    6 |                       |                                     
    7 | FACEBOOK_SOCIAL_SHARE | 10f1dc79-61b7-46a4-ad66-55e2a68b7148
    8 | FACEBOOK_SOCIAL_SHARE | 10f1dc79-61b7-46a4-ad66-55e2a68b7148
    9 | FACEBOOK_SOCIAL_SHARE | 10f1dc79-61b7-46a4-ad66-55e2a68b7148
   10 | FACEBOOK_SOCIAL_SHARE | 10f1dc79-61b7-46a4-ad66-55e2a68b7148
   10 | FACEBOOK_SOCIAL_SHARE | 3e59ccd4-a795-4e74-b841-4da1e57fb51f
   11 | FACEBOOK_SOCIAL_SHARE | 3e59ccd4-a795-4e74-b841-4da1e57fb51f
   12 |                       |                                     

This result is then grouped by month and type to be able to count the IDs.

So your two sample rows would return:

month | type                  | count
------+-----------------------+------
    1 |                       |     0
    2 |                       |     0
    3 |                       |     0
    4 |                       |     0
    5 |                       |     0
    6 |                       |     0
    7 | FACEBOOK_SOCIAL_SHARE |     1
    8 | FACEBOOK_SOCIAL_SHARE |     1
    9 | FACEBOOK_SOCIAL_SHARE |     1
   10 | FACEBOOK_SOCIAL_SHARE |     2
   11 | FACEBOOK_SOCIAL_SHARE |     1
   12 |                       |     0

Online example: https://rextester.com/NYUV51842

If you need that very often, consider writing a function:

create or replace function get_month_list(p_start timestamp, p_end timestamp)  
  returns int[]
as
$$
 select array(select extract(month from x.dt)::int
              from generate_series(date_trunc('month', p_start), 
                            date_trunc('month', coalesce(p_end, current_timestamp)) + interval '1 month' - interval '1 day', 
                            interval '1 month') as x(dt));
$$ 
language sql
immutable;

Then the query is a bit easier to read:

select m.month, type, count(distinct a.id)
from generate_series(1,12) as m(month)
  left join action a on m.month = any(get_month_list(a.live_at, a.expired_at))
group by m.month, type;       

Upvotes: 1

Related Questions