Reputation: 9008
I have table users with the following values:
id | created_at
-------+---------------------
20127 | 2015-01-31 04:23:46
21468 | 2015-02-04 07:50:34
21571 | 2015-02-04 08:23:50
20730 | 2015-03-12 10:20:16
19955 | 2015-03-30 07:44:35
20148 | 2015-04-17 13:03:26
21552 | 2015-05-07 19:00:00
20145 | 2015-06-02 03:12:46
21467 | 2015-06-03 13:21:51
21074 | 2015-07-03 19:00:00
I want to:
I'm trying to achieve this with the following SQL:
SELECT extract(epoch from created_at)::bigint,
sum(count(id)::integer) OVER (ORDER BY created_at)
FROM data_users
WHERE created_at IS NOT NULL
GROUP BY created_at
But it's not working as expected since I can't add filtering by date here, without excluding records from the cumulative sum. Also it doesn't take into account days that have been missed (those for which the users don't exist).
Any help greatly appreciated.
Upvotes: 0
Views: 1395
Reputation: 17721
As far as I understand your question a simple query with GROUP BY should be enough. You can use a left outer join with GENERATE_SERIES()
to get all dates in the range. If you have the start and end date of the range, you can use this:
SELECT EXTRACT(EPOCH FROM d)::BIGINT, COALESCE(COUNT(u.id), 0)
FROM GENERATE_SERIES(start, end, '1 DAY'::INTERVAL) d
LEFT OUTER JOIN data_users u ON u.created_at::DATE = d
GROUP BY 1 ORDER BY 1
You can determine start
and end
from your table, too:
SELECT EXTRACT(EPOCH FROM d.date)::BIGINT, COALESCE(COUNT(u.id), 0)
FROM
(SELECT GENERATE_SERIES(MIN(created_at)::DATE, MAX(created_at)::DATE, '1 DAY'::INTERVAL) AS date
FROM data_users) d
LEFT OUTER JOIN data_users u ON u.created_at::DATE = d.date::DATE
GROUP BY 1 ORDER BY 1;
This returns:
date_part | coalesce
------------+----------
1422662400 | 1
1422748800 | 0
1422835200 | 0
1422921600 | 0
1423008000 | 2
1423094400 | 0
1423180800 | 0
...
1435536000 | 0
1435622400 | 0
1435708800 | 0
1435795200 | 0
1435881600 | 1
With this query you can get the cumulative sum for the rows before a start date:
SELECT EXTRACT(EPOCH FROM GREATEST(d.date, start))::BIGINT, COALESCE(COUNT(u.id), 0)
FROM
(SELECT GENERATE_SERIES(MIN(created_at)::DATE, MAX(created_at)::DATE, '1 DAY'::INTERVAL) AS date
FROM data_users) d
LEFT OUTER JOIN data_users u ON u.created_at::DATE = d.date::DATE
GROUP BY 1 ORDER BY 1;
Upvotes: 1