Reputation: 325
I have the following query that works fine, but it is giving me counts for a single, whole day (00:00 to 23:59 UTC). For example, it's giving me counts for all of January 1 2017 (00:00 to 23:59 UTC).
My dataset lends itself to be queried from 12:00 UTC to 12:00 UTC. For example, I'm looking for all counts from Jan 1 2017 12:00 UTC to Jan 2 2017 12:00 UTC.
Here is my query:
SELECT count(DISTINCT ltg_data.lat), cwa, to_char(time, 'MM/DD/YYYY')
FROM counties
JOIN ltg_data on ST_contains(counties.the_geom, ltg_data.ltg_geom)
WHERE cwa = 'MFR'
AND time BETWEEN '1987-06-01'
AND '1992-08-1'
GROUP BY cwa, to_char(time, 'MM/DD/YYYY');
FYI...I'm changing the format of the time so I can use the results more readily in javascript.
And a description of the dataset. It's thousands of point data that occurs within various polygons every second. I'm determining if the points are occurring withing the polygon "cwa = MFR" and then counting them.
Thanks for any help!
Upvotes: 0
Views: 41
Reputation: 51609
I see two approaches here.
first, join generate_series(start_date::timestamp,end_date,'12 hours'::interval)
to get count in those generate_series
. this would be more correct I believe. But it has a major minus - you have to lateral join
it against existing data set to use min(time)
and max(time)
...
second, a monkey hack itself, but much less coding and less querying. Use different time zone to make 12:00 a start of the day, eg (you did not give the sample, so I generate content of counties with generate_series
with 2 hours interval as sample data):
t=# with counties as (select generate_series('2017-09-01'::timestamptz,'2017-09-04'::timestamptz,'2 hours'::interval)
g)
select count(1),to_char(g,'MM/DD/YYYY') from counties
group by to_char(g,'MM/DD/YYYY')
order by 2;
count | to_char
-------+------------
12 | 09/01/2017
12 | 09/02/2017
12 | 09/03/2017
1 | 09/04/2017
(4 rows)
so for UTC
time zone there are 12 two hours interval rows for days above, due to inclusive nature of generate_series
in my sample, 1 row for last days. in general: 37 rows.
Now a monkey hack:
t=# with counties as (select generate_series('2017-09-01'::timestamptz,'2017-09-04'::timestamptz,'2 hours'::interval)
g)
select count(1),to_char(g at time zone 'utc+12','MM/DD/YYYY') from counties
group by to_char(g at time zone 'utc+12','MM/DD/YYYY')
order by 2;
count | to_char
-------+------------
6 | 08/31/2017
12 | 09/01/2017
12 | 09/02/2017
7 | 09/03/2017
(4 rows)
I select same dates for different time zone, switching it exactly 12 hours, getting first day starting at 31 Aug middday, not 1 Sep midnight, and the count changes, still totalling 37 rows, but grouping your requested way...
update
for your query I'd try smth like:
SELECT count(DISTINCT ltg_data.lat), cwa, to_char(time at time zone 'utc+12', 'MM/DD/YYYY')
FROM counties
JOIN ltg_data on ST_contains(counties.the_geom, ltg_data.ltg_geom)
WHERE cwa = 'MFR'
AND time BETWEEN '1987-06-01'
AND '1992-08-1'
GROUP BY cwa, to_char(time at time zone 'utc+12', 'MM/DD/YYYY');
also if you want to apply +12 hours logic to where clause - add at time zone 'utc+12'
to "time" comparison as well
Upvotes: 1