Reputation: 109
I've created a query that returns a counts the number of rows (records) for the last 31 days (based on a timestamp
field) and include the previous 31 days before that period as well, eg. produce a query that returns both. I now have the following query:
SELECT
COUNT(*) OVER(ORDER BY datetime DESC RANGE BETWEEN 2678400000 PRECEDING AND CURRENT ROW) AS rolling_avg_31_days,
COUNT(*) OVER(ORDER BY datetime DESC RANGE BETWEEN 5356800000 PRECEDING AND CURRENT ROW) AS rolling_avg_62_days
FROM `p`
ORDER BY rolling_avg_31_days DESC LIMIT 1
And it returns some data, but not really the data I was hoping for:
rolling_avg_31_days | rolling_avg_62_days
8,422,783 | 9,790,304
If I query the same table with (rolling 62 days):
SELECT COUNT(*) FROM `p`
WHERE datetime > UNIX_MILLIS(CURRENT_TIMESTAMP)-5356800000 AND datetime < UNIX_MILLIS(CURRENT_TIMESTAMP)-2678400000'
I get a value of 6,192,920
I'm not sure what I'm doing wrong. Any help is much appreciated!
Upvotes: 2
Views: 114
Reputation: 109
Instead of going with the above, I've decided to change the query to be a bit simpler:
SELECT
(SELECT COUNT(DISTINCT(wasabi_user_id)) FROM `p` WHERE datetime > UNIX_MILLIS(CURRENT_TIMESTAMP)-5356800000 AND datetime < UNIX_MILLIS(CURRENT_TIMESTAMP)-2678400000) as _62days,
(SELECT COUNT(DISTINCT(wasabi_user_id)) FROM `p` WHERE datetime > UNIX_MILLIS(CURRENT_TIMESTAMP)-2678400000) AS _31days
FROM `mycujoo_kafka_public.v_web_event_pageviews` LIMIT 1
Thanks @Mikhail for the help though!
Upvotes: 0
Reputation: 173046
So, the first query is correct and gives you rolling counts (31 and 62 days) based on the timestamp field - also, because of order by .. desc
and limit 1
you are getting the most row that has biggest rolling_avg_31_days
which is not necessarily row for the most recent () datetime
The second query just produces count of rows between 62 and 31 days based on the current timestamp - which is as explain above is not what first query produces - thus the discrepancy
To further troubleshoot
or to try to understand difference - change ORDER BY rolling_avg_31_days DESC LIMIT 1
to ORDER BY datetime DESC LIMIT 1
and also add datetime
to select statement so you can see if it belong to current date or close to current statement so results are comparable
Upvotes: 1