Reputation: 5469
The table looks like this:
ID | volume | timestamp (timestamp without time zone)
1 | 300 | 2015-05-27 00:
1 | 250 | 2015-05-28 00:
2 | 13 | 2015-05-25 00:
1 | 500 | 2015-06-28 22:
1 | 100 | 2015-06-28 23:
2 | 11 | 2015-06-28 21:
2 | 15 | 2015-06-28 23:
Is there a way to merge hourly price history that is older than 1 month into daily records and write them back to the table? That is, merge the hourly records of each day into 1 record, with the summed volume and a timestamp at hour 00 (only the day, e.g. 2013-08-15 00:00:00).
Desired result:
ID | volume | timestamp
1 | 300 | 2015-05-27 00:
1 | 250 | 2015-05-28 00:
2 | 13 | 2015-05-25 00:
1 | 600 | 2015-06-28 00:
2 | 26 | 2015-06-28 00:
Upvotes: 0
Views: 201
Reputation: 5469
Thanks for your help @Pholochtairze and @a_horse_with_no_name.
Final query:
WITH merged_history AS (
    -- One row per item and day, for history older than 1 month
    SELECT item_id,
           SUM(history_volume) AS history_volume,
           -- volume-weighted average of the hourly median prices
           (SUM(history_medianprice * history_volume) / SUM(history_volume)) AS history_medianprice,
           history_timestamp::date AS history_timestamp
    FROM prices_history
    WHERE history_timestamp < NOW()::date - INTERVAL '1 month'
    GROUP BY item_id, history_timestamp::date
),
upsert AS (
    -- Overwrite the existing 00:00 row of each day with the merged values
    UPDATE prices_history AS h
    SET history_volume = mh.history_volume,
        history_medianprice = mh.history_medianprice
    FROM merged_history AS mh
    WHERE h.item_id = mh.item_id
      AND h.history_timestamp = mh.history_timestamp
    RETURNING h.*
)
-- Insert a 00:00 row for the days that had none to update
INSERT INTO prices_history
SELECT mh.item_id, mh.history_volume, mh.history_medianprice, mh.history_timestamp
FROM merged_history AS mh
WHERE NOT EXISTS (
    -- correlated with the outer mh, so it is checked per merged row
    SELECT 1
    FROM upsert AS u
    WHERE u.item_id = mh.item_id
      AND u.history_timestamp = mh.history_timestamp
);

-- Remove the hourly rows that have now been merged
DELETE FROM prices_history
WHERE history_timestamp != history_timestamp::date
  AND history_timestamp < NOW()::date - INTERVAL '1 month';
(It took 2 minutes to complete this challenge (3M rows) :D I will run it once a week. It will be modified a bit later, because there is no need to merge already merged data.)
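A minimal sketch of how that later modification might look, not the query that was actually run. It assumes the same prices_history table and column names as above. The HAVING clause keeps only the days that still contain at least one non-midnight row, so days that were already merged are skipped. The NULLIF guard against a zero total volume is an added assumption and not part of the original query.

```sql
-- Sketch only: aggregate just the days that still hold hourly (non-00:00) rows.
SELECT item_id,
       SUM(history_volume) AS history_volume,
       -- NULLIF is an added guard: yields NULL instead of a division-by-zero error
       SUM(history_medianprice * history_volume)
           / NULLIF(SUM(history_volume), 0) AS history_medianprice,
       history_timestamp::date AS history_timestamp
FROM prices_history
WHERE history_timestamp < NOW()::date - INTERVAL '1 month'
GROUP BY item_id, history_timestamp::date
-- true if any row of that item and day is not exactly at midnight
HAVING bool_or(history_timestamp <> history_timestamp::date);
```

This SELECT could replace the body of the merged_history CTE. The UPDATE, INSERT and DELETE steps would then only touch days that still need merging.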
Upvotes: 0
Reputation: 1854
Since you want to modify the table with this new data, I thought of a SELECT INTO. Here is how I proceeded (see sql fiddle here):
Here is the code:
/* Step 1: Select the data you need and insert it in a temporary table */
SELECT
    ID,
    SUM(volume) AS volume,
    timestamp_field::date
INTO foo_temp
FROM foo
GROUP BY
    ID,
    timestamp_field::date
ORDER BY
    ID,
    timestamp_field::date;

/* Step 2: Delete data from the table */
DELETE FROM foo;

/* Step 3: Take data from the temporary table and insert it into the "main" table */
INSERT INTO foo(ID, volume, timestamp_field)
SELECT * FROM foo_temp;

/* Step 4: Drop the temporary table */
DROP TABLE foo_temp;

/* Step 5: Yeah, it works! */
SELECT * FROM foo;
I must admit @a_horse_with_no_name did most of the job and his answer was elegant.
Note : There is probably a better way of doing the job.
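One possible alternative, as a sketch under assumptions rather than a tested solution: a data-modifying CTE can delete the old rows and re-insert their daily totals in a single statement, without a helper table. It reuses the foo, ID, volume and timestamp_field names from the code above. The "older than 1 month" filter comes from the question. The sketch also assumes there is no unique constraint on (ID, timestamp_field) that the re-inserted 00:00 rows could collide with.

```sql
-- Sketch only: merge rows older than 1 month into one row per ID and day.
BEGIN;

WITH removed AS (
    -- take the old rows out of the table and hand them to the INSERT below
    DELETE FROM foo
    WHERE timestamp_field < now()::date - INTERVAL '1 month'
    RETURNING ID, volume, timestamp_field
)
INSERT INTO foo (ID, volume, timestamp_field)
SELECT ID,
       SUM(volume),
       timestamp_field::date   -- stored back as midnight of that day
FROM removed
GROUP BY ID, timestamp_field::date;

COMMIT;
```

Rows newer than one month are left untouched, and the explicit transaction keeps the table unchanged if the statement fails.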
Upvotes: 1
Reputation:
Looks like a simple group by based on the date without the time:
select id,
       sum(volume) as volume,
       timestamp::date as timestamp
from the_table
group by id, timestamp::date
order by id, timestamp::date;
timestamp::date will cast the column named timestamp (which btw is a horrible name for a column) to a date, thus removing the time part of the timestamp (the data type).
timestamp::date is Postgres specific. The ANSI SQL equivalent is cast(timestamp as date) (did I mention that timestamp is a horrible name for a column?)
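If the result should stay a timestamp at hour 00 rather than become a date, as the question asks, date_trunc is one option. The sketch below assumes the same the_table, id, volume and timestamp names as the query above. The day_start alias is made up for illustration, and the column name is double-quoted here to avoid any clash with the type keyword.

```sql
-- Sketch only: same grouping, but each day is kept as a timestamp at 00:00:00.
select id,
       sum(volume) as volume,
       date_trunc('day', "timestamp") as day_start  -- e.g. 2015-06-28 00:00:00
from the_table
group by id, date_trunc('day', "timestamp")
order by id, day_start;
```

Unlike the ::date cast, date_trunc is specific to Postgres and a few other databases, so it has no direct ANSI SQL counterpart.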
Upvotes: 4