Shamil Yakupov

Reputation: 5469

Merge hourly prices history to daily

The table looks like this:

ID | volume | timestamp (timestamp without time zone)
 1 | 300    | 2015-05-27 00:
 1 | 250    | 2015-05-28 00:
 2 | 13     | 2015-05-25 00:
 1 | 500    | 2015-06-28 22:
 1 | 100    | 2015-06-28 23:
 2 | 11     | 2015-06-28 21:
 2 | 15     | 2015-06-28 23:

Is there any way to merge the hourly price history that is older than 1 month into daily records and put them back into the table? That means merging each day's hourly records into one record, with the summed volume and a timestamp at hour 00 (the day only, e.g. 2013-08-15 00:00:00).

So the wanted result is:

ID | volume | timestamp
 1 | 300    | 2015-05-27 00:
 1 | 250    | 2015-05-28 00:
 2 | 13     | 2015-05-25 00:
 1 | 600    | 2015-06-28 00:
 2 | 26     | 2015-06-28 00:

Upvotes: 0

Views: 201

Answers (3)

Shamil Yakupov

Reputation: 5469

Thanks for your help @Pholochtairze and @a_horse_with_no_name.
Final query:

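WITH merged_history AS (
    -- One row per item and day: total volume and volume-weighted median price
    SELECT item_id, SUM(history_volume) AS history_volume,
        (SUM(history_medianprice * history_volume) / SUM(history_volume)) AS history_medianprice,
        history_timestamp::date AS history_timestamp
    FROM prices_history
    WHERE history_timestamp < NOW()::date - INTERVAL '1 month'
    GROUP BY item_id, history_timestamp::date),
upsert AS (
    -- Overwrite the existing midnight row of each day with the merged values
    UPDATE prices_history AS h
    SET history_volume = mh.history_volume, history_medianprice = mh.history_medianprice
    FROM merged_history AS mh
    WHERE h.item_id = mh.item_id AND h.history_timestamp = mh.history_timestamp RETURNING h.*)
-- Insert a midnight row only for days that did not already have one
INSERT INTO prices_history (item_id, history_volume, history_medianprice, history_timestamp)
SELECT mh.item_id, mh.history_volume, mh.history_medianprice, mh.history_timestamp
FROM merged_history AS mh
WHERE NOT EXISTS (
    -- Correlated with the outer mh row; an uncorrelated check would skip every insert as soon as one row was updated
    SELECT 1 FROM upsert AS u
    WHERE u.item_id = mh.item_id AND u.history_timestamp = mh.history_timestamp);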

DELETE FROM prices_history
WHERE history_timestamp != history_timestamp::date
    AND history_timestamp < NOW()::date - INTERVAL '1 month';

(It took 2 minutes to complete this challenge on 3M rows :D I will run it once a week. It will be modified a bit later, because there is no need to merge already merged data.)
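The "no need to merge already merged data" tweak could be sketched by first finding the days that still hold at least one row with a non-midnight timestamp. This is only an assumption about what such a modification might look like, not the final query; the `prices_history_demo` temp table and its rows are made up for illustration.

```sql
-- Hypothetical demo table mirroring the columns used in the query above.
CREATE TEMP TABLE prices_history_demo (
    item_id             integer,
    history_volume      integer,
    history_medianprice numeric,
    history_timestamp   timestamp without time zone
);

-- Made-up sample rows, all older than one month.
INSERT INTO prices_history_demo VALUES
    (1, 300, 1.50, '2020-01-01 00:00'),  -- already a daily (midnight) row
    (1, 500, 2.00, '2020-01-02 22:00'),  -- still hourly
    (1, 100, 2.60, '2020-01-02 23:00');  -- still hourly

-- (item, day) pairs older than one month that still contain hourly rows.
SELECT DISTINCT item_id, history_timestamp::date AS history_day
FROM prices_history_demo
WHERE history_timestamp < NOW()::date - INTERVAL '1 month'
  AND history_timestamp <> history_timestamp::date;
```

Restricting the merge to these (item, day) pairs, for example with a join or an `EXISTS` condition, would leave days that only hold a single midnight row untouched.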

Upvotes: 0

Pholochtairze

Reputation: 1854

Since you want to modify the table with this new data, I thought of a SELECT INTO. Here is how I proceeded (see the SQL Fiddle here):

  1. Select the data you want and insert it in a temporary table (foo_temp)
  2. Empty your first table (foo)
  3. Copy the data from the temporary table into the (now) empty table (from foo_temp to foo)
  4. Drop the temporary table

Here is the code:

/*Step 1 : Select the data you need and insert it in a temporary table*/
SELECT 
    ID,
    SUM(volume) as volume, 
    timestamp_field::date
INTO TEMP foo_temp
FROM foo
GROUP BY 
    ID,
    timestamp_field::date
ORDER BY 
    ID,
    timestamp_field::date;

/*Step 2 : Delete data from the table*/
DELETE FROM foo;

/*Step 3 : Take data from the temporary table and insert it into the "main" table*/
INSERT INTO foo(ID,volume,timestamp_field)
SELECT * FROM foo_temp;

/*Step 4: Drop the temporary table*/
DROP TABLE foo_temp;

/*Step 5 : Yeah it works !*/
SELECT * FROM foo;

I must admit @a_horse_with_no_name did most of the job and his answer was elegant.

Note : There is probably a better way of doing the job.
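One possible alternative, offered as a sketch rather than as the better way, is a single PostgreSQL statement that deletes the old rows in a data-modifying CTE and re-inserts their daily aggregates. It adds the one-month cutoff from the question and needs no helper table. The `foo_demo` table and its rows are made up for illustration, and the sketch assumes no trigger or constraint on the table interferes with deleting and inserting in the same statement.

```sql
-- Hypothetical demo table with the same columns as foo.
CREATE TEMP TABLE foo_demo (
    id              integer,
    volume          integer,
    timestamp_field timestamp without time zone
);

-- Made-up hourly rows, all older than one month.
INSERT INTO foo_demo VALUES
    (1, 500, '2020-01-02 22:00'),
    (1, 100, '2020-01-02 23:00'),
    (2,  11, '2020-01-02 21:00');

-- Delete the old rows and insert one summed row per id and day.
WITH removed AS (
    DELETE FROM foo_demo
    WHERE timestamp_field < now()::date - interval '1 month'
    RETURNING id, volume, timestamp_field
)
INSERT INTO foo_demo (id, volume, timestamp_field)
SELECT id, SUM(volume), timestamp_field::date
FROM removed
GROUP BY id, timestamp_field::date;

SELECT * FROM foo_demo ORDER BY id, timestamp_field;
```

Because it is one statement, it either fully succeeds or leaves the table unchanged, which avoids a window in which the table is empty.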

Upvotes: 1

user330315

Reputation:

Looks like a simple group by based on the date without the time:

select id,
       sum(volume) as volume, 
       timestamp::date as timestamp
from the_table
group by id, timestamp::date
order by id, timestamp::date;

timestamp::date will cast the column named timestamp (which btw is a horrible name for a column) to a date, thus removing the time part of the timestamp (the data type).

timestamp::date is Postgres-specific. The ANSI SQL equivalent is cast(timestamp as date) (did I mention that timestamp is a horrible name for a column?).
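For comparison, the same aggregation written with the ANSI-style cast might look like the sketch below. The `the_table_demo` table and its rows are made up for illustration, and the column name is double-quoted here as a precaution because it doubles as a type name.

```sql
-- Hypothetical demo table; "timestamp" is quoted because it is also a type name.
CREATE TEMP TABLE the_table_demo (
    id          integer,
    volume      integer,
    "timestamp" timestamp without time zone
);

-- Made-up sample rows.
INSERT INTO the_table_demo VALUES
    (1, 500, '2020-01-02 22:00'),
    (1, 100, '2020-01-02 23:00'),
    (2,  11, '2020-01-02 21:00');

-- Same grouping as above, using cast(... as date) instead of ::date.
select id,
       sum(volume) as volume,
       cast("timestamp" as date) as "timestamp"
from the_table_demo
group by id, cast("timestamp" as date)
order by id, cast("timestamp" as date);
```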

Upvotes: 4
