Reputation: 145
In had a rather large error in my previous question
select earliest date from multiple rows
The answer by horse_with_no_name returns a perfect result, and I am hugely appreciative, however I got my own initial question wrong so I really apologise; if you look at the table below;
circuit_uid |customer_name |rack_location |reading_date | reading_time | amps | volts | kw | kwh | kva | pf | key -------------------------------------------------------------------------------------------------------------------------------------- cu1.cb1.r1 | Customer 1 | 12.01.a1 | 2012-01-02 | 00:01:01 | 4.51 | 229.32 | 1.03 | 87 | 1.03 | 0.85 | 15 cu1.cb1.r1 | Customer 1 | 12.01.a1 | 2012-01-02 | 01:01:01 | 4.18 | 230.3 | 0.96 | 90 | 0.96 | 0.84 | 16 cu1.cb1.r2 | Customer 1 | 12.01.a1 | 2012-01-02 | 00:01:01 | 4.51 | 229.32 | 1.03 | 21 | 1.03 | 0.85 | 15 cu1.cb1.r2 | Customer 1 | 12.01.a1 | 2012-01-02 | 01:01:01 | 4.18 | 230.3 | 0.96 | 23 | 0.96 | 0.84 | 16 cu1.cb1.s2 | Customer 2 | 10.01.a1 | 2012-01-02 | 00:01:01 | 7.34 | 228.14 | 1.67 | 179 | 1.67 | 0.88 | 24009 cu1.cb1.s2 | Customer 2 | 10.01.a1 | 2012-01-02 | 01:01:01 | 9.07 | 228.4 | 2.07 | 182 | 2.07 | 0.85 | 24010 cu1.cb1.s3 | Customer 2 | 10.01.a1 | 2012-01-02 | 00:01:01 | 7.34 | 228.14 | 1.67 | 121 | 1.67 | 0.88 | 24009 cu1.cb1.s3 | Customer 2 | 10.01.a1 | 2012-01-02 | 01:01:01 | 9.07 | 228.4 | 2.07 | 124 | 2.07 | 0.85 | 24010 cu1.cb1.r1 | Customer 3 | 01.01.a1 | 2012-01-02 | 00:01:01 | 7.32 | 229.01 | 1.68 | 223 | 1.68 | 0.89 | 48003 cu1.cb1.r1 | Customer 3 | 01.01.a1 | 2012-01-02 | 01:01:01 | 6.61 | 228.29 | 1.51 | 226 | 1.51 | 0.88 | 48004 cu1.cb1.r4 | Customer 3 | 01.01.a1 | 2012-01-02 | 00:01:01 | 7.32 | 229.01 | 1.68 | 215 | 1.68 | 0.89 | 48003 cu1.cb1.r4 | Customer 3 | 01.01.a1 | 2012-01-02 | 01:01:01 | 6.61 | 228.29 | 1.51 | 217 | 1.51 | 0.88 | 48004
As you can see each customer now has multiple circuits. So the result would now be the sum of each of the earliest kwh readings for each circuit per customer, so the result in this table would be;
customer_name | kwh(sum)
--------------+-----------
customer 1 | 108 (the result of 87 + 21)
customer 2 | 300 (the result of 179 + 121)
customer 3 | 438 (the result of 223 + 215)
There will be more than 2 circuits per customer and the readings can happen at varying times, hence the need for the 'earliest' reading.
Would anybody have any suggestions for the revised question?
PostgreSQL 8.4 on CentOs/Redhat.
Upvotes: 3
Views: 1960
Reputation: 656586
SELECT customer_name, sum(kwh) AS kwh_total
FROM (
SELECT DISTINCT ON (customer_name, circuit_uid)
customer_name, circuit_uid, kwh
FROM readings
WHERE reading_date = '2012-01-02'::date
ORDER BY customer_name, circuit_uid, reading_time
) x
GROUP BY 1
Same as before, just pick the earliest per (customer_name, circuit_uid)
.
Then sum per customer_name
.
A multi-column index like the following will make this very fast:
CREATE INDEX readings_multi_idx
ON readings(reading_date, customer_name, circuit_uid, reading_time);
Upvotes: 2
Reputation:
This is an extension to your original question:
select customer_name,
sum(kwh)
from (
select customer_name,
kwh,
reading_time,
reading_date,
row_number() over (partition by customer_name, circuit_uid order by reading_time) as rn
from readings
where reading_date = date '2012-01-02'
) t
where rn = 1
group by customer_name
Note the new sum()
in the outer query and the changed partition by
definition in the inner query (compared to your previous question) which calculates the first reading for each circuit_uid
now (instead of the first for each customer).
Upvotes: 1