Reputation: 6939
Consider a table with the following structure of a PostgresQL database table (stripped down here for demonstration purpose):
time: Timestamp
type: Text
value: Integer
The table gets filled by many events over time. Now, we need an SQL staemtent for reporting, which aggregates the values by time (e.g. every hour) and averaging and counting them for a specific type. So the report looks like that:
[example report for type="standard"]
Time Count Avg
00:00 30 20
01:00 12 24
02:00 9 19
...
Until here it's pretty straight-forward, so the statement for the above report is:
select extract(hour from time) time, count(1), avg(value)
from reportdata
where type = 'standard'
group by time;
Now comes the tricky part - I need to show a report per type with a total count of all types and a percentage in relation to the specific type in each time frame. For that, I'd need a statement, which produces a line for each time frame and each possible type (which can be selected from a separate table containing all possible types) from which the code can then extract the report tab for each type without requesting from the database again. So the result should look like that (note the "empty" lines for a time frame where no values for the type were found):
[example report for all types assuming there are 3 possible types]
Time Type Total Count Percent Avg
00:00 standard 40 30 0.75 20
00:00 special 40 10 0.25 8
00:00 super 40 0 0 0
01:00 standard 12 12 1.0 24
01:00 special 12 0 0 0
01:00 super 12 0 0 0
02:00 standard 9 3 0.33 20
02:00 special 9 0 0 0
02:00 super 9 6 0.67 15
...
How would the statement look like to produce that?
Upvotes: 3
Views: 140
Reputation: 125424
select
s.hour as "Time",
s.type as "Type",
s.total as "Total",
coalesce(r.total, 0) as "Count",
round(coalesce(r.total, 0) * 1.0/s.total, 2) as "Percent",
round(coalesce(r.avg, 0), 2) as "Avg"
from (
select
date_trunc('hour', time) as hour,
type,
count(*) as total,
avg(value) as avg
from reportdata
group by hour, type
) r
right outer join (
select
date_trunc('hour', time) as hour,
t.type,
count(*) as total
from reportdata
inner join type t on true
group by hour, t.type
) s on s.hour = r.hour and s.type = r.type
order by s.hour, s.type
;
Time | Type | Total | Count | Percent | Avg
---------------------+----------+-------+-------+---------+-------
2012-04-02 00:00:00 | special | 40 | 10 | 0.25 | 8.00
2012-04-02 00:00:00 | standard | 40 | 30 | 0.75 | 20.00
2012-04-02 00:00:00 | super | 40 | 0 | 0.00 | 0.00
2012-04-02 01:00:00 | special | 12 | 0 | 0.00 | 0.00
2012-04-02 01:00:00 | standard | 12 | 12 | 1.00 | 24.00
2012-04-02 01:00:00 | super | 12 | 0 | 0.00 | 0.00
2012-04-02 02:00:00 | special | 9 | 0 | 0.00 | 0.00
2012-04-02 02:00:00 | standard | 9 | 3 | 0.33 | 20.00
2012-04-02 02:00:00 | super | 9 | 6 | 0.67 | 15.00
(9 rows)
I used date_trunc
on the timestamp because I think what you want is to segregate each hour of each day. If aggregating each hour of all days is really what you want just revert to your extract
select
s.hour as "Time",
s.type as "Type",
s.total as "Total",
coalesce(r.total, 0) as "Count",
case s.total when 0 then round(0, 2) else
round(coalesce(r.total, 0) * 1.0/s.total, 2)
end as "Percent",
round(coalesce(r.avg, 0), 2) as "Avg"
from (
select
date_trunc('hour', time) as hour,
type,
count(*) as total,
avg(value) as avg
from reportdata
group by hour, type
) r
right outer join (
select
date_trunc('hour', d) as hour,
t.type,
count(r.time) as total
from reportdata r
right outer join (
select d
from generate_series(
(select min(time) from reportdata),
(select max(time) from reportdata),
'1 hour'
) g(d)
) g on date_trunc('hour', g.d) = date_trunc('hour', r.time)
inner join type t on true
group by hour, t.type
) s on s.hour = r.hour and s.type = r.type
order by s.hour, s.type
;
Time | Type | Total | Count | Percent | Avg
---------------------+----------+-------+-------+---------+-------
2012-04-01 22:00:00 | special | 1 | 0 | 0.00 | 0.00
2012-04-01 22:00:00 | standard | 1 | 1 | 1.00 | 10.00
2012-04-01 22:00:00 | super | 1 | 0 | 0.00 | 0.00
2012-04-01 23:00:00 | special | 0 | 0 | 0.00 | 0.00
2012-04-01 23:00:00 | standard | 0 | 0 | 0.00 | 0.00
2012-04-01 23:00:00 | super | 0 | 0 | 0.00 | 0.00
2012-04-02 00:00:00 | special | 40 | 10 | 0.25 | 8.00
2012-04-02 00:00:00 | standard | 40 | 30 | 0.75 | 20.00
2012-04-02 00:00:00 | super | 40 | 0 | 0.00 | 0.00
2012-04-02 01:00:00 | special | 12 | 0 | 0.00 | 0.00
2012-04-02 01:00:00 | standard | 12 | 12 | 1.00 | 24.00
2012-04-02 01:00:00 | super | 12 | 0 | 0.00 | 0.00
2012-04-02 02:00:00 | special | 9 | 0 | 0.00 | 0.00
2012-04-02 02:00:00 | standard | 9 | 3 | 0.33 | 20.00
2012-04-02 02:00:00 | super | 9 | 6 | 0.67 | 15.00
2012-04-02 03:00:00 | special | 0 | 0 | 0.00 | 0.00
2012-04-02 03:00:00 | standard | 0 | 0 | 0.00 | 0.00
2012-04-02 03:00:00 | super | 0 | 0 | 0.00 | 0.00
2012-04-02 04:00:00 | special | 1 | 0 | 0.00 | 0.00
2012-04-02 04:00:00 | standard | 1 | 1 | 1.00 | 10.00
2012-04-02 04:00:00 | super | 1 | 0 | 0.00 | 0.00
(21 rows)
Upvotes: 2
Reputation:
Something like this?
select extract(hour from rd.time) time,
at.type,
count(at.value) over (partition by extract(hour from time)) as total,
count(at.value) over (partition by rd.type) as count,
avg(value) over (partition by rd.type) as avg,
from all_types at
left join reportdata rd on at.type = rd.type
group by time, at.type;
(all_types
is the "separate table containing all possible types")
Upvotes: 0