PostgresQL: Statement help needed

Question

Consider a table with the following structure of a PostgresQL database table (stripped down here for demonstration purpose):

time:  Timestamp
type:  Text
value: Integer

The table gets filled by many events over time. Now, we need an SQL staemtent for reporting, which aggregates the values by time (e.g. every hour) and averaging and counting them for a specific type. So the report looks like that:

[example report for type="standard"]  
Time    Count   Avg   
00:00   30      20
01:00   12      24
02:00   9       19
...

Until here it's pretty straight-forward, so the statement for the above report is:

select extract(hour from time) time, count(1), avg(value)
from reportdata
where type = 'standard'
group by time;

Now comes the tricky part - I need to show a report per type with a total count of all types and a percentage in relation to the specific type in each time frame. For that, I'd need a statement, which produces a line for each time frame and each possible type (which can be selected from a separate table containing all possible types) from which the code can then extract the report tab for each type without requesting from the database again. So the result should look like that (note the "empty" lines for a time frame where no values for the type were found):

[example report for all types assuming there are 3 possible types]  
Time    Type       Total   Count   Percent   Avg   
00:00   standard   40      30      0.75      20
00:00   special    40      10      0.25      8
00:00   super      40      0       0         0
01:00   standard   12      12      1.0       24
01:00   special    12      0       0         0
01:00   super      12      0       0         0
02:00   standard   9       3       0.33      20
02:00   special    9       0       0         0
02:00   super      9       6       0.67      15
...

How would the statement look like to produce that?

Clodoaldo Neto · Accepted Answer

select 
    s.hour as "Time", 
    s.type as "Type", 
    s.total as "Total", 
    coalesce(r.total, 0) as "Count", 
    round(coalesce(r.total, 0) * 1.0/s.total, 2) as "Percent", 
    round(coalesce(r.avg, 0), 2) as "Avg"
from (
    select 
        date_trunc('hour', time) as hour, 
        type, 
        count(*) as total, 
        avg(value) as avg
    from reportdata
    group by hour, type
    ) r
right outer join (
    select 
        date_trunc('hour', time) as hour,
        t.type,
        count(*) as total
    from reportdata
    inner join type t on true
    group by hour, t.type
    ) s on s.hour = r.hour and s.type = r.type
order by s.hour, s.type
;
        Time         |   Type   | Total | Count | Percent |  Avg  
---------------------+----------+-------+-------+---------+-------
 2012-04-02 00:00:00 | special  |    40 |    10 |    0.25 |  8.00
 2012-04-02 00:00:00 | standard |    40 |    30 |    0.75 | 20.00
 2012-04-02 00:00:00 | super    |    40 |     0 |    0.00 |  0.00
 2012-04-02 01:00:00 | special  |    12 |     0 |    0.00 |  0.00
 2012-04-02 01:00:00 | standard |    12 |    12 |    1.00 | 24.00
 2012-04-02 01:00:00 | super    |    12 |     0 |    0.00 |  0.00
 2012-04-02 02:00:00 | special  |     9 |     0 |    0.00 |  0.00
 2012-04-02 02:00:00 | standard |     9 |     3 |    0.33 | 20.00
 2012-04-02 02:00:00 | super    |     9 |     6 |    0.67 | 15.00
(9 rows)

I used date_trunc on the timestamp because I think what you want is to segregate each hour of each day. If aggregating each hour of all days is really what you want just revert to your extract

Update to match the new requirement in comment:

select 
    s.hour as "Time", 
    s.type as "Type", 
    s.total as "Total", 
    coalesce(r.total, 0) as "Count", 
    case s.total when 0 then round(0, 2) else
        round(coalesce(r.total, 0) * 1.0/s.total, 2)
        end as "Percent", 
    round(coalesce(r.avg, 0), 2) as "Avg"
from (
    select 
        date_trunc('hour', time) as hour, 
        type, 
        count(*) as total, 
        avg(value) as avg
    from reportdata
    group by hour, type
) r
right outer join (
    select 
        date_trunc('hour', d) as hour,
        t.type,
        count(r.time) as total
    from reportdata r
    right outer join (
        select d 
        from generate_series(
            (select min(time) from reportdata),
            (select max(time) from reportdata),
            '1 hour'
        ) g(d)
    ) g on date_trunc('hour', g.d) = date_trunc('hour', r.time)
    inner join type t on true
    group by hour, t.type
) s on s.hour = r.hour and s.type = r.type
order by s.hour, s.type
;
        Time         |   Type   | Total | Count | Percent |  Avg  
---------------------+----------+-------+-------+---------+-------
 2012-04-01 22:00:00 | special  |     1 |     0 |    0.00 |  0.00
 2012-04-01 22:00:00 | standard |     1 |     1 |    1.00 | 10.00
 2012-04-01 22:00:00 | super    |     1 |     0 |    0.00 |  0.00
 2012-04-01 23:00:00 | special  |     0 |     0 |    0.00 |  0.00
 2012-04-01 23:00:00 | standard |     0 |     0 |    0.00 |  0.00
 2012-04-01 23:00:00 | super    |     0 |     0 |    0.00 |  0.00
 2012-04-02 00:00:00 | special  |    40 |    10 |    0.25 |  8.00
 2012-04-02 00:00:00 | standard |    40 |    30 |    0.75 | 20.00
 2012-04-02 00:00:00 | super    |    40 |     0 |    0.00 |  0.00
 2012-04-02 01:00:00 | special  |    12 |     0 |    0.00 |  0.00
 2012-04-02 01:00:00 | standard |    12 |    12 |    1.00 | 24.00
 2012-04-02 01:00:00 | super    |    12 |     0 |    0.00 |  0.00
 2012-04-02 02:00:00 | special  |     9 |     0 |    0.00 |  0.00
 2012-04-02 02:00:00 | standard |     9 |     3 |    0.33 | 20.00
 2012-04-02 02:00:00 | super    |     9 |     6 |    0.67 | 15.00
 2012-04-02 03:00:00 | special  |     0 |     0 |    0.00 |  0.00
 2012-04-02 03:00:00 | standard |     0 |     0 |    0.00 |  0.00
 2012-04-02 03:00:00 | super    |     0 |     0 |    0.00 |  0.00
 2012-04-02 04:00:00 | special  |     1 |     0 |    0.00 |  0.00
 2012-04-02 04:00:00 | standard |     1 |     1 |    1.00 | 10.00
 2012-04-02 04:00:00 | super    |     1 |     0 |    0.00 |  0.00
(21 rows)

PostgresQL: Statement help needed

Answers (2)

Update to match the new requirement in comment:

Related Questions