user3262353
user3262353

Reputation: 107

Group and calculate percentages of each distinct value

Postgresql 9.6 with two tables:

      id  | name  |
   -------+-------+
      1   | Mars  |
      2   | Pluto |
      3   | Moon  |
      4   | Venus |


     id | p_id | action |
   -------+-------+-----+
     1  |  1   | LANDED |
     2  |  1   | UNSEEN |
     3  |  1   | SEEN   |
     4  |  1   | SEEN   |
     5  |  2   | LANDED |
     6  |  3   | SEEN   |
     7  |  3   | SEEN   |
     8  |  3   | UNSEEN |
     9  |  3   | LANDED |
    10  |  3   | LANDED |
    11  |  3   | LANDED |

I cannot figure out a query to get a table who show percentages for each actions,
like that for example:

      p_id | name  | SEEN | UNSEEN | LANDED |
   --------+-------+------+--------+--------+
       1   | Mars  |  10% |   30%  |   60%  |
       2   | Pluto |   0% |    0%  |  100%  |
       3   | Moon  |  25% |   35%  |   30%  |
       4   | Venus |   0% |    0%  |    0%  |

Any help at all would be greatly appreciated.
Thanks,
Peraz

Upvotes: 3

Views: 1274

Answers (3)

Ivan Burlutskiy
Ivan Burlutskiy

Reputation: 1623

You can use window function to calculate percentages and crosstab to move rows to columns. Please, check this one:

create table body(id int, name varchar(256));

insert into body(id, name) values ( 1, 'Mars' ), ( 2, 'Pluto' ), (3, 'Moon' ), ( 4, 'Venus');

create table actions(id int, p_id int, action varchar(256));

insert into actions (id, p_id, action) values
( 1, 1, 'LANDED'),
( 2, 1, 'UNSEEN'),
( 3, 1, 'SEEN'),
( 4, 1, 'SEEN'),
( 5, 2, 'LANDED'),
( 6, 3, 'SEEN'),
( 7, 3, 'SEEN'),
( 8, 3, 'UNSEEN'),
( 9, 3, 'LANDED'),
(10, 3, 'LANDED'),
(11, 3, 'LANDED');


SELECT *
FROM crosstab($$
    select p_id as id, action, ((times / sum(times) over (partition by p_id)) * 100)::float as percentage
    from (
        select action, p_id, count(*) as times
        from actions 
        group by p_id, action
    )x   
    order by 1, 2
  $$
) as percentage ("id" int, "LANDED" float, "SEEN" float, "UNSEEN" float);

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269793

I would do this using avg():

select n.id, n.name, 
       avg( (action = 'SEEN')::int ) as seen,
       avg( (action = 'UNSEEN')::int ) as unseen,
       avg( (action = 'LANDED')::int ) as landed
from names n left join
     actions a
     on a.p_id = n.id
group by n.id, n.name;

This produces the values as ratios. Formatting them as strings with % seems more appropriate for an application layer.

Upvotes: 1

klin
klin

Reputation: 121604

You can calculate the number of instances of each category using count(*) with filter:

select 
    n.id, name, 
    count(*) filter (where action = 'SEEN') as seen,
    count(*) filter (where action = 'UNSEEN') as unseen,
    count(*) filter (where action = 'LANDED') as landed,
    count(*)::dec as total
from names n
left join actions a on a.p_id = n.id
group by n.id
order by n.id;

 id | name  | seen | unseen | landed | total 
----+-------+------+--------+--------+-------
  1 | Mars  |    2 |      1 |      1 |     4
  2 | Pluto |    0 |      0 |      1 |     1
  3 | Moon  |    2 |      1 |      3 |     6
  4 | Venus |    0 |      0 |      0 |     1
(4 rows)

Total for Venus is incorrect (1) because of left join. In fact, it's ok as we can avoid dividing by 0 in the next step.

Use the above query in a derived table (or in CTE) to calculate percentages:

select
    id, name,
    round(seen/ total* 100, 0) as seen,
    round(unseen/ total* 100, 0) as unseen,
    round(landed/ total* 100, 0) as landed
from (
    select 
        n.id, name, 
        count(*) filter (where action = 'SEEN') as seen,
        count(*) filter (where action = 'UNSEEN') as unseen,
        count(*) filter (where action = 'LANDED') as landed,
        count(*)::dec as total
    from names n
    left join actions a on a.p_id = n.id
    group by n.id
    ) s
order by id;

 id | name  | seen | unseen | landed 
----+-------+------+--------+--------
  1 | Mars  |   50 |     25 |     25
  2 | Pluto |    0 |      0 |    100
  3 | Moon  |   33 |     17 |     50
  4 | Venus |    0 |      0 |      0
(4 rows)    

Upvotes: 2

Related Questions