Reputation: 377
I've got a table that looks like this:
player_id | violation
---------------------
1 | A
1 | A
1 | B
2 | C
3 | D
3 | A
And I want to turn it into this, with a bunch of new columns that refer to the types of violations, and then the sum of the number of each individual type of violation that each player got (not that concerned with what the columns are called; a/b/c/d would work great as well):
player_id | violation_a | violation_b | violation_c | violation_d
-----------------------------------------------------------------
1 | 2 | 1 | 0 | 0
2 | 0 | 0 | 1 | 0
3 | 1 | 0 | 0 | 1
I know how I could do this, but it would take a ton of lines of code, since there are in reality 100+ types of violations. Is there any way (perhaps with tablefunc?) that I could do this more concisely than spelling out each of the new 100+ columns that I want and the logic for them each individually?
Upvotes: 1
Views: 59
Reputation: 19613
In pure SQL I don't see how you could avoid declaring the columns yourself. You either have to write a subselect or a FILTER clause for every column:
-- GROUP BY already yields one row per player, so DISTINCT ON is not needed
SELECT
  t.player_id,
  count(*) FILTER (WHERE violation = 'A') AS violation_a,
  count(*) FILTER (WHERE violation = 'B') AS violation_b,
  count(*) FILTER (WHERE violation = 'C') AS violation_c,
  count(*) FILTER (WHERE violation = 'D') AS violation_d
FROM t
GROUP BY t.player_id
ORDER BY t.player_id;
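Or create a pivot table with crosstab (this needs the tablefunc extension, i.e. CREATE EXTENSION tablefunc). The source query should be ordered by the first two columns so that the values land in the right output columns:
SELECT *
FROM crosstab(
  'SELECT player_id, t2.violation,
          count(*) FILTER (WHERE t.violation = t2.violation)::int
   FROM t, (SELECT DISTINCT violation FROM t) t2
   GROUP BY player_id, t2.violation
   ORDER BY 1, 2'
) AS ct(player_id int, violation_a int, violation_b int, violation_c int, violation_d int);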
Demo: db<>fiddle
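With 100+ violation types, one way to avoid typing the column list by hand is to let the database generate the FILTER query as text and then run the result. The following is only a sketch under a few assumptions: the table is named t as above, the violation values are short strings that make reasonable column suffixes, and the names generated_sql and v are made up for illustration. It does not get around having to declare the columns; it just writes the declarations for you.

```sql
-- Builds the text of the FILTER query from the distinct violation values.
-- %L quotes each value as a string literal, %I quotes the column alias
-- as an identifier, so odd values do not break the generated statement.
SELECT format(
         'SELECT player_id, %s FROM t GROUP BY player_id ORDER BY player_id',
         string_agg(
           format('count(*) FILTER (WHERE violation = %L) AS %I',
                  violation,
                  'violation_' || lower(violation)),
           ', ' ORDER BY violation
         )
       ) AS generated_sql  -- hypothetical alias
FROM (SELECT DISTINCT violation FROM t) v;
```

The output is a single row containing the SQL text. You can copy it into your client, or in psql (9.6 or later) end the statement above with \gexec instead of the semicolon to execute the generated query directly.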
Upvotes: 1