Reputation: 175
I've got a PostgreSQL table which I have to transpose.
The original data looks like:
user_id role_id assigned
P0001 SA t
P0001 Guest t
P0002 SA f
P0002 Guest t
After transposing, it should look like as follows:
user_id SA Guest
P0001 t t
P0002 f t
The following is what I used to generate the transposed table:
SELECT user_id,
CAST(CAST(CASE role_id WHEN 'SA' THEN 1 ELSE 0 END) AS int) AS boolean) AS "SA",
CAST(CAST((CASE role_id WHEN 'Guest' THEN 1 ELSE 0 END) AS int) AS boolean) AS "Guest"
FROM user_to_roles GROUP BY user_id
But it looks ugly. Is there a way to write an custom aggregation function doing nothing but returning the original value. So the above statement can be re-written as:
SELECT user_id,
Do_Nothing(CASE role_id WHEN 'SA' THEN true ELSE false END) AS "SA",
Do_Nothing(CASE role_id WHEN 'Guest' THEN true ELSE false END) AS "Guest"
FROM user_to_roles GROUP BY user_id
Actually, I need a dynamic generated columns and implemented it using the cursor by the help of this article. And I looked into the documentation and write a custom aggregation like this:
CREATE AGGREGATE do_nothing(anyelement) (
SFUNC=donothing,
STYPE=anyelement,
INITCOND=''
);
CREATE OR REPLACE FUNCTION donothing(anyelement var)
RETURNS anyelement AS
$BODY$
RETURN var;
$BODY$
LANGUAGE sql IMMUTABLE
COST 100;
Since I want to make it generic, anyelment
was used here.
But the above code doesn't work at all.
Upvotes: 1
Views: 805
Reputation: 656566
crosstab()
SELECT * FROM crosstab(
'SELECT user_id, role_id, assigned
FROM tbl
ORDER BY 1,2'
,$$VALUES ('SA'::text), ('Guest')$$)
AS ct (user_id text, "SA" bool, "Guest" bool);
That's as fast as it gets with current versions of Postgres. (Faster than alternatives below.)
But neither of these solutions is completely "dynamic", which is not completely possible:
For a completely dynamic solution, you would have to output arrays or some other container type to return a variable number of columns. Recent related question with answers:
CASE
expressionsFor just a few columns: simple CASE
statements combined with an aggregate function.
min()
or max()
work for most basic types ... per documentation:
any array, numeric, string, or date/time type
But not for boolean
. Use bool_or()
like @Richard suggested.
FALSE OR NULL
would yield NULL
, but all mentioned aggregate functions (incl. bool_or()
) ignore NULL
, so it works with NULL
as well.
SELECT user_id
,bool_or(CASE WHEN role_id = 'SA' THEN assigned ELSE NULL END) AS "SA"
,bool_or(CASE WHEN role_id = 'Guest' THEN assigned ELSE NULL END) AS "Guest"
FROM tbl
GROUP BY 1
ORDER BY 1;
You have to adapt to the data type expected.
There is no standard aggregate function that works for any data type in the way you need it. You can create one:
CREATE OR REPLACE FUNCTION f_echo1(anyelement,anyelement)
RETURNS anyelement AS 'SELECT $1' LANGUAGE sql IMMUTABLE STRICT;
CREATE AGGREGATE first_notnull(anyelement) (
SFUNC = f_echo1
, STYPE = anyelement
);
This is a simple, working version of what you outlined in your edit:
Starting with NULL
(no INITCOND
), using a STRICT
transition function, it takes the first not-null value and the transition function just keeps that one. So we get the first not-null value back the aggregate function encounters.
This works for any input type:
SELECT user_id
,first_notnull(CASE WHEN role_id = 'SA' THEN assigned ELSE NULL END) AS "SA"
,first_notnull(CASE WHEN role_id = 'Guest' THEN assigned ELSE NULL END) AS "Guest"
FROM tbl
GROUP BY 1
ORDER BY 1;
bool_or()
as max()
Custom aggregate functions in sql or plpgsql are considerably slower than the built-in ones, which are written in C and optimized. You could write one in C yourself. Wouldn't be that hard. Or you can copy the definition of bool_or
under the name of max
in pg_proc
and reference it in another copy in pg_aggregate
to make max()
work for boolean
as well. Messing with catalog tables is not recommended for the casual user, though. If you break something your system might go belly-up ...
Upvotes: 1
Reputation: 22893
In your particular case you can get away with using a built-in aggregate bool_or. Something like:
SELECT user_id,
bool_or(CASE role_id WHEN 'SA' THEN assigned ELSE false END) AS "SA",
bool_or(CASE role_id WHEN 'Guest' THEN assigned ELSE false END) AS "Guest"
FROM user_to_roles GROUP BY user_id
Upvotes: 1