Reputation: 97
I have the following table:
crit_id | criterium | val1 | val2
----------+------------+-------+--------
1 | T01 | 9 | 9
2 | T02 | 3 | 5
3 | T03 | 4 | 9
4 | T01 | 2 | 3
5 | T02 | 5 | 1
6 | T03 | 6 | 1
I need to convert the values in 'criterium' into columns as 'cross product' with val1 and val2. So the result has to lool like:
T01_val1 |T01_val2 |T02_val1 |T02_val2 | T03_val1 | T03_val2
---------+---------+---------+---------+----------+---------
9 | 9 | 3 | 5 | 4 | 9
2 | 3 | 5 | 1 | 6 | 1
Or to say differently: I need every value for all criteria to be in one row.
This is my current approach:
select
case when criterium = 'T01' then val1 else null end as T01_val1,
case when criterium = 'T01' then val2 else null end as T01_val2,
case when criterium = 'T02' then val1 else null end as T02_val1,
case when criterium = 'T02' then val2 else null end as T02_val2,
case when criterium = 'T03' then val1 else null end as T03_val1,
case when criterium = 'T03' then val2 else null end as T04_val2,
from crit_table;
But the result looks not how I want it to look like:
T01_val1 |T01_val2 |T02_val1 |T02_val2 | T03_val1 | T03_val2
---------+---------+---------+---------+----------+---------
9 | 9 | null | null | null | null
null | null | 3 | 5 | null | null
null | null | null | null | 4 | 9
What's the fastest way to achieve my goal?
Bonus question:
I have 77 criteria and seven different kinds of values for every criterium. So I have to write 539 case
statements. Whats the best way to create them dynamically?
I'm working with PostgreSql 9.4
Upvotes: 3
Views: 763
Reputation: 121604
In order to use crosstab()
function, the data must be reorganized. You need a dataset with three columns (row number
, criterium
, value
). To have all values in one column you must unpivot two last columns, changing at the same time the names of criteria. As a row number you can use rank()
function over partitions by new criteria.
select rank() over (partition by criterium order by crit_id), criterium, val
from (
select crit_id, criterium || '_v1' criterium, val1 val
from crit
union
select crit_id, criterium || '_v2' criterium, val2 val
from crit
) sub
order by 1, 2
rank | criterium | val
------+-----------+-----
1 | T01_v1 | 9
1 | T01_v2 | 9
1 | T02_v1 | 3
1 | T02_v2 | 5
1 | T03_v1 | 4
1 | T03_v2 | 9
2 | T01_v1 | 2
2 | T01_v2 | 3
2 | T02_v1 | 5
2 | T02_v2 | 1
2 | T03_v1 | 6
2 | T03_v2 | 1
(12 rows)
This dataset can be used in crosstab()
:
create extension if not exists tablefunc;
select * from crosstab($ct$
select rank() over (partition by criterium order by crit_id), criterium, val
from (
select crit_id, criterium || '_v1' criterium, val1 val
from crit
union
select crit_id, criterium || '_v2' criterium, val2 val
from crit
) sub
order by 1, 2
$ct$)
as ct (rank bigint, "T01_v1" int, "T01_v2" int,
"T02_v1" int, "T02_v2" int,
"T03_v1" int, "T03_v2" int);
rank | T01_v1 | T01_v2 | T02_v1 | T02_v2 | T03_v1 | T03_v2
------+--------+--------+--------+--------+--------+--------
1 | 9 | 9 | 3 | 5 | 4 | 9
2 | 2 | 3 | 5 | 1 | 6 | 1
(2 rows)
For 77 criteria * 7 parameters the above query may be troublesome. If you can accept a bit different way of presenting the data, the issue becomes much easier.
select * from crosstab($ct$
select
rank() over (partition by criterium order by crit_id),
criterium,
concat_ws(' | ', val1, val2) vals
from crit
order by 1, 2
$ct$)
as ct (rank bigint, "T01" text, "T02" text, "T03" text);
rank | T01 | T02 | T03
------+-------+-------+-------
1 | 9 | 9 | 3 | 5 | 4 | 9
2 | 2 | 3 | 5 | 1 | 6 | 1
(2 rows)
Upvotes: 3
Reputation: 871
I agree with Michael's comment that this requirement looks a bit weird, but if you really need it that way, you were on the right track with your solution. It just needs a little bit of additional code (and small corrections wherever val_1 and val_2 where mixed up):
select
sum(case when criterium = 'T01' then val_1 else null end) as T01_val1,
sum(case when criterium = 'T01' then val_2 else null end) as T01_val2,
sum(case when criterium = 'T02' then val_1 else null end) as T02_val1,
sum(case when criterium = 'T02' then val_2 else null end) as T02_val2,
sum(case when criterium = 'T03' then val_1 else null end) as T03_val1,
sum(case when criterium = 'T03' then val_2 else null end) as T03_val2
from
crit_table
group by
trunc((crit_id-1)/3.0)
order by
trunc((crit_id-1)/3.0);
This works as follows. To aggregate the result you posted into the result you would like to have, the first helpful observation is that the desired result has less rows than your preliminary one. So there's some kind of grouping necessary, and the key question is: "What's the grouping criterion?" In this case, it's rather non-obvious: It's criterion ID (minus 1, to start counting with 0) divided by 3, and truncated. The three comes from the number of different criteria. After that puzzle is solved, it is easy to see that for among the input rows that are aggregated into the same result row, there is only one non-null value per column. That means that the choice of aggregate function is not so important, as it is only needed to return the only non-null value. I used the sum in my code snippet, but you could as well use min or max.
As for the bonus question: Use a code generator query that generates the query you need. The code looks like this (with only three types of values to keep it brief):
with value_table as /* possible kinds of values, add the remaining ones here */
(select 'val_1' value_type union
select 'val_2' value_type union
select 'val_3' value_type )
select contents from (
select 0 order_id, 'select' contents
union
select row_number() over () order_id,
'max(case when criterium = '''||criterium||''' then '||value_type||' else null end) '||criterium||'_'||value_type||',' contents
from crit_table
cross join value_table
union select 9999999 order_id,
' from crit_table group by trunc((crit_id-1)/3.0) order by trunc((crit_id-1)/3.0);' contents
) v
order by order_id;
This basically only uses a string template of your query and then inserts the appropriate combinations of values for the criteria and the val-columns. You could even get rid of the with-clause by reading column names from information_schema.columns, but I think the basic idea is clearer in the version above. Note that the code generated contains one comma too much directly after the last column (before the from clause). It's easier to delete that by hand afterwards than correcting it in the generator.
Upvotes: 0
Reputation: 8865
DECLARE @Table1 TABLE
(crit_id int, criterium varchar(3), val1 int, val2 int)
;
INSERT INTO @Table1
(crit_id, criterium, val1, val2)
VALUES
(1, 'T01', 9, 9),
(2, 'T02', 3, 5),
(3, 'T03', 4, 9),
(4, 'T01', 2, 3),
(5, 'T02', 5, 1),
(6, 'T03', 6, 1)
;
select [T01] As [T01_val1 ],[T01-1] As [T01_val2 ],[T02] As [T02_val1 ],[T02-1] As [T02_val2 ],[T03] As [T03_val1 ],[T03-1] As [T03_val3 ] from (
select T.criterium,T.val1,ROW_NUMBER()OVER(PARTITION BY T.criterium ORDER BY (SELECT NULL)) RN from (
select criterium, val1 from @Table1
UNION ALL
select criterium+'-'+'1', val2 from @Table1)T)PP
PIVOT (MAX(val1) FOR criterium IN([T01],[T02],[T03],[T01-1],[T02-1],[T03-1]))P
Upvotes: 0