Reputation: 1
Any way to expand rows to columns in presto efficiently?
I've tried to filter the raw dataset with 'where team = 1' and 'where team = 2' separately, to get the dataset1 and dataset 2 correspondingly first, and then join the two datasets on income_level. However it is inconvenient when income_level has too many different values. Is there any efficient way to get the result I want?
Upvotes: 0
Views: 7172
Reputation: 694
Prestodb offers a map_agg
function that can help to convert your long data to the wide format you are looking for. Unfortunately there doesn't seem to be a way to dynamically created the column names, but this approach should be far more efficient (and less typing :) ) than joining on each team.
WITH raw_data AS (
SELECT 1 AS team, 'a' AS income_level, 1 AS time, 11 AS ord
UNION
SELECT 1 AS team, 'b' AS income_level, 2 AS time, 12 AS ord
UNION
SELECT 1 AS team, 'c' AS income_level, 3 AS time, 13 AS ord
UNION
SELECT 2 AS team, 'a' AS income_level, 4 AS time, 14 AS ord
UNION
SELECT 2 AS team, 'b' AS income_level, 5 AS time, 15 AS ord
UNION
SELECT 2 AS team, 'c' AS income_level, 6 AS time, 16 AS ord
UNION
SELECT 3 AS team, 'a' AS income_level, 7 AS time, 17 AS ord
UNION
SELECT 3 AS team, 'b' AS income_level, 8 AS time, 18 AS ord
UNION
SELECT 3 AS team, 'c' AS income_level, 9 AS time, 19 AS ord
)
SELECT
income_level,
team_time[1] AS time_1,
team_ord[1] AS ord_1,
team_time[2] AS time_2,
team_ord[2] AS ord_2,
team_time[3] AS time_3,
team_ord[3] AS ord_3
FROM (
SELECT
income_level,
map_agg(team, time) AS team_time,
map_agg(team, ord) AS team_ord
FROM raw_data
GROUP BY income_level
);
Output:
| income_level | time_1 | ord_1 | time_2 | ord_2 | time_3 | ord_3 |
|--------------|--------|-------|--------|-------|--------|-------|
| a | 1 | 11 | 4 | 14 | 7 | 17 |
| b | 2 | 12 | 5 | 15 | 8 | 18 |
| c | 3 | 13 | 6 | 16 | 9 | 19 |
This site provides another example of how this can be done.
Upvotes: 2