Reputation: 493
I have a input redshift dataset like the one below:
car_id | driver_id | salesman | tyre_id | price_min | price_max | price_avg
-------+-----------+----------+---------+-----------+-----------+----------
A | 1 | 1 | 9 | 1 | 1 | 1
A | 2 | 1 | 9 | 1 | 1 | 1
A | 3 | 1 | 9 | 1 | 1 | 1
A | 4 | 1 | 9 | 1 | 1 | 1
A | 1 | 2 | 7 | 0 | 1 | 1
A | 2 | 2 | 7 | 0 | 0 | 1
A | 3 | 2 | 7 | 0 | 1 | 1
A | 4 | 2 | 7 | 0 | 0 | 0
I would like to pivot multiple columns whose values may be dynamic in nature. I want to pivot the columns salesman, tyre_id with the values price_min, price_avg, price_max. Basically, each driver_id will have 1 single row.
I can achieve it in python but I need a query as my requirement.
I tried concatenating salesman and tyreId. It partially works but I need help from the expert community for this to achieve fully. Can anyone help me and guide me through this?
This is the query I tried:
CREATE TABLE public.temp_car_id
(
car_id VARCHAR(10),
driver_id INT,
salesman INT,
tyre_id INT,
price_min INT,
price_max INT,
price_avg INT
);
INSERT INTO public.temp_car_id
VALUES
('A',1,1,9,0.61,0.89,0.91),
('A',2,1,9,0.63,0.93,0.58),
('A',3,1,9,0.91,0.83,0.99),
('A',4,1,9,0.53,0.84,0.79),
('A',1,2,7,0.12,0.97,0.87),
('A',2,2,7,0.13,0.30,0.84),
('A',3,2,7,0.17,0.62,0.63),
('A',4,2,7,0.09,0.01,0.19);
SELECT car_id,
driver_id,
CONCAT('.',CONCAT(CONCAT(salesman,'.'), tyre_id)) AS sales_tyre,
price_min,
price_max,
price_avg INTO #temp_car_id_unpivot
FROM public.temp_car_id;
SELECT *
FROM #temp_car_id_unpivot PIVOT (MAX(price_min)
FOR
sales_tyre IN ('.1.9','.2.7'));
Upvotes: 0
Views: 4402
Reputation: 11032
For the example you give I have some more general SQL below but this is for the static case you put in your example. Not sure if you are looking for a general / dynamic pivot functionality but if you are you will need to create some sort of SQL generator that takes the column value you want to pivot as input. This generator can be Redshift external code or a stored procedure but cannot be done in basic SQL on Redshift. The code below can be used as a template for use with such a SQL generator.
Set up (you had some decimal values defined as INT):
CREATE TABLE public.temp_car_id
(
car_id VARCHAR(10),
driver_id INT,
salesman INT,
tyre_id INT,
price_min decimal(8,2),
price_max decimal(8,2),
price_avg decimal(8,2)
);
INSERT INTO public.temp_car_id
VALUES
('A',1,1,9,0.61,0.89,0.91),
('A',2,1,9,0.63,0.93,0.58),
('A',3,1,9,0.91,0.83,0.99),
('A',4,1,9,0.53,0.84,0.79),
('A',1,2,7,0.12,0.97,0.87),
('A',2,2,7,0.13,0.30,0.84),
('A',3,2,7,0.17,0.62,0.63),
('A',4,2,7,0.09,0.01,0.19);
SQL code:
select car_id, driver_id,
min(avg_price_min_s1_t9) as avg_price_min_s1_t9,
min(avg_price_max_s1_t9) as avg_price_max_s1_t9,
min(avg_price_avg_s1_t9) as avg_price_mavg_s1_t9,
min(avg_price_min_s2_t7) as avg_price_min_s2_t7,
min(avg_price_max_s2_t7) as avg_price_max_s2_t7,
min(avg_price_avg_s2_t7) as avg_price_mavg_s2_t7
from (
SELECT car_id,
driver_id,
avg(case when tyre_id = 9 and salesman = 1 then price_min end) over( partition by car_id, driver_id, salesman, tyre_id)
as avg_price_min_s1_t9,
avg(case when tyre_id = 9 and salesman = 1 then price_max end) over( partition by car_id, driver_id, salesman, tyre_id)
as avg_price_max_s1_t9,
avg(case when tyre_id = 9 and salesman = 1 then price_avg end) over( partition by car_id, driver_id, salesman, tyre_id)
as avg_price_avg_s1_t9,
avg(case when tyre_id = 7 and salesman = 2 then price_min end) over( partition by car_id, driver_id, salesman, tyre_id)
as avg_price_min_s2_t7,
avg(case when tyre_id = 7 and salesman = 2 then price_max end) over( partition by car_id, driver_id, salesman, tyre_id)
as avg_price_max_s2_t7,
avg(case when tyre_id = 7 and salesman = 2 then price_avg end) over( partition by car_id, driver_id, salesman, tyre_id)
as avg_price_avg_s2_t7
FROM public.temp_car_id ) a
group by 1,2
order by 1,2;
It is unclear what aggregate function you want to use for combining these values as you only have 1 value per type so I used avg().
Upvotes: 1