Multiple columns dynamic pivoting in redshift database

Question

I have a input redshift dataset like the one below:

car_id | driver_id | salesman | tyre_id | price_min | price_max | price_avg
-------+-----------+----------+---------+-----------+-----------+----------
A      |         1 |        1 |       9 |         1 |         1 |         1
A      |         2 |        1 |       9 |         1 |         1 |         1
A      |         3 |        1 |       9 |         1 |         1 |         1
A      |         4 |        1 |       9 |         1 |         1 |         1
A      |         1 |        2 |       7 |         0 |         1 |         1
A      |         2 |        2 |       7 |         0 |         0 |         1
A      |         3 |        2 |       7 |         0 |         1 |         1
A      |         4 |        2 |       7 |         0 |         0 |         0

I would like to pivot multiple columns whose values may be dynamic in nature. I want to pivot the columns salesman, tyre_id with the values price_min, price_avg, price_max. Basically, each driver_id will have 1 single row.

I can achieve it in python but I need a query as my requirement.

The is the expected output:

I tried concatenating salesman and tyreId. It partially works but I need help from the expert community for this to achieve fully. Can anyone help me and guide me through this?

This is the query I tried:

CREATE TABLE public.temp_car_id 
(
  car_id      VARCHAR(10),
  driver_id   INT,
  salesman    INT,
  tyre_id     INT,
  price_min   INT,
  price_max   INT,
  price_avg   INT
);

INSERT INTO public.temp_car_id
VALUES
('A',1,1,9,0.61,0.89,0.91),
('A',2,1,9,0.63,0.93,0.58),
('A',3,1,9,0.91,0.83,0.99),
('A',4,1,9,0.53,0.84,0.79),
('A',1,2,7,0.12,0.97,0.87),
('A',2,2,7,0.13,0.30,0.84),
('A',3,2,7,0.17,0.62,0.63),
('A',4,2,7,0.09,0.01,0.19);
     

SELECT car_id,
       driver_id,
       CONCAT('.',CONCAT(CONCAT(salesman,'.'), tyre_id)) AS sales_tyre,
       price_min,
       price_max,
       price_avg INTO #temp_car_id_unpivot
FROM public.temp_car_id;

SELECT *
FROM #temp_car_id_unpivot PIVOT (MAX(price_min) 
     FOR
     sales_tyre IN ('.1.9','.2.7'));

Bill Weiner · Accepted Answer

For the example you give I have some more general SQL below but this is for the static case you put in your example. Not sure if you are looking for a general / dynamic pivot functionality but if you are you will need to create some sort of SQL generator that takes the column value you want to pivot as input. This generator can be Redshift external code or a stored procedure but cannot be done in basic SQL on Redshift. The code below can be used as a template for use with such a SQL generator.

Set up (you had some decimal values defined as INT):

CREATE TABLE public.temp_car_id 
(
  car_id      VARCHAR(10),
  driver_id   INT,
  salesman    INT,
  tyre_id     INT,
  price_min   decimal(8,2),
  price_max   decimal(8,2),
  price_avg   decimal(8,2)
);

INSERT INTO public.temp_car_id
VALUES
('A',1,1,9,0.61,0.89,0.91),
('A',2,1,9,0.63,0.93,0.58),
('A',3,1,9,0.91,0.83,0.99),
('A',4,1,9,0.53,0.84,0.79),
('A',1,2,7,0.12,0.97,0.87),
('A',2,2,7,0.13,0.30,0.84),
('A',3,2,7,0.17,0.62,0.63),
('A',4,2,7,0.09,0.01,0.19);

SQL code:

select car_id, driver_id,
  min(avg_price_min_s1_t9) as avg_price_min_s1_t9,
  min(avg_price_max_s1_t9) as avg_price_max_s1_t9,
  min(avg_price_avg_s1_t9) as avg_price_mavg_s1_t9,
  min(avg_price_min_s2_t7) as avg_price_min_s2_t7,
  min(avg_price_max_s2_t7) as avg_price_max_s2_t7,
  min(avg_price_avg_s2_t7) as avg_price_mavg_s2_t7
from (
  SELECT car_id,
    driver_id,
    avg(case when tyre_id = 9 and salesman = 1 then price_min end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_min_s1_t9,
    avg(case when tyre_id = 9 and salesman = 1 then price_max end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_max_s1_t9,
    avg(case when tyre_id = 9 and salesman = 1 then price_avg end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_avg_s1_t9,
    avg(case when tyre_id = 7 and salesman = 2 then price_min end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_min_s2_t7,
    avg(case when tyre_id = 7 and salesman = 2 then price_max end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_max_s2_t7,
    avg(case when tyre_id = 7 and salesman = 2 then price_avg end) over( partition by car_id, driver_id, salesman, tyre_id)
        as avg_price_avg_s2_t7
  FROM public.temp_car_id ) a
group by 1,2
order by 1,2;

It is unclear what aggregate function you want to use for combining these values as you only have 1 value per type so I used avg().

Multiple columns dynamic pivoting in redshift database

Answers (1)

Related Questions