Reputation: 13
I am trying to create a pivot table in Postgres from the following table
named product_info
:
CREATE TABLE product_info (
ID varchar(10),
date VARCHAR(20) NOT NULL,
product VARCHAR(20) NOT NULL,
product_cost integer
);
INSERT INTO product_info(ID, date, product, product_cost)
VALUES
('1', 'AUG-23','Laptop',100),
('1', 'AUG-23','Phone',80),
('1', 'AUG-23','Keypad',50),
('1', 'SEP-23','Laptop',200),
('1', 'SEP-23','Phone',100),
('1', 'SEP-23','Ipad',150),
('1', 'SEP-23','Keypad',80),
('2', 'AUG-23','Laptop',100),
('2', 'AUG-23','Phone',90),
('2', 'AUG-23','Keypad',70),
('2', 'SEP-23','Laptop',120),
('2', 'SEP-23','Phone',30),
('2', 'SEP-23','Ipad',80),
('2', 'SEP-23','Keypad',40);
Tried below PIVOT Query
SELECT * FROM crosstab(
'SELECT date, id,product, sum(product_cost)
FROM product_info
GROUP BY 1, 2,3
ORDER BY 1, 2,3'
,
'SELECT DISTINCT product FROM product_info ORDER BY 1'
) as cte(
date text,
id text,
"Ipad" numeric,
"Keypad" numeric,
"Laptop" numeric,
"Phone" numeric
)
RETURNED RESULT
date | id | Ipad | Keypad | Laptop | Phone |
---|---|---|---|---|---|
AUG-23 | 1 | NULL | 70 | 100 | 90 |
SEP-23 | 1 | 80 | 40 | 120 | 30 |
I get a single row per month, with id = '1'
and values for id = '2'
.
I expected one row per ID.
EXPECTED RESULT
id | date | Ipad | Keypad | Laptop | Phone |
---|---|---|---|---|---|
1 | AUG-23 | NULL | 50 | 100 | 80 |
2 | AUG-23 | NULL | 70 | 100 | 90 |
1 | SEP-23 | 150 | 80 | 200 | 100 |
2 | SEP-23 | 80 | 40 | 120 | 30 |
Upvotes: 1
Views: 199
Reputation: 656566
crosstab()
callQuoting the manual for crosstab()
:
source_sql
is an SQL statement that produces the source set of data. This statement must return onerow_name
column, onecategory
column, and onevalue
column. It may also have one or more “extra” columns. Therow_name
column must be first. Thecategory
andvalue
columns must be the last two columns, in that order. Any columns betweenrow_name
andcategory
are treated as “extra”. The “extra” columns are expected to be the same for all rows with the samerow_name
value.
Your query does not produce data in the expected form:
row_name | extra ... | category | value
This query does:
SELECT dense_rank() OVER (ORDER BY date, id) AS rn -- row_name
, to_char(date, 'MON-YY') AS date, id -- extra
, product -- category
, sum(product_cost) AS sum_cost -- value
FROM product_info
GROUP BY date, id, product
ORDER BY date, id, product;
Your "row name" is a composite of (date, id)
. (Simply concatenating could produce ambiguous results.) Generate a single-column row_name
with the window function dense_rank()
as demonstrated. date
and id
can be passed as "extra" columns.
Also, a SELECT
query that generates category names dynamically rarely makes sense, since the static (!) column definition list has to match.
The query is based off an actual date
column. Don't store date information as text
, which does not sort properly.
This crosstab()
call works:
-- CREATE TABLE pivot_test AS (
SELECT date, id, "Ipad", "Keypad", "Laptop", "Phone"
FROM crosstab(
$q$
SELECT dense_rank() OVER (ORDER BY date, id)
, to_char(date, 'MON-YY'), id
, product
, sum(product_cost)
FROM product_info
GROUP BY date, id, product
ORDER BY date, id, product
$q$
, $v$
SELECT unnest('{Ipad, Keypad,Laptop, Phone}'::text[])
$v$
) AS ct(rn int, date text, id int
, "Ipad" int
, "Keypad" int
, "Laptop" int
, "Phone" int);
See:
Typically, crosstab()
is the fastest option. But for only few products, and since this case requires aggregation and a window function anyway, plain SQL may be even faster.
SELECT to_char(date, 'MON-YY') AS date, id
, SUM(product_cost) FILTER (WHERE product = 'Ipad') AS "Ipad"
, SUM(product_cost) FILTER (WHERE product = 'Keypad') AS "Keypad"
, SUM(product_cost) FILTER (WHERE product = 'Laptop') AS "Laptop"
, SUM(product_cost) FILTER (WHERE product = 'Phone') AS "Phone"
FROM product_info
GROUP BY date, id
ORDER BY date, id;
See:
Upvotes: 0
Reputation: 49375
As you can use only obe row with crosstab, you can do something like this.
But simpler is a normal aggregation like the second query
SELECT split_part(id, '%', 1) as id,
split_part(id, '%', 2) as date,
"Ipad" ,
"Laptop" ,
"Phone" ,
"Keypad"
FROM crosstab(
'SELECT CONCAT(id,''%'',date) as id, product, sum(cost)
FROM product_info
GROUP BY 1, 2
ORDER BY 1, 2'
,
'SELECT DISTINCT product FROM product_info ORDER BY 1'
) as cte(
id text,
"Ipad" numeric,
"Laptop" numeric,
"Phone" numeric,
"Keypad" numeric
)
id | date | Ipad | Laptop | Phone | Keypad |
---|---|---|---|---|---|
1 | Aug-23 | null | 50 | 250 | 100 |
1 | Sep-23 | 200 | 60 | 260 | 100 |
2 | Aug-23 | null | 80 | 220 | 100 |
2 | Sep-23 | 90 | 80 | 100 | 100 |
SELECT
id,date,
SUM(CASE WHEN product = 'Ipad' THEN cost END ) as "Ipad",
SUM(CASE WHEN product = 'Laptop' THEN cost END ) as "Laptop",
SUM(CASE WHEN product = 'Phone' THEN cost END ) as "Phone",
SUM(CASE WHEN product = 'Keypad' THEN cost END ) as "Keypad"
FROM
product_info
GROUP BY id,date
ORDER BY id,date
id | date | Ipad | Laptop | Phone | Keypad |
---|---|---|---|---|---|
1 | Aug-23 | null | 250 | 100 | 50 |
1 | Sep-23 | 200 | 260 | 100 | 60 |
2 | Aug-23 | null | 220 | 100 | 80 |
2 | Sep-23 | 90 | 100 | 100 | 80 |
Upvotes: 0