Reputation: 1232
I have this data
ID Month PRODUCT VALUE_1 VALUE_2
1234 1 a 34 12
1233 2 B 54
1245 3 c 23 42
1236 4 d 12 8
1238 1 a 56 5
1239 2 B 42 1
1234 3 c 32 6
1233 4 d 3
1245 1 a 8 6
1236 2 B 5 2
1238 3 c 1 6
1239 4 d 2
1234 1 a 15 4
1233 2 c 8 12
1245 3 d 15 6
1236 4 b 1 1
1238 1 c 14 10
1239 2 d 13 6
1234 3 c 13 17
1233 4 b 15 5
1245 1 c 18 11
1236 2 d 12 15
1238 3 c 8 12
1239 4 a 17 4
and trying to reach to this:
SUM a b c d
ID a_1 a_2 a_3 a_4 b_1 b_2 b_3 b_4 c_1 c_2 c_3 c_4 d_1 d_2 d_3 d_4
1233 0 0 0 0 0 54 0 15 0 8 0 0 0 0 0 0
1234 49 0 0 0 0 0 0 0 0 0 45 0 0 0 0 0
1236 0 0 0 0 0 5 0 1 0 0 0 0 0 12 0 12
1238 56 0 0 0 0 0 0 0 14 0 9 0 0 0 0 0
1239 0 0 0 17 0 42 0 0 0 0 0 0 0 13 0 2
1245 8 0 0 0 0 0 0 0 18 0 23 0 0 0 15 0
Basically, I want to see each product value aggregated for each month;
In python would be the equivalent to, just one line of code.
data.pivot_table(index=['ID'], columns=["Product","Month"],
values=["Value_1","Value_2"],
aggfunc=np.sum,
dropna = False, fill_value=0)
I've tried this:
WITH pivot_table as (SELECT * FROM
(
SELECT ID, PRODUCT,Value_1,Value_2, Month
FROM DATASET
) New_dataset
PIVOT (
--#2 aggregation
SUM(Value_1) as value1 , SUM(Value_2) as value2
--#3 Pivot_column
FOR PRODUCT IN ('a','b','c','d')
))
#as pivot_table
SELECT ID ,Month, SUM(value1_a) as a_1, SUM(value2_a) as a_2 FROM pivot_table
GROUP BY ID, Month
ORDER BY value_1
I've got so far to this table, which ignores the blanks, and still needs to be transposed
ID MONTH value1_a
1234 1 49
1238 1 56
1239 4 17
1245 1 8
But I will still need to run for all the others and concatenate? or do i have to write all the products? and then transpose? SQL might be able to do this in one go right? or am I thinking to much on python way?
Upvotes: 1
Views: 313
Reputation: 172954
I will still need to run for all the others and concatenate? or do i have to write all the products? and then transpose? SQL might be able to do this in one go right?
Below solution makes it
execute immediate (
select '''select * from (select id, lower(product) || '_' || month product_month, value_1 from `project.dataset.table`)
pivot(sum(value_1) for product_month in ("''' || string_agg(product_month, '", "') || '''"))
order by id
'''
from (
select product || '_' || month product_month
from (select distinct lower(product) product from `project.dataset.table`)
cross join (select distinct month from `project.dataset.table`)
order by product_month)
);
If applied to sample data in your question - output is
Upvotes: 1