Reputation: 21703
I'm trying to normalise some data that is supplied in Excel. The data is made up of a number of dimension columns followed by several measure columns over time. Unfortunately the data comes in with a single "Measure/Year" identifier which means that if there are 10 years of data and 4 measures, there will be 40 measure columns.
I can't select specific columns to unpivot as the number of columns will change over time and I want to automate this completely.
A simplified sample of data looks like this (just showing 2 measures over 3 years in this example - but potentially 5 measures over an ever increasing number of years).
Country | Category | Product | QTY_2018 | QTY_2019 | QTY_2020 | Value_2018 | Value_2019 | Value_2020 |
---|---|---|---|---|---|---|---|---|
France | Fruit | Apple | 10 | 20 | 30 | 11 | 22 | 33 |
France | Fruit | Orange | 40 | 50 | 60 | 44 | 55 | 66 |
Germany | Veg | Carrot | 70 | 80 | 90 | 77 | 88 | 99 |
What I would like to achieve is...
Country | Category | Product | Year | QTY | Value |
---|---|---|---|---|---|
France | Fruit | Apple | 2018 | 10 | 11 |
France | Fruit | Apple | 2019 | 20 | 22 |
France | Fruit | Apple | 2020 | 30 | 33 |
France | Fruit | Orange | 2018 | 40 | 44 |
France | Fruit | Orange | 2019 | 50 | 55 |
France | Fruit | Orange | 2020 | 60 | 66 |
Germany | Veg | Carrot | 2018 | 70 | 77 |
Germany | Veg | Carrot | 2019 | 80 | 88 |
Germany | Veg | Carrot | 2020 | 90 | 99 |
So far I have selected all the non-measure columns and then applied a transform "Unpivot other columns", and then creating 2 custom columns to get the measure name (Qty or Value in this example) and the year. This gets around the problem of the varying number of measure columns but that only gets me so far.
I now have data that looks like this
Country | Category | Product | Year | Measure | Amount |
---|---|---|---|---|---|
France | Fruit | Apple | 2018 | QTY | 10 |
France | Fruit | Apple | 2018 | Value | 11 |
and so on...
Notes:
If this can be done in the datamodel I'm happy to go with that too.
I maybe going about this the wrong way with my attempts so far but my Power Query knowledge is pretty basic so any help would be gratefully received.
Upvotes: 0
Views: 314
Reputation: 7891
You're nearly there. Just Pivot on your "Measure" column, to complete the output:
Unpivoted = Table.UnpivotOtherColumns(Source, {"Country", "Category", "Product"}, "Attribute", "Value"),
#"Split Column" = Table.SplitColumn(Unpivoted, "Attribute", Splitter.SplitTextByEachDelimiter({"_"}, QuoteStyle.Csv, false), {"Measure", "Year"}),
#"Pivoted Column" = Table.Pivot(#"Split Column", List.Distinct(#"Split Column"[Measure]), "Measure", "Value")
Upvotes: 1
Reputation: 34075
You should be able to just repivot on the new Measure column to get your desired result now.
Upvotes: 0