Reputation: 107
I am using Excel power query to combine and process data from multiple csv files located in a folder. All csv files have the same column structure (shown below) but a different number of rows or records.
Year | Period | Week | District | Route | SaleAmt |
---|---|---|---|---|---|
2024 | 10 | 1 | 123 | ABC | 1000 |
Here is the link to the sample data files: sample data files
Once data from all csv files is combined into one table, I create a new column "PK" by concatenating the values in fields "Period", "Week", "District" and "Route".
Next, to get the list of unique "PK" values in the combined data, I group by the "PK" column and aggregate the "SaleAmt" column using the operation "All Rows". This will create a new column containing tables of all rows grouped by the unique "PK" values.
Finally, I need to generate a table structure as follows, which maps the "SaleAmt" value from each csv file to the corresponding "PK" value:
PK | Year | Period | Week | District | Route | File-1 SaleAmt | File-2 SaleAmt | File-3 SaleAmt |
---|---|---|---|---|---|---|---|---|
101123ABC | 2024 | 10 | 1 | 123 | ABC | 1000 | 500 | 800 |
11469HBG | 2024 | 11 | 4 | 69 | HBG | null | null | 999 |
Note:
Here is what I have done so far: (M-code below)
let
Source = Folder.Files("Folder containing csv files"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File", Table.ColumnNames(#"Transform File"(#"Sample File"))),
#"Added Custom" = Table.AddColumn(#"Expanded Table Column1", "PK", each Text.Combine({[Period],[Week],[District],[Route]},"")),
#"Grouped Rows" = Table.Group(#"Added Custom", {"PK"}, {{"SaleAmt", each _, type table [Source.Name=text, Year=text, Period=text, Week=text, District=text, Route=text, SaleAmt=text, PK=text]}})
in
#"Grouped Rows"
How do I get to the final table structure after creating a unique list of "PK" values? Any help / pointers will be appreciated.
Upvotes: 1
Views: 128
Reputation: 12061
Expand then do a pivot.
let
Source = Folder.Files("Folder containing csv files"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File", Table.ColumnNames(#"Transform File"(#"Sample File"))),
#"Added Custom" = Table.AddColumn(#"Expanded Table Column1", "PK", each Text.Combine({[Period],[Week],[District],[Route]},"")),
#"Grouped Rows" = Table.Group(#"Added Custom", {"PK"}, {{"SaleAmt", each _, type table [Source.Name=text, Year=text, Period=text, Week=text, District=text, Route=text, SaleAmt=text, PK=text]}}),
#"Expanded SaleAmt" = Table.ExpandTableColumn(#"Grouped Rows", "SaleAmt", {"Year", "Period", "Week", "District", "Route", "Source.Name", "SaleAmt"}, {"Year", "Period", "Week", "District", "Route", "File", "SaleAmt"}),
#"Pivoted Column" = Table.Pivot(#"Expanded SaleAmt", List.Distinct(#"Expanded SaleAmt"[File]), "File", "SaleAmt")
in
#"Pivoted Column"
Upvotes: 1