Reputation: 1
I have an excel file which has a pivoted table. I want to bring it into a long format so the same can be inserted into a DB
.
loading the file in df.
file_path = '/content/drive/My Drive/Experiment_unpivot.xlsx' df = pd.read_excel(file_path,header=[0,1])
df the columns are merged in the excel file so getting
Unnamed: 0_level_0 2023-09-01 00:00:00
Please advise what needs to be done. Using pd.melt or using power query. I want the melted table
Upvotes: 0
Views: 80
Reputation: 21428
Your data in general makes no sense because it has the same column names twice under each month. However, here is sample code for Powerquery
let Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Removed Top Rows" = Table.Skip(Source,1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows"),
GetMonthNames = Table.FromColumns({Record.FieldValues(Source{0})}),
#"Filled Down" = Table.ToColumns(Table.FillDown(GetMonthNames,{"Column1"})){0},
NewColumnNames_Part1 = List.Zip ({#"Filled Down",Table.ColumnNames(#"Promoted Headers")}),
NewColumnNames_Part2 = List.Transform(NewColumnNames_Part1, each Text.Combine(_,"-")),
#"Rename Columns"=Table.RenameColumns(Source,List.Zip({Table.ColumnNames(Source),NewColumnNames_Part2})) ,
#"Removed Two Rows" = Table.Skip(#"Rename Columns",2),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Removed Two Rows", {"Products"}, "Attribute", "Value")
in #"Unpivoted Other Columns"
You can right-click the Attribute column and split on - character if you want
Upvotes: 0