user71812
user71812

Reputation: 457

Excel PowerQuery: How to unpivot or transpose a huge table into a readable format for analysis

I have this table that looks similar to this: enter image description here

And I want to transform it so that it looks like this:

enter image description here

The idea is to unpivot( or transpose) the table so that it can be feed into other BI tools, and be readable for analysis.

I have around 20 tables like this with 100+ cols, so of course to do it manually is nearly impossible.

How do I get this done with PowerQuery? I have tried using unpivot feature, but I am stuck as it displayed NYC1, NYC2, etc. VBA, macros don't work also. Any other suggestions are appreciated, but I am at my wits end now. Help!

Upvotes: 0

Views: 1043

Answers (2)

Tim Williams
Tim Williams

Reputation: 166341

Here's a pretty generic depivot approach which handles multiple row/column headers.

Select a cell in the source table before running (note - this uses CurrentRegion so will fail if your table has completely blank rows or columns).

Sub UnpivotIt()

    Dim numRowHeaders As Long, numColHeaders As Long
    Dim numRows As Long, numCols As Long, rng As Range
    Dim rngOut As Range, r As Long, c As Long, i As Long, n As Long
    Dim arrIn, arrOut, outRow As Long

    arrIn = Selection.CurrentRegion.Value

    numRowHeaders = Application.InputBox("How many header rows?", Type:=1)
    numColHeaders = Application.InputBox("How many header columns?", Type:=1)
    Set rngOut = Application.InputBox("Select output (top-left cell)", Type:=8)

    Set rngOut = rngOut.Cells(1) 'in case >1 cells selected

    numRows = UBound(arrIn, 1)
    numCols = UBound(arrIn, 2)

    ReDim arrOut(1 To ((numRows - numRowHeaders) * (numCols - numColHeaders)), _
                 1 To (numRowHeaders + numColHeaders + 1))

    outRow = 0
    For r = (numRowHeaders + 1) To numRows
    For c = (numColHeaders + 1) To numCols
        'only copy if there's a value
        If Len(arrIn(r, c)) > 0 Then
            outRow = outRow + 1
            i = 1
            For n = 1 To numColHeaders 'copy column headers
                arrOut(outRow, i) = arrIn(r, n)
                i = i + 1
            Next n
            For n = 1 To numRowHeaders '...row headers
                arrOut(outRow, i) = arrIn(n, c)
                i = i + 1
            Next n
            arrOut(outRow, i) = arrIn(r, c) '...and the value
        End If
    Next c
    Next r

    rngOut.Resize(outRow, UBound(arrOut, 2)).Value = arrOut

End Sub

Upvotes: 5

Marco Vos
Marco Vos

Reputation: 2968

Before loading into PowerQuery, concatenate the headers to the empty row after [Programe Name] using a delimiter (space). You can use the TEXTJOIN function to do this if you use office365. The result looks something like this (I did not copy all your data):

enter image description here

Import this range into PowerQuery and perform the following steps (Do not check the my table has headers checkbox)

  1. Remove top 2 rows (Home tab > Remove Rows)
  2. Use first row as headers (Home tab > Use First Row as Headers)
  3. Select the first column
  4. Unpivot other columns (dropdown menu unpivot columns on Transform Tab)
  5. Split the [Attribute] column by delimiter (space) (Home tab > Split Column)
  6. Change column names
  7. Move City column to the left (rightclick column > move > left)

The script looks like this:

let
    Source = Excel.CurrentWorkbook(){[Name="table"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type any}, {"Column3", type any}, {"Column4", type any}, {"Column5", type any}, {"Column6", type any}, {"Column7", type any}, {"Column8", type any}, {"Column9", type any}}),
    #"Removed Top Rows" = Table.Skip(#"Changed Type",2),
    #"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Program Name", type text}, {"NY Budget", Int64.Type}, {"NY Revenue", Int64.Type}, {"NY Cost", Int64.Type}, {"NY Margin", Int64.Type}, {"LA Budget", Int64.Type}, {"LA Revenue", Int64.Type}, {"LA Cost", Int64.Type}, {"LA Margin", Int64.Type}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Program Name"}, "Attribute", "Value"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Attribute.1", "Attribute.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Attribute.1", type text}, {"Attribute.2", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{"Attribute.1", "City"}, {"Attribute.2", "Description"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"City", "Program Name", "Description", "Value"})
in
    #"Reordered Columns"

And this is the result (in the Power Query Editor)

enter image description here

Upvotes: 4

Related Questions