pin4o
pin4o

Reputation: 3

Transpose repeating data from rows into columns Excel

I have data set of basic housing data in the following format:

Existing data format:

enter image description here

That format is the same and reapeats for hundrets of properties. I would like to transform that that into a table format like the following example:

Property Type Price Location Region Additional info Area
House 252000 London Kensington 4500 square meters
... ... ... ... ... etc

In other words I want to make the text before ":" symbol column name with the text after it the data that goes into into the corresponding cell and to repeat that for hundrets of sites. Usually there is missing(no data) in Additional info but sometimes there is. I am not shure which is the best program to do this. So far in my mind comes Excel but if there is an easier way I will be glad to use it.

Upvotes: 0

Views: 1672

Answers (2)

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60174

You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel

  • Select some cell in your original table
  • Data => Get&Transform => From Table/Range
  • When the PQ UI opens, navigate to Home => Advanced Editor
  • Make note of the Table Name in Line 2 of the code.
  • Replace the existing code with the M-Code below
  • Change the table name in line 2 of the pasted code to your "real" table name
  • Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps

Note: The fnPivotAll function is a custom function that enables a method of creating a non-aggregated Pivot Table where there are multiple values per Pivot Column. From the UI, you add this as a New Query from Blank, and just paste that M-code in place of what's there

M-Code (for main query)

let

    //Read in data
    //Change table name in next line to your actural table name
    Source = Excel.CurrentWorkbook(){[Name="Table1_2"]}[Content],

    //Split by comma into new rows
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1", 
        Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), 
            let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),

    //Remove the blank rows
    #"Filtered Rows" = Table.SelectRows(#"Split Column by Delimiter", each ([Column1] <> "" and [Column1] <> " ")),

    //Split by the rightmost colon only into new columns
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Filtered Rows", "Column1", 
        Splitter.SplitTextByEachDelimiter({":"}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),

    //Split by the remaining colon into new rows
    // So as to have empty rows under "Additional data"
    //Then Trim the columns to remove leading/trailing spaces
    #"Split Column by Delimiter2" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter1", {{"Column1.1", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1.1"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Column1.1", type text}, {"Column1.2", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),

    //Create new column processing "Additional Data" to show a blank
    //  and Price to just show the numeric value, splitting from "EUR"
    #"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each if [Column1.1] = "Additional data" then " " 
        else if [Column1.1] = "Price" then Text.Split([Column1.2]," "){1} else [Column1.2]),

    //Remove unneeded column
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1.2"}),

    //non-aggregated pivot
    pivot = fnPivotAll(#"Removed Columns","Column1.1","Custom"),

    //set data types (frequently a good idea in PQ
    #"Changed Type1" = Table.TransformColumnTypes(pivot,{
        {"Property type", type text}, 
        {"Location", type text}, 
        {"region", type text}, 
        {"Additional data", type text}, 
        {"Area", type text}, 
        {"Price", Currency.Type}})
in
    #"Changed Type1"

M-Code (for custom function)
be sure to rename this query: fnPivotAll

//credit: Cam Wallace  https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/

(Source as table,
    ColToPivot as text,
    ColForValues as text)=> 

let
     PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
     #"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
 
    TableFromRecordOfLists = (rec as record, fieldnames as list) =>
    
    let
        PartialRecord = Record.SelectFields(rec,fieldnames),
        RecordToList = Record.ToList(PartialRecord),
        Table = Table.FromColumns(RecordToList,fieldnames)
    in
        Table,
 
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
    #"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
    #"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
    #"Expanded Values"

enter image description here

Upvotes: 0

Harun24hr
Harun24hr

Reputation: 36770

As per my below screenshot Excel 365 I have used following formulas.

C2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,1,4)),": ","</s><s>")&"</s></t>","//s[last()]")
D2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,2,4)),": ","</s><s>")&"</s></t>","//s[last()]")
E2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[2]")
F2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[last()-1]")
H2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,4,4)),": ","</s><s>")&"</s></t>","//s[last()]")

If you are not in Excel 365 then can try-

=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3),": ","</s><s>")&"</s></t>","//s[last()]")

Basically =ROW(A1)+(ROW(A1)-1)*3 will generate a sequence of row numbers and INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3) will return value from Column A as per that sequence. Then FILTERXML() will return expected value specified in xPath parameter.

To know, how FILTERXML() works yo can read this article from JvdV. This is a fantastic article for FILTERXML() lover.

enter image description here

Upvotes: 2

Related Questions