hamslice
hamslice

Reputation: 149

Recombine the data from 2 columns in Excel into a single column

I am limited to using Excel 2016. I know in Python I can achieve this very easily, but looking for a solution to make my life easier. I have a fairly large dataset which -generally- looks like the table below (the real dataset also broken into 3 shifts, but I have simplified it a bit):

Notes on the dataset

Example raw data

Date Machine No. Product Output Employees
01-Aug-2022 1 ABC 3,100,100 BOB/JON
01-Aug-2022 2 DCE 2,300,000 BOB/CATH/AMY
01-Aug-2022 3 EFG 4,500,6000 ZEE/IAN/GAZ
02-Aug-2022 1 ABC/HIJ 1,100,100/900,000 BOB/JON
02-Aug-2022 2 DCE 2,300,000 AMY
02-Aug-2022 3 EFG 4,500,6000 ZEE/IAN/GAZ
03-Aug-2022 1 HIJ/LMN 1,100,100/1,900,000 BOB
03-Aug-2022 2 DCE 2,300,000 GAZ
03-Aug-2022 3 EFG/PQR 1,500,600/1,700,000 ZEE/IAN/JON

What I have done so far...

I can use the "Text to Data" function in Excel, using the "/" character as a delimiter, to create new columns, which results in something like this:

Date Machine No. Product1 Product2 Output1 Output2 Employee1 Employee2 Employee3
01/Aug/2022 1 ABC 3,100,100 BOB JON
01/Aug/2022 2 DCE 2,300,000 BOB CATH AMY
01/Aug/2022 3 EFG 4,500,6000 ZEE IAN GAZ
02/Aug/2022 1 ABC HIJ 1,100,100 900,000 BOB JON
02/Aug/2022 2 DCE 2,300,000 AMY
02/Aug/2022 3 EFG 4,500,6000 ZEE IAN GAZ
03/Aug/2022 1 HIJ LMN 1,100,100 1,900,000 BOB
03/Aug/2022 2 DCE 2,300,000 GAZ
03/Aug/2022 3 EFG PQR 1,500,600 1,700,000 ZEE IAN JON

What I want to achieve...

My ideal output would be the following:

  1. When there are product changeovers on a shift, I would like the additional columns to be reintegrated back into the table, (as below)
  2. I want to count the number of unique employees for each shift. I currenlty have a formula to count this, =SUMPRODUCT(($AJ$48:$AL$56<>"")/COUNTIF($AJ$48:$AL$56,$AJ$48:$AL$56&"") , but I have to manually update the formula for every shift.
Date Machine No. Product Output Employee1 Employee2 Employee3 Total Employees
01/Aug/2022 1 ABC 3,100,100 BOB JON 7
01/Aug/2022 2 DCE 2,300,000 BOB CATH AMY 7
01/Aug/2022 3 EFG 4,500,6000 ZEE IAN GAZ 7
02/Aug/2022 1 ABC 1,100,100 BOB JON 6
02/Aug/2022 1 HIJ 900,000 BOB JON 6
02/Aug/2022 2 DCE 2,300,000 AMY 6
02/Aug/2022 3 EFG 4,500,6000 ZEE IAN GAZ 6
03/Aug/2022 1 HIJ 1,100,100 BOB 5
03/Aug/2022 1 LMN 1,900,000 BOB 5
03/Aug/2022 2 DCE 2,300,000 GAZ 5
03/Aug/2022 3 EFG 1,500,600 ZEE IAN JON 5
03/Aug/2022 3 PQR 1,700,000 ZEE IAN JON 5

Upvotes: 1

Views: 165

Answers (1)

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60474

This can also be accomplished using Power Query, available in Windows Excel 2010+ and Excel 365 (Windows or Mac)

To use Power Query

  • Select some cell in your Data Table

  • Data => Get&Transform => from Table/Range or from within sheet

  • When the PQ Editor opens: Home => Advanced Editor

  • Make note of the Table Name in Line 2

  • Paste the M Code below in place of what you see

  • Change the Table name in line 2 back to what was generated originally.

  • Read the comments and explore the Applied Steps to understand the algorithm

  • You will also need a Custom Function for this method

  • After doing the above:

    • Right Click in the Queries window (on the left)
    • New Query => Other Sources => Blank Queryf
    • Paste the Custom function into that window
    • Rename the Custom function as per the notes

Main Code

let

//Read in Data
//Change next line to reflect actual data source
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],

//set data types
//seting "Output" to type text for the "split"
    #"Changed Type" = Table.TransformColumnTypes(Source,{
        {"Date", type date}, {"Machine No.", Int64.Type}, {"Product", type text}, {"Output", type text}, {"Employees", type text}}),

//Unpivot the columns we want to split into rows
//   This would be the Product and Output columns
// then split them into rows
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Date", "Machine No.", "Employees"}, "Attribute", "Value"),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Unpivoted Columns", {{"Value", Splitter.SplitTextByDelimiter("/", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Value"),

//Pivot with no aggregation
//  Because of multiple items in each category, this is best done with a custom function
    Pivot = fnPivotAll(#"Split Column by Delimiter","Attribute","Value"),

//Set the data types, then re-order the columns and split the Employee column into a maximum of three new columns
    #"Changed Type1" = Table.TransformColumnTypes(Pivot,{{"Product", type text}, {"Output", Int64.Type}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Changed Type1",{"Date", "Machine No.", "Product", "Output", "Employees"}),
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Reordered Columns", "Employees", 
            Splitter.SplitTextByDelimiter("/", QuoteStyle.Csv), 
            {"Employees.1", "Employees.2", "Employees.3"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Employees.1", type text}, {"Employees.2", type text}, {"Employees.3", type text}}),

//Group by shift (= data?)
// Then count the distinct employees for each shift
    #"Grouped Rows" = Table.Group(#"Changed Type2", {"Date"}, {
        {"Count", each _, type table [Date=nullable date, #"Machine No."=nullable number, Product=nullable text, Output=nullable number, Employees.1=nullable text, Employees.2=nullable text, Employees.3=nullable text]},
        {"Total Employees", each 
            List.Count(
                List.Distinct(
                    List.RemoveNulls(
                        List.Combine(
                            Table.ToColumns(
                                Table.SelectColumns(_, List.LastN(Table.ColumnNames(_),3))))))), Int64.Type}
    }),

//Re-expand the grouped table
    #"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"Machine No.", "Product", "Output", "Employees.1", "Employees.2", "Employees.3"}, {"Machine No.", "Product", "Output", "Employees.1", "Employees.2", "Employees.3"})
in
    #"Expanded Count"

Custom Function
Rename fnPivotAll

//credit: Cam Wallace  https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/

//Rename:  fnPivotAll 

(Source as table,
    ColToPivot as text,
    ColForValues as text)=> 

let
     PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
     #"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
 
    TableFromRecordOfLists = (rec as record, fieldnames as list) =>
    
    let
        PartialRecord = Record.SelectFields(rec,fieldnames),
        RecordToList = Record.ToList(PartialRecord),
        Table = Table.FromColumns(RecordToList,fieldnames)
    in
        Table,
 
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
    #"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
    #"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
    #"Expanded Values"

Source Data
enter image description here

Results
enter image description here

Upvotes: 1

Related Questions