Reputation: 1137
bear with me, this is my first attempt using the Power Query Formula Language. I need some advice on how to solve a particular problem sorting and filtering source data.
I now got this current source data, structured like this:
Using this power query:
let
Source = Excel.CurrentWorkbook(){[Name="EmployeeOrganization"]}[Content],
ListEmployees = Table.Group(Source, {"Organization"}, {{"Employee", each Text.Combine([Employee],","), type text}}),
CountEmployees = Table.AddColumn(ListEmployees, "Count", each List.Count(Text.Split([Employee],","))),
SplitEmployees = Table.SplitColumn(ListEmployees, "Employee", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),List.Max(CountEmployees[Count])),
Transpose = Table.Transpose(SplitEmployees),
PromoteHeaders = Table.PromoteHeaders(Transpose, [PromoteAllScalars=true])
in
PromoteHeaders
I am able to produce the following result:
To avoid having to add the organization name to every single employee in the source, I would like the organization name to act as an parent-group, with the employees as children. I would also like the result to only fetch the organizations (+ employees) that has status Active = Yes.
The desired source should look similar to this:
So that the desired result should look similar to this: (Apple is gone due to Active = NO)
I am stuck at this point and need some advice on how can I modify my Power Query Formula to:
(Excel file can be found her)
Upvotes: 0
Views: 491
Reputation: 60224
In PQ, you'll need to fill in the blank rows, then Pivot with no aggregation.
See the comments in the code, and follow the Applied Steps
to understand the algorithm
Custom Function
Rename: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
ColToPivot as text,
ColForValues as text)=>
let
PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
#"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
TableFromRecordOfLists = (rec as record, fieldnames as list) =>
let
PartialRecord = Record.SelectFields(rec,fieldnames),
RecordToList = Record.ToList(PartialRecord),
Table = Table.FromColumns(RecordToList,fieldnames)
in
Table,
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
#"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
#"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
#"Expanded Values"
Basic Query
let
//Read in data and set data types
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8k12yc9LzEkpVtJRAqLI1GKlWJ1oEDMgtSS1CCQK5XvlpyLzEvPgXMeCgpxUiH6/fJgC38SiSiT1jjmZyXAN7vn56TAdyDYmluYgaXHKTwLzYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Organization = _t, Employee = _t, Active = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Organization", type text}, {"Employee", type text}, {"Active", type text}}),
//replace blanks with null if not already there
#"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Organization", "Employee", "Active"}),
//fill down the Company and active columns
#"Filled Down" = Table.FillDown(#"Replaced Value",{"Organization", "Active"}),
//Filter to show only Active="Yes and Employee not null
#"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Employee] <> null) and ([Active] = "Yes")),
//Pivot with no aggregation
//could do this with grouping, but easier (and maybe faster, with a custom function
pivotAll = fnPivotAll(#"Filtered Rows","Organization","Employee"),
//remove unneeded Active column and set data types
#"Removed Columns" = Table.RemoveColumns(pivotAll,{"Active"}),
typed = Table.TransformColumnTypes(#"Removed Columns",
List.Transform(Table.ColumnNames(#"Removed Columns"),each {_, Text.Type}))
in
typed
Upvotes: 1