Reputation: 65
I have a dataset table in PowerBi with this structure:
It's pretty straight forward; the column "Duration (s)" is the difference in seconds between "DateTime End" and "DateTime Start.
I need to create a visualization (a table/matrix) which would give me this result:
Basically it's a sort of group by of columns "Product" and "Machine" based on contiguous values of the column "Event Code" where i keep the first row "DateTime Start" and the last row "DateTime End", sum the rows "Duration (s)" values (or calculate it ex-novo, since it's always a time diff) and count the contiguous rows with the same "Event Code" in the column "Rows Count".
Since the dataset is fairly large (5m+ rows) i'd prefer a solution based on Dax rather than an M-based one, but any will suffice.
Upvotes: 2
Views: 712
Reputation: 60494
With M code, starting with your initial table, where I presume you've added a custom column for Duration (s)
:
Then, open the Advanced Editor and add the optional final argument for the Table.Group function: GroupKind.Local
This is what that argument is for
let
//orginal table
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZLBCsMgDIZfZeRcqInald1s77t0t+Jp9AXKxl5/rqcm0QVR+CAfJr+uKyzPfftABw8sRyqb+rIc0WW4oRMcIXeGEjmT08qkSjijraDgYCuBs680llQJZ6+VWZUwDuIW0oqT41eUZCQWbUW+y2gro5iNDuX+fv0EryNufJc/gmgKr1KYVUEtq7bQSOokWDkNkPMX", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Product = _t, Machine = _t, #"Event Code" = _t, #"DateTime Start" = _t, #"DateTime End" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Product", type text}, {"Machine", type text}, {"Event Code", type text},
{"DateTime Start", type datetime}, {"DateTime End", type datetime}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Duration (s)", each Duration.TotalSeconds([DateTime End]-[DateTime Start])),
//added code to group by contiguous event code groups
#"Grouped Rows" = Table.Group(#"Added Custom", {"Product", "Machine", "Event Code"}, {
{"DateTime Start", each List.Min([DateTime Start]), type nullable datetime},
{"DateTime End", each List.Max([DateTime End]), type nullable datetime},
{"Duration (s)", each List.Sum([#"Duration (s)"]), type number},
{"Rows Count", each Table.RowCount(_), Int64.Type}},
GroupKind.Local)
// ^^^^^
in
#"Grouped Rows"
Upvotes: 3
Reputation: 21428
This is a M version. No idea how it performs with that many records. I stuck in a Table.Buffer at one point where it seems to make sense
Basically, it creates a combined column of Product&Machine&Event, then offsets that one row downward. By comparing the offset to the current row value, we find where the breaks are and place an index. Fill down the index and you have something you can group on.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Combo", each [Product]&[Machine]&[Event Code]),
#"Changed Type" = Table.Buffer(Table.TransformColumnTypes(#"Added Custom",{{"Product", type text}, {"Machine", type text}, {"Event Code", type text}, {"DateTime Start", type datetime}, {"DateTime End", type datetime}, {"Duration (s)", Int64.Type}})),
// pull combined Product&Machine&Event cell from prior row
shiftedList = {null} & List.RemoveLastN(#"Changed Type"[Combo],1),
custom1 = Table.ToColumns(#"Changed Type") & {shiftedList},
custom2 = Table.FromColumns(custom1,Table.ColumnNames(#"Changed Type") & {"Previous Row"}),
#"Added Custom1" = Table.AddColumn(custom2, "Custom", each if [Previous Row]=null then [Index] else if [Combo]=[Previous Row] then null else [Index]),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Custom"}),
#"Grouped Rows" = Table.Group(#"Filled Down", {"Custom"}, {
{"Product", each [Product]{0} },
{"Machine", each [Machine]{0}},
{"Event Code", each [Event Code]{0}},
{"DateTime Start", each List.Min([DateTime Start]), type datetime},
{"DateTime End", each List.Max([DateTime End]), type datetime},
{"Duration (s)", each List.Sum([#"Duration (s)"]), type nullable number},
{"Rows Count", each Table.RowCount(_), Int64.Type}
}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Custom"})
in #"Removed Columns"
Might be faster if you don't include the Duration column in any of this then calculate it separately in a final step
Upvotes: 3