Reputation: 1388
I have a list where i want to remove the duplicate where "Date of Termination (Work)" column is the latest or null.
sample data
Id No Name Surname Date of Employment (Work) Date of Termination (Work)
12405 xxxx yyy 10/26/2018 2/6/2019
33418 mmm nnnn 1/1/2018 7/30/2018
33418 mmm nnnn 1/13/2017 12/31/2017
33616 rrrr sssss 7/13/2018 11/19/2018
33616 rrrr sssss 7/13/2018 null
48224 ttttt kkkk 7/15/2018 4/14/2019
the result should be
Id No Name Surname Date of Employment (Work) Date of Termination (Work)
12405 xxxx yyy 10/26/2018 2/6/2019
33418 mmm nnnn 1/1/2018 7/30/2018
33616 rrrr sssss 7/13/2018
48224 ttttt kkkk 7/15/2018 4/14/2019
Upvotes: 0
Views: 131
Reputation: 4486
I get expected output:
using the code below:
let
initialTable = Table.FromColumns({
{12405, 33418, 33418, 33616, 33616, 48224},
{"xxxx", "mmm", "mmm", "rrrr", "rrrr", "ttttt"},
{"yyy", "nnnn", "nnnn", "sssss", "sssss", "kkkk"},
{#date(2018, 10, 26), #date(2018, 01, 01), #date(2017, 1, 13), #date(2018, 7, 13), #date(2018, 7, 13), #date(2018, 7, 15)},
{#date(2019, 02, 06), #date(2018, 7, 30), #date(2017, 12, 31), #date(2018, 11, 19), null, #date(2019, 4, 14)}
}, type table [Id No = Int64.Type, Name = text, Surname = text, #"Date of Employment (Work)" = date, #"Date of Termination (Work)" = date]),
nullElseMaxComparer = (x as record, y as record) =>
let
a = Record.Field(x, "Date of Termination (Work)"),
b = Record.Field(y, "Date of Termination (Work)"),
comparison = if a = null then 2 else if b = null then -2 else Value.Compare(a, b)
in comparison,
maxOrNullPerGroup = Table.Group(initialTable, "Id No", {"toCombine", each Table.Max(_, nullElseMaxComparer)}),
combined = Table.FromRecords(maxOrNullPerGroup[toCombine])
in
combined
ID No
, you want the latest termination date.33616
, null
was preferred over 11/19/2018
. So I've assumed that, for the purposes of this question, null
is greater than any date.nullElseMaxComparer
is a custom comparison function which tries to give preference to null
values -- and can be passed directly to Table.Max
.Upvotes: 1
Reputation: 1634
Another way:
= Table.Distinct(Table.Sort(YourTable, {"Date of Termination (Work)", 1}), {"Id No"})
Upvotes: 0
Reputation: 1388
i found a workaround (group by max) but i am sure there is an easier and faster way
let
Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Id No", Int64.Type}, {"Name", type text}, {"Surname", type text}, {"Date of Employment (Work)", type datetime}, {"Date of Termination (Work)", type datetime}, {"Duty", type text}, {"Citizenship", type text}, {"National ID", type text}, {"Passport Serial/No", type text}}),
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"Id No", Order.Ascending}, {"Date of Termination (Work)", Order.Descending}}),
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"Id No"}, {{"all", each _, type table [Id No=number, Name=text, Surname=text, #"Date of Employment (Work)"=datetime, #"Date of Termination (Work)"=datetime, Duty=text, Citizenship=text, National ID=text, #"Passport Serial/No"=text]}, {"maxdate", each List.Max([#"Date of Termination (Work)"]), type datetime}}),
#"Expanded all" = Table.ExpandTableColumn(#"Grouped Rows", "all", {"Name", "Surname", "Date of Employment (Work)", "Date of Termination (Work)", "Duty", "Citizenship", "National ID", "Passport Serial/No"}, {"Name", "Surname", "Date of Employment (Work)", "Date of Termination (Work)", "Duty", "Citizenship", "National ID", "Passport Serial/No"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded all", each ([#"Date of Termination (Work)"] = [maxdate]) ),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"maxdate"})
in
#"Removed Columns"
Upvotes: 0