super_gallagher
super_gallagher

Reputation: 67

Finding duplicate records in Power Query

I am trying to find duplicate records in Power Query. I do not need to remove them, I would just need to identify them. I am putting up a sample here.

enter image description here

As you can see in the image below, the first record is actually not a duplicate unlike the row below. I am pretty new to using power query and all I did here was what I would generally do in EXCEL: enter image description here

I was wondering if somebody could tell me what could I be doing wrong here. Thank you

Upvotes: 1

Views: 8630

Answers (1)

horseyride
horseyride

Reputation: 21318

What you want to do is Right-click the ID column, Group By...

Click [Advanced]

It should default to Count Rows in the first row at bottom

[Add Aggregation] with New Column Name 'All" and operation 'All Rows'

enter image description here

Hit [OK]

Click on arrows at top of All column and [x] the ID column and any other columns that may have been there, here assumed to be called Other

enter image description here enter image description here

The count column will be >1 for dupe rows.

Then you can remove extra columns and filter if wanted

Or add a custom column with formula

=if [Count]>1 then "Dupe" else "Single"

enter image description here

let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Grouped Rows" = Table.Group(Source, {"ID"}, {{"Count", each Table.RowCount(_), type number}, {"All", each _, type table}}),
#"Expanded All" = Table.ExpandTableColumn(#"Grouped Rows", "All", {"ID", "Other"}, {"ID.1", "Other"})
in #"Expanded All"

Upvotes: 4

Related Questions