dQlle
dQlle

Reputation: 11

Excel: find duplicate rows in multiple columns

I have a spreadsheet with 15 columns. Each column contain different amounts of data. So one column may have 100 row, the next could have 50 and the third a totally different number. All in all I have about 2000 entries spread across all columns. Now, some of the data is duplicate. I want to compare what data is duplicate across all the columns but ONLY if it is in every column. It might not be located in the same row across every column.

Conditional highlighting doesn't really work, as it will highlight every duplicate, not taking into account the number of columns.

Example:

    A   B   C   D   E
    1   2   1   2   1
    2   4   2   1   4
    3   1   8   5   3 

In the above example, 1 goes again in all the columns while the other values don't. Therefor 1 fulfills my criteria and I want to output it to a list.

A have a complete list of all unique entries that I can compare against if needed.

Upvotes: 0

Views: 2647

Answers (3)

dQlle
dQlle

Reputation: 11

I found a solution.

I ended up doing a countif for every single column comparing the possible values. So if I had a total sum where the number of rows was equal to the number of columns, the data was duplicate, otherwise not.

Upvotes: 0

Tom Sharpe
Tom Sharpe

Reputation: 34255

I feel there should be a more direct way of doing this, but how about:-

=SUM(IF(FREQUENCY(COLUMN($A$1:$E$3)*($A$1:$E$3=A1),COLUMN($A$1:$E$3)*($A$1:$E$3=A1))>0,1))>COLUMNS($A$1:$E$3)

The idea is to create an array in which only cells that match the current cell are set to 1, and multiply the cells in it by their column number. If all columns have at least one cell that matches A1, you should get 5 distinct values, plus one because zero is also counted.

enter image description here

If you have a list of all unique values, you could also use this same formula in a helper column to flag up the duplicates, then filter it.

Upvotes: 0

miraklis
miraklis

Reputation: 71

Given that you have a list of all your unique values, you can try with COUNTIF function.

For your example

A   B   C   D   E
1   2   1   2   1
2   4   2   1   4
3   1   8   5   3

Scenario A (painful)

A value can exist more than one time per column

Assume that in cell G2 exists your unique value

AND(COUNTIF($A:$A;G2)>0;COUNTIF($B:$B;G2)>0;COUNTIF($C:$C;G2)>0;COUNTIF($D:$D;G2)>0;COUNTIF($E:$E;G2)>0)

This gives the value TRUE for every value that exists at least one time per column

Scenario B

A value exists max 1 time per column

Assume that in cell G2 exists your unique value

AND(COUNTIF($A:$E;G2)=5)

Replace 5 with the total number of your table columns

This gives the value TRUE for every value that exists at least one time per column

Upvotes: 0

Related Questions