Reputation: 11
I have a spreadsheet with 15 columns. Each column contain different amounts of data. So one column may have 100 row, the next could have 50 and the third a totally different number. All in all I have about 2000 entries spread across all columns. Now, some of the data is duplicate. I want to compare what data is duplicate across all the columns but ONLY if it is in every column. It might not be located in the same row across every column.
Conditional highlighting doesn't really work, as it will highlight every duplicate, not taking into account the number of columns.
Example:
A B C D E
1 2 1 2 1
2 4 2 1 4
3 1 8 5 3
In the above example, 1 goes again in all the columns while the other values don't. Therefor 1 fulfills my criteria and I want to output it to a list.
A have a complete list of all unique entries that I can compare against if needed.
Upvotes: 0
Views: 2647
Reputation: 11
I found a solution.
I ended up doing a countif for every single column comparing the possible values. So if I had a total sum where the number of rows was equal to the number of columns, the data was duplicate, otherwise not.
Upvotes: 0
Reputation: 34255
I feel there should be a more direct way of doing this, but how about:-
=SUM(IF(FREQUENCY(COLUMN($A$1:$E$3)*($A$1:$E$3=A1),COLUMN($A$1:$E$3)*($A$1:$E$3=A1))>0,1))>COLUMNS($A$1:$E$3)
The idea is to create an array in which only cells that match the current cell are set to 1, and multiply the cells in it by their column number. If all columns have at least one cell that matches A1, you should get 5 distinct values, plus one because zero is also counted.
If you have a list of all unique values, you could also use this same formula in a helper column to flag up the duplicates, then filter it.
Upvotes: 0
Reputation: 71
Given that you have a list of all your unique values, you can try with COUNTIF function.
For your example
A B C D E
1 2 1 2 1
2 4 2 1 4
3 1 8 5 3
Scenario A (painful)
A value can exist more than one time per column
Assume that in cell G2 exists your unique value
AND(COUNTIF($A:$A;G2)>0;COUNTIF($B:$B;G2)>0;COUNTIF($C:$C;G2)>0;COUNTIF($D:$D;G2)>0;COUNTIF($E:$E;G2)>0)
This gives the value TRUE for every value that exists at least one time per column
Scenario B
A value exists max 1 time per column
Assume that in cell G2 exists your unique value
AND(COUNTIF($A:$E;G2)=5)
Replace 5 with the total number of your table columns
This gives the value TRUE for every value that exists at least one time per column
Upvotes: 0