Alkibe
Alkibe

Reputation: 15

Trouble Finding Duplicate Values Using Arrayformula on Google Sheets

I am trying to set up a Google Sheet arrayformula to find any duplicates without identifying the first instance of that value. I have a table with the values below:

A

1

2

3

4

5

Since number 2 is a duplicate value, I would like to identify this with a formula but I would not like to indentify the first value in this column. This formula gets me the results I am looking for: =IF(COUNTIF($A$2:$A2,A2)=1, "Unique", "Duplicate")

A B

1 Unique

2 Unique

3 Unique

2 Duplicate

4 Unique

2 Duplicate

But when I try to convert this to an arrayformula so I don't have to manually drag the formula down when new rows are added I get a different result. This is the arrayformula I used: =ARRAYFORMULA(IF($A$2:$A="", "", IF(COUNTIF($A$2:$A,A2:A)=1, "Unique", "Duplicate")))

A B

1 Unique

2 Duplicate

3 Unique

2 Duplicate

4 Unique

2 Duplicate

The problem is that the first value is also identified as duplicate. What would be the best way to convert =IF(COUNTIF($A$2:$A2,A2)=1, "Unique", "Duplicate") into an arrayformula?

Upvotes: 1

Views: 2006

Answers (2)

media10k
media10k

Reputation: 1

This is a very nice article to find duplicate entry. But I would like to add one missing point. That is, if you incorporate a TRIM function with the above functions, this will become more accurate and perfect. Because if we add spaces at the beginning or end in the duplicate cells, then the above functions won’t consider it as duplicate. I did this to monitor my employees work sheets as well. i.e

ARRAYFORMULA(if(len(TRIM(C9:C)),(if((countif(TRIM(C9:C),TRIM(C9:C)))>1,”duplicate”,)),))

Upvotes: 0

Erik Tyler
Erik Tyler

Reputation: 9345

You want this:

=ArrayFormula(IF(A2:A="",,IF(COUNTIFS(A2:A,A2:A,ROW(A2:A),"<="&ROW(A2:A))=1,"Unique","Duplicate")))

The problem with your COUNTIF is that you essentially asked "Is this number unique against every other number in this column? Or is it duplicated anywhere else in this column?" That is why 2 says "Duplicate" in all instances: because each of them is duplicated "somewhere else" in the column.

What you really want to be asking is "Up to this row, has this number been duplicated yet so far?" And that requires COUNTIFS with a second condition that only checks considering ROW() numbers "up to" (i.e., "<=") the current row.

Upvotes: 3

Related Questions