Ian

Reputation: 31

How to flag duplicate values in R - newbie

I'm trying to flag duplicate IDs in a separate column. I don't necessarily want to remove them yet, just create a 0/1 indicator of whether each ID is unique or a duplicate. In SQL, it would be something like this:

UPDATE TABLE JOIN (SELECT ID, count(ID) AS count FROM TABLE GROUP BY ID) a ON TABLE.ID = a.ID SET ID_Duplicate_Flag = 1 WHERE a.count > 1;

Is there a simple way to do this in R? Any help would be greatly appreciated.

Upvotes: 2

Views: 4561

Answers (1)

Henry

Reputation: 6784

As an example of duplicated(), let's start with some values (numbers here, but strings would behave the same way):

x <- c(9, 1:5, 3:7, 0:8)
x
# 9 1 2 3 4 5 3 4 5 6 7 0 1 2 3 4 5 6 7 8 

If you want to flag only the second and later copies of each value (the first occurrence stays 0):

as.numeric(duplicated(x))
# 0 0 0 0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 1 0
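
Applied to the question's setting, the same call can fill a new column of a data frame. This is only a sketch: the data frame df and the column names ID and dup_flag are made up for illustration and are not from the question.

```r
# Hypothetical data; the names df, ID and dup_flag are assumptions
df <- data.frame(ID = c("a", "b", "a", "c", "b", "a"))

# 1 for the second and later occurrences of each ID, 0 for the first one
df$dup_flag <- as.integer(duplicated(df$ID))
df$dup_flag
# 0 0 1 0 1 1
```

Nothing is removed here; the rows are only marked, so they can still be dropped later with df[df$dup_flag == 0, ].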

If you want to flag every copy of a value that occurs two or more times, including its first occurrence:

as.numeric(x %in% x[duplicated(x)])
# 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0
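
This second variant is the one closest to the count > 1 logic in the question. As a rough sketch on a made-up data frame (df, ID, dup_any, n and dup_count are illustrative names, not from the question), it could also be written with the fromLast argument of duplicated(), or with an explicit per-ID count using ave():

```r
# Hypothetical data; all names below are assumptions for illustration
df <- data.frame(ID = c("a", "b", "a", "c", "b", "a"))

# Flag every row whose ID appears more than once:
# scan from the front and from the back, then combine the two results
df$dup_any <- as.integer(duplicated(df$ID) | duplicated(df$ID, fromLast = TRUE))
df$dup_any
# 1 1 1 0 1 1

# Count-based version, similar in spirit to GROUP BY ID with count > 1
df$n <- ave(seq_along(df$ID), df$ID, FUN = length)  # occurrences of each ID
df$dup_count <- as.integer(df$n > 1)
df$dup_count
# 1 1 1 0 1 1
```

The count-based version is a little longer, but it keeps the number of occurrences in df$n, which can be handy if the threshold might change later.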

Upvotes: 1