ashim

Reputation: 25580

How to remove repeated elements in a vector, similar to 'set' in Python

I have a vector with repeated elements, and would like to remove them so that each element appears only once.

In Python I could construct a Set from a vector to achieve this, but how can I do this in R?

Upvotes: 55

Views: 82254

Answers (4)

robertspierre

Reputation: 4430

setdiff(A, B) returns only unique values, so you can pass NA as the second argument to drop the duplicates (any NA in the vector is dropped as well):

> v = c(1, 1, 5, 5, 2, 2, 6, 6, 1, 3)
> setdiff(v,NA)
[1] 1 5 2 6 3
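
One caveat is worth checking before relying on this trick: because NA is the set being subtracted, missing values disappear from the result. A small sketch, using a made-up vector that contains NA, compares it with unique():

```r
# Hypothetical input with missing values, chosen only for illustration
v_na <- c(1, NA, 1, 5, NA, 2)

dropped <- setdiff(v_na, NA)  # duplicates removed, NA removed too
kept    <- unique(v_na)       # duplicates removed, one NA kept

print(dropped)
print(kept)
```

If the NA values matter, unique() is probably the safer choice.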

Upvotes: 0

Paul Rougieux

Reputation: 11419

To remove contiguous duplicated elements only, you can compare the vector with a shifted version of itself:

v <- c(1, 1, 5, 5, 5, 5, 2, 2, 6, 6, 1, 3, 3)
v[c(TRUE, v[-1] != v[-length(v)])]
[1] 1 5 2 6 1 3

The same can be written a little more elegantly using dplyr:

library(dplyr)
v[v != lag(v)]
[1] NA  5  2  6  1  3

lag() returns NA for the first position, so the first value comes back as NA. To keep it, set default to a value that differs from the first element:

v[v != lag(v, default = !v[1])]
[1] 1 5 2 6 1 3
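
As a possible base R alternative (not part of the approach above), rle() computes run lengths directly, and its values component holds one entry per run of contiguous duplicates. A minimal sketch on the same vector:

```r
v <- c(1, 1, 5, 5, 5, 5, 2, 2, 6, 6, 1, 3, 3)

# rle() splits the vector into runs of equal, adjacent values
runs <- rle(v)

collapsed <- runs$values   # one value per run: 1 5 2 6 1 3
run_sizes <- runs$lengths  # how long each run was: 2 4 2 2 1 2

print(collapsed)
print(run_sizes)
```

The lengths component is handy if you also need to know how many times each value was repeated in a row.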

Upvotes: 7

sus_mlm

Reputation: 1154

You can use the unique() function:

> v = c(1, 1, 5, 5, 2, 2, 6, 6, 1, 3)
> unique(v)
[1] 1 5 2 6 3
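
For readers coming from Python's set, a short sketch of a few set-like follow-ups on the result may help. Note that unique() keeps values in order of first appearance, so sorting is a separate step; the variable names here are only illustrative:

```r
v <- c(1, 1, 5, 5, 2, 2, 6, 6, 1, 3)

u <- unique(v)          # 1 5 2 6 3, in order of first appearance

sorted_u <- sort(u)     # 1 2 3 5 6
has_three <- 3 %in% u   # membership test, TRUE here
n_distinct <- length(u) # number of distinct values, 5 here

print(sorted_u)
print(has_three)
print(n_distinct)
```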

Upvotes: 88

dardisco

Reputation: 5274

This does the same thing. It is slower, but useful if you also want a logical vector marking the duplicates:

v[!duplicated(v)]
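
To make the role of the logical vector concrete, here is a small sketch using the example vector from the other answers. duplicated() marks every element that has already appeared earlier, so negating it keeps first occurrences only:

```r
v <- c(1, 1, 5, 5, 2, 2, 6, 6, 1, 3)

# TRUE wherever the value has already been seen at an earlier position
dup <- duplicated(v)

firsts  <- v[!dup]  # first occurrences only: 1 5 2 6 3
repeats <- v[dup]   # the repeated occurrences: 1 5 2 6 1

print(dup)
print(firsts)
print(repeats)
```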

Upvotes: 11
