Seb

Reputation: 618

How to remove only one instance of a duplicate value in a vector in R?

Consider a numeric vector x that may contain duplicate values. I need to remove the maximum value one element at a time until x is empty.

The problem is that if I use

x <- x[x != max(x)]

it removes every element equal to the maximum. I want to remove only one of them. So far, I do:

max.x <- x[x == max(x)]
max.x <- max.x[-1]  # drop one copy of the maximum
x <- c(x[x != max(x)], max.x)

But this is far from computationally efficient, and I'm not good enough at R to find the right way to do it. Does anyone have a better trick?

Thanks

Upvotes: 3

Views: 1403

Answers (3)

Carl Witthoft

Reputation: 21532

Just for fun,
x <- x[ -which.max(x)]

Lather, rinse, repeat. which.max() returns the index of the first maximum only, so each pass drops exactly one element.
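
A minimal sketch of the repeat step, assuming x is a plain numeric vector with no NA values. The sample data and the `removed` vector are made up for illustration and are not part of the question:

```r
# Hypothetical sample data with a duplicated maximum
x <- c(3, 7, 7, 1, 5)
removed <- numeric(0)  # illustrative only: records what was dropped, in order

while (length(x) > 0) {
  i <- which.max(x)            # index of the first maximum only
  removed <- c(removed, x[i])
  x <- x[-i]                   # drop that single element; other copies stay
}

removed
```

Growing `removed` with c() is fine for a small demonstration but would be slow on long vectors; it is only there to make the order of removal visible.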


Upvotes: 2

rainer

Reputation: 16

The way I understand your question,

 ?unique

might give you what you want.
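
For reference, a quick check of what unique() does on a vector with a repeated maximum (the data here is hypothetical):

```r
x <- c(3, 5, 5, 1)      # hypothetical data: the maximum 5 appears twice
deduped <- unique(x)    # keeps one copy of every distinct value
deduped
```

Note that unique() collapses every repeated value in one go, so it fits only if the goal is to deduplicate the whole vector, not to peel the maximum off one element at a time.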

Rgds, Rainer

Upvotes: 0

Thomson Comer

Reputation: 3919

It's not entirely clear what the scope of your problem is, so I'll give the first suggestion that comes to mind. Use the sort function to get the values in decreasing order.

sorted <- sort(x,decreasing=TRUE,index.return=TRUE)

You can now iteratively remove the highest item from x. Calling sort again on every shrinking subset is inefficient; it is better to sort once, keep a permanent copy of x, and do the removals from that, if possible.

Consider this approach

# random set of data with duplicates
x <- floor(runif(50) * 15)
# sort() with index.return = TRUE returns the sorted values in sorted$x and
# their positions in the original x in sorted$ix
sorted <- sort(x, decreasing = TRUE, index.return = TRUE)

for (i in seq_along(x))
{
  # drop the i largest elements from the original x
  newX <- x[-sorted$ix[1:i]]
  print(sort(newX, decreasing = TRUE))
}
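
If the original order of x does not matter, the same idea can be sketched more simply: sort once and take the largest value off the front on each pass. This is only an illustration under that assumption; the sample data and the names `remaining`, `current_max` and `peeled` are hypothetical:

```r
# Hypothetical sample data with duplicates
x <- c(4, 9, 2, 9, 6)

# Sort once, then peel the largest value off the front on each pass
remaining <- sort(x, decreasing = TRUE)
peeled <- numeric(0)  # illustrative only: records the order of removal

while (length(remaining) > 0) {
  current_max <- remaining[1]   # largest value still present
  remaining <- remaining[-1]    # remove just that one element
  peeled <- c(peeled, current_max)
  # ... work with current_max and remaining here ...
}

peeled
```

Because the vector is sorted up front, each pass removes exactly one copy of the current maximum without searching for it again.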

Upvotes: 1
