J. Doe

Reputation: 1740

R - function like duplicated that removes all of the duplicated instances

Let's say we have the following:

c("A", "A", "B") %>% 
   cbind(1:3) %>% 
   data.frame() -> testdf

We want to remove from the dataframe all instances where there was a duplicate in the first variable. Usually we would use something like this:

testdf2 <- testdf[!duplicated(testdf$.),]

However, testdf2 looks like this:

. V2
A  1
B  3

This is not what I was looking for - since the value A was duplicated, I want to remove all cases that have A in the first variable. I want my output to be like this:

. V2
B  3

Is there a function that could produce this?

Upvotes: 4

Views: 79

Answers (4)

akrun

Reputation: 887078

We can use subset with table to keep only the values that occur exactly once:

subset(testdf, `.` %in% names(which(table(`.`) == 1)))
# . V2
#3 B  3
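
To make the one-liner above easier to follow, here is the same idea split into steps. This is only a sketch: the data frame is rebuilt in base R to mirror the question, and the names counts, singles and result are illustrative, not part of the original answer.

```r
# Rebuild the question's data frame (columns "." and "V2", all character).
testdf <- data.frame(c("A", "A", "B"), c("1", "2", "3"))
names(testdf) <- c(".", "V2")

# Count how often each value of the first column occurs.
counts <- table(testdf$.)

# Keep the names of the values that occur exactly once.
singles <- names(which(counts == 1))

# Subset to the rows whose first column is one of those values.
result <- testdf[testdf$. %in% singles, ]
print(result)
```

Because the filter is driven by counts, the same code extends to other thresholds, e.g. counts <= 2.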

Upvotes: 1

F Trias

Reputation: 189

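If you want to stick with pipes, group by the first column and keep only the groups that contain a single row (needs dplyr). Using filter(n() == 1) keeps the original rows, whereas summarise() would return only the counts; grp is just a temporary helper column:

 testdf %>% group_by(grp = testdf$.) %>% filter(n() == 1) %>% ungroup() %>% select(-grp)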

Upvotes: 2

NelsonGon

Reputation: 13319

Another base alternative (retains row names). A logical index is used instead of -which(), so it also behaves correctly when nothing is duplicated:

testdf[!(testdf$. %in% testdf$.[duplicated(testdf$.)]), ]
  . V2
3 B  3

Upvotes: 3

Frank Zhang

Reputation: 1688

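Try combining duplicated() with its fromLast = TRUE counterpart, so that a row is dropped if it is flagged as a duplicate in either direction:

testdf[!duplicated(testdf$.) & !duplicated(testdf$., fromLast = TRUE), ]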
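
To see why scanning in both directions works, it may help to look at the two masks separately. The following is a minimal sketch, assuming the data frame from the question rebuilt in base R; dup_forward, dup_backward, keep and result are illustrative names only.

```r
# Rebuild the question's data frame (columns "." and "V2", all character).
testdf <- data.frame(c("A", "A", "B"), c("1", "2", "3"))
names(testdf) <- c(".", "V2")

x <- testdf$.

# Forward scan: flags every occurrence after the first one.
dup_forward <- duplicated(x)

# Backward scan: flags every occurrence before the last one.
dup_backward <- duplicated(x, fromLast = TRUE)

# A value that appears more than once is flagged by at least one scan,
# so only values that occur exactly once survive.
keep <- !(dup_forward | dup_backward)
result <- testdf[keep, ]
print(result)
```

For the three values "A", "A", "B", the forward scan flags the second "A" and the backward scan flags the first, so only the "B" row (row name 3) is kept.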

Upvotes: 6
