Reputation: 1740
Let's say we have the following:
library(magrittr)  # provides %>%
c("A", "A", "B") %>%
cbind(1:3) %>%
data.frame() -> testdf
We want to remove from the dataframe all instances where there was a duplicate in the first variable. Usually we would use something like this:
testdf2 <- testdf[!duplicated(testdf$.),]
However, testdf2 looks like this:
. V2
A 1
B 3
This is not what I was looking for - since the value A was duplicated, I want to remove all cases that have A in the first variable. I want my output to be like this:
. V2
B 3
Is there a function that could produce this?
Upvotes: 4
Views: 79
Reputation: 887078
We can use subset with table:
subset(testdf, `.` %in% names(which(table(`.`) == 1)))
# . V2
#3 B 3
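To see why this works, it can help to look at the intermediate steps. The following is a minimal sketch rather than part of the original answer: it rebuilds the question's data with a plain data.frame() call (so V2 is an integer here, a simplification), and the names counts, singletons and result are illustrative only.

```r
# Rebuild the example data; check.names = FALSE keeps the "." column name as-is
testdf <- data.frame(`.` = c("A", "A", "B"), V2 = 1:3,
                     check.names = FALSE, stringsAsFactors = FALSE)

# Step 1: count how often each value occurs in the first column
counts <- table(testdf$.)

# Step 2: keep the names of the values that occur exactly once
singletons <- names(which(counts == 1))

# Step 3: keep only the rows whose value is one of those singletons
result <- testdf[testdf$. %in% singletons, ]
result
```

Here counts holds 2 for A and 1 for B, so singletons is "B" and only the third row survives.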
Upvotes: 1
Reputation: 189
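If you want to stick with pipes (dplyr), group by the first column and keep only the groups that contain a single row:
library(dplyr)
testdf %>% group_by(testdf$.) %>% filter(n() == 1) %>% ungroup() %>% select(1:2)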
Upvotes: 2
Reputation: 13319
Another base alternative (retains row names):
testdf[!(testdf$. %in% testdf[duplicated(testdf$.), 1]), ]
. V2
3 B 3
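A note on indexing style for this kind of filter: negating with -which(...) returns zero rows when nothing matches, whereas a logical mask with ! keeps every row in that case. A small sketch of the difference, using a hypothetical data frame nodup (with made-up columns key and val) that contains no duplicates:

```r
# Hypothetical data with no duplicated keys
nodup <- data.frame(key = c("A", "B", "C"), val = 1:3, stringsAsFactors = FALSE)

# Values that appear more than once: none here, so this is character(0)
dup_keys <- nodup$key[duplicated(nodup$key)]

# which() returns integer(0); negating it is still integer(0), which selects no rows
by_which <- nodup[-which(nodup$key %in% dup_keys), ]

# The logical mask is all TRUE, so every row is kept
by_mask <- nodup[!(nodup$key %in% dup_keys), ]

nrow(by_which)
nrow(by_mask)
```

On the question's testdf both forms give the same result, because there is at least one duplicate.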
Upvotes: 3
Reputation: 1688
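Try combining duplicated() in both directions, so that every occurrence of a repeated value is dropped:
testdf[!duplicated(testdf$.) & !duplicated(testdf$., fromLast = TRUE), ]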
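The two duplicated() calls flag different occurrences, and a row survives only if neither call flags it. A short sketch of the intermediate vectors, assuming the question's data rebuilt with a plain data.frame() call (x, later, earlier and keep are illustrative names):

```r
# Rebuild the example data; check.names = FALSE keeps the "." column name as-is
testdf <- data.frame(`.` = c("A", "A", "B"), V2 = 1:3,
                     check.names = FALSE, stringsAsFactors = FALSE)
x <- testdf$.

# Flags every occurrence after the first: FALSE TRUE FALSE
later <- duplicated(x)

# Scanning from the end, flags every occurrence before the last: TRUE FALSE FALSE
earlier <- duplicated(x, fromLast = TRUE)

# A value is unique only if it is flagged in neither direction: FALSE FALSE TRUE
keep <- !(later | earlier)

testdf[keep, ]
```

By De Morgan's law, !(later | earlier) is the same condition as !later & !earlier.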
Upvotes: 6