SpiderK

Reputation: 55

Deleting all rows that have same name in a column

I'm trying to work out how to delete every row whose name is duplicated. For example, column A contains these names:

A B C D
ABC123 0 1 1
ABC123 1 2 2
X2X2 1 1 0
X1XD-01 1 0 0
BC-56 0 2 1
BC-56 1 1 1
YUA09 0 0 1
GGO-09S 0 1 2

If a name in column A occurs more than once, every row with that name should be deleted, not just the repeats.

Goal:

A B C D
X2X2 1 1 0
X1XD-01 1 0 0
YUA09 0 0 1
GGO-09S 0 1 2

What is the most efficient way to approach this?

Thanks

Upvotes: 0

Views: 390

Answers (3)

Ronak Shah

Reputation: 389175

Count how often each value of A occurs with table, then keep only the rows whose value occurs exactly once.

subset(df, A %in% names(Filter(function(x) x == 1, table(A))))

#        A B C D
#3    X2X2 1 1 0
#4 X1XD-01 1 0 0
#7   YUA09 0 0 1
#8 GGO-09S 0 1 2
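For anyone who finds the one-liner dense, here is the same idea split into steps. This is only a sketch: the data frame is rebuilt from the table in the question, and the intermediate names counts, singletons and result are made up for illustration.

```r
# Sample data copied from the question; the name df matches the one-liner above
df <- data.frame(
  A = c("ABC123", "ABC123", "X2X2", "X1XD-01", "BC-56", "BC-56", "YUA09", "GGO-09S"),
  B = c(0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L),
  C = c(1L, 2L, 1L, 0L, 2L, 1L, 0L, 1L),
  D = c(1L, 2L, 0L, 0L, 1L, 1L, 1L, 2L),
  stringsAsFactors = FALSE
)

counts <- table(df$A)                     # how many times each name occurs
singletons <- names(counts)[counts == 1]  # names that occur exactly once
result <- df[df$A %in% singletons, ]      # keep only the rows with those names
print(result)
```

Note that table ignores NA by default, so rows with a missing name would be dropped by this version.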

Upvotes: 1

GuedesBF

Reputation: 9878

We can group_by the desired column and keep only the groups with exactly one row, which drops every group where n() >= 2:

library(dplyr)

df %>% group_by(A) %>% filter(n()==1)

# A tibble: 4 x 4
# Groups:   A [4]
#   A           B     C     D
#   <chr>   <int> <int> <int>
# 1 X2X2        1     1     0
# 2 X1XD-01     1     0     0
# 3 YUA09       0     0     1
# 4 GGO-09S     0     1     2
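If dplyr is not available, a rough base R analogue of the grouped filter is to compute the group size per row with ave and keep the rows where it equals 1. This is a swapped-in technique rather than the dplyr code above, and the names group_size and result are hypothetical; the data is rebuilt from the question.

```r
# Sample data copied from the question
df <- data.frame(
  A = c("ABC123", "ABC123", "X2X2", "X1XD-01", "BC-56", "BC-56", "YUA09", "GGO-09S"),
  B = c(0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L),
  C = c(1L, 2L, 1L, 0L, 2L, 1L, 0L, 1L),
  D = c(1L, 2L, 0L, 0L, 1L, 1L, 1L, 2L),
  stringsAsFactors = FALSE
)

# For each row, the number of rows sharing its value of A
# (plays the role of n() inside a group_by)
group_size <- ave(seq_along(df$A), df$A, FUN = length)

# Keep rows whose name forms a group of size one
result <- df[group_size == 1, ]
print(result)
```

Unlike the dplyr version, the result is a plain data.frame with the original row names, so there is no grouping left to remove afterwards.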

Upvotes: 2

akrun

Reputation: 887651

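We can use duplicated to create a logical vector. On its own, duplicated flags only the second and later occurrences of a value, so we combine it with a second pass using fromLast = TRUE, which flags the earlier occurrences as well, and then negate the result: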

df1[!(duplicated(df1$A)|duplicated(df1$A, fromLast = TRUE)),]
        A B C D
3    X2X2 1 1 0
4 X1XD-01 1 0 0
7   YUA09 0 0 1
8 GGO-09S 0 1 2
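To see why both calls are needed, it may help to look at what each direction returns on a small vector. A minimal sketch; the vector x and the names forward, backward and keep are made up for illustration:

```r
x <- c("ABC123", "ABC123", "X2X2", "BC-56", "BC-56", "YUA09")

forward  <- duplicated(x)                   # flags the 2nd+ occurrence of each value
backward <- duplicated(x, fromLast = TRUE)  # flags every occurrence except the last

print(forward)   # FALSE  TRUE FALSE FALSE  TRUE FALSE
print(backward)  #  TRUE FALSE FALSE  TRUE FALSE FALSE

# A value is unique only if neither pass flags it
keep <- !(forward | backward)
print(x[keep])   # "X2X2" "YUA09"
```

Using only the forward pass would leave the first "ABC123" and the first "BC-56" in place, which is not what the question asks for.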

data

df1 <- structure(list(A = c("ABC123", "ABC123", "X2X2", "X1XD-01", "BC-56", 
"BC-56", "YUA09", "GGO-09S"), B = c(0L, 1L, 1L, 1L, 0L, 1L, 0L, 
0L), C = c(1L, 2L, 1L, 0L, 2L, 1L, 0L, 1L), D = c(1L, 2L, 0L, 
0L, 1L, 1L, 1L, 2L)), class = "data.frame", row.names = c(NA, 
-8L))

Upvotes: 0
