remove rows with duplicate values in any other adjacent column

Question

How can I remove rows in which any value is repeated in another column of the same row? For example,

df<-structure(list(V1 = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L), V2 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 
2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), V3 = c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L)), row.names = c(NA, -27L
), class = "data.frame")

## First eight rows
   V1 V2 V3
1   1  1  1
2   2  1  1
3   3  1  1
4   1  2  1
5   2  2  1
6   3  2  1
7   1  3  1
8   2  3  1

In the case above (only the first 8 rows are shown), I would remove every row except rows 6 and 8, since they are the only ones with no duplicate values across the columns of the same row. I'd prefer a data.table solution, since my real data frame is much larger.

Ronak Shah · Accepted Answer

You can apply anyDuplicated to each row; it returns 0 when a row contains no repeated values.

library(data.table)

setDT(df)
df[apply(df, 1, anyDuplicated) == 0]

#   V1 V2 V3
#1:  3  2  1
#2:  2  3  1
#3:  3  1  2
#4:  1  3  2
#5:  2  1  3
#6:  1  2  3
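
Since the question mentions a much larger data frame, here is a minimal sketch of a column-wise alternative that avoids the row-by-row apply call. It is not part of the accepted answer. It rebuilds the 27-row example with expand.grid, and it assumes the columns share a comparable type and contain no NA values. The names pairs, has_dup and res are illustrative only.

```r
# Rebuild the question's example: every combination of 1:3 across three columns
df <- expand.grid(V1 = 1:3, V2 = 1:3, V3 = 1:3)

# anyDuplicated() returns 0 when nothing repeats,
# otherwise the index of the first repeated element
anyDuplicated(c(3, 2, 1))  # 0
anyDuplicated(c(2, 1, 1))  # 3

# Compare every pair of columns as whole vectors instead of looping over rows
# (assumption: no NA values, otherwise == yields NA)
pairs <- combn(names(df), 2, simplify = FALSE)
has_dup <- Reduce(`|`, lapply(pairs, function(p) df[[p[1]]] == df[[p[2]]]))

# Keep only the rows in which all values are distinct
res <- df[!has_dup, ]
res
```

On this example the sketch keeps the same six rows as the apply version (original rows 6, 8, 12, 16, 20 and 22). The number of comparisons grows with the number of column pairs, so it is likely to pay off mainly when there are many rows and few columns. After setDT(df), the row indices which(!has_dup) can be passed as the i argument of the data.table.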
