Reputation: 460
How can I remove rows in which a value is repeated in another column of the same row? For example,
df<-structure(list(V1 = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L), V2 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L,
2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), V3 = c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L)), row.names = c(NA, -27L
), class = "data.frame")
## First eight rows
V1 V2 V3
1 1 1 1
2 2 1 1
3 3 1 1
4 1 2 1
5 2 2 1
6 3 2 1
7 1 3 1
8 2 3 1
In the case above (only showing 8 rows), I would remove every row except rows 6 and 8, since they are the only ones with no repeated value across the columns of the row. I'm preferably looking for a data.table solution since my real data frame is much larger.
Upvotes: 0
Views: 115
Reputation: 886938
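An option is to use combn on the columns to compare each pair and check whether any pair holds equal values in a row. The row subsetting below assumes df has first been converted to a data.table, as the printed output suggests:
library(data.table)
setDT(df)
df[!Reduce(`|`, combn(df, 2, FUN = function(x)
x[[1]] == x[[2]], simplify = FALSE))]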
V1 V2 V3
1: 3 2 1
2: 2 3 1
3: 3 1 2
4: 1 3 2
5: 2 1 3
6: 1 2 3
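To make the pairwise idea concrete, here is a minimal base R sketch on the first eight rows from the question. The names small, pairs and has_dup are only for illustration and are not part of the answer above; it uses a plain data.frame, so the row subset needs the trailing comma.

```r
# First eight rows of the example data (hypothetical name `small`)
small <- data.frame(
  V1 = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L),
  V2 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L),
  V3 = rep(1L, 8)
)

# One logical vector per column pair: V1 == V2, V1 == V3, V2 == V3
pairs <- combn(small, 2, FUN = function(x) x[[1]] == x[[2]], simplify = FALSE)

# A row has a repeated value if any of the pairwise comparisons is TRUE
has_dup <- Reduce(`|`, pairs)

# Keep only the rows without a repeated value (rows 6 and 8 here)
small[!has_dup, ]
```

With k columns this builds choose(k, 2) comparisons, so it is probably best suited to tables with a small number of columns.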
Upvotes: 0
Reputation: 388817
You may use anyDuplicated for each row.
library(data.table)
setDT(df)
df[apply(df, 1, anyDuplicated) == 0]
# V1 V2 V3
#1: 3 2 1
#2: 2 3 1
#3: 3 1 2
#4: 1 3 2
#5: 2 1 3
#6: 1 2 3
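As a small illustration of why comparing against 0 works, the sketch below shows what anyDuplicated returns and applies it to the first eight rows from the question. It uses a plain data.frame in base R, and the names small and dup_index are hypothetical, chosen only for this example.

```r
# anyDuplicated() returns the index of the first repeated element, or 0 if none
anyDuplicated(c(2L, 1L, 1L))  # 3: the third element repeats the second
anyDuplicated(c(3L, 2L, 1L))  # 0: all values are distinct

# First eight rows of the example data (hypothetical name `small`)
small <- data.frame(
  V1 = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L),
  V2 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L),
  V3 = rep(1L, 8)
)

# apply() with MARGIN = 1 calls anyDuplicated() once per row
dup_index <- apply(small, 1, anyDuplicated)

# Rows whose index is 0 contain no repeated value (rows 6 and 8 here)
small[dup_index == 0, ]
```

Note that apply converts the table to a matrix and calls the function once per row, so on very large tables it may be worth timing this against a column-wise approach.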
Upvotes: 2