Finding duplicate elements in multiple pairs of columns in R

Question

New to R and to programming. This might be an easy question. I'm trying to find duplicate elements in certain pairs of columns, and replace both the original and the duplicate with N/A. So if I have the following dataset:

mydf <- structure(list(V1 = c(1, 2, 3, 1, 3, 2), V2 = c("zz", "aa", "bb", "zz", "yy", 
"ii"), V3 = c("aa", "ff", "aa", "hh", "cc", "jj"), V4 = c("ee", 
"xx", "ee", "hh", "dd", "kk"), V5 = c(213L, 254L, 235L, 356L, 
796L, 954L)), class = "data.frame", row.names = c(NA, -6L))

  V1 V2 V3 V4  V5
1  1 zz aa ee 213
2  2 aa ff xx 254
3  3 bb aa ee 235
4  1 zz hh hh 356
5  3 yy cc dd 796
6  2 ii jj kk 954

I'd like to find rows that are duplicated in either the V1/V2 pair or the V3/V4 pair, and blank out only the columns of the pair that matched. So the final result would look like this:

   V1  V2  V3  V4  V5
1 N/A N/A N/A N/A 213
2   2  aa  ff  xx 254
3   3  bb N/A N/A 235
4 N/A N/A  hh  hh 356
5   3  yy  cc  dd 796
6   2  ii  jj  kk 954
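
For reference, here is a minimal base R sketch of one way the result above could be produced. It is not the only approach. The helper name `blank_dupes` is made up for illustration, and the sketch assumes that R's missing value `NA` is an acceptable stand-in for the literal text "N/A". It flags a row when its values in a column pair occur more than once, counting the first occurrence too, by combining `duplicated()` with `duplicated(..., fromLast = TRUE)`.

```r
# Example data from the question, rebuilt with data.frame().
mydf <- data.frame(
  V1 = c(1, 2, 3, 1, 3, 2),
  V2 = c("zz", "aa", "bb", "zz", "yy", "ii"),
  V3 = c("aa", "ff", "aa", "hh", "cc", "jj"),
  V4 = c("ee", "xx", "ee", "hh", "dd", "kk"),
  V5 = c(213L, 254L, 235L, 356L, 796L, 954L),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the question): set the columns in `cols`
# to NA for every row whose combination of values in those columns
# appears more than once, including the first occurrence.
blank_dupes <- function(df, cols) {
  key <- df[cols]
  # duplicated() alone skips the first occurrence; scanning from the
  # end as well marks every member of a duplicated group.
  dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
  df[dup, cols] <- NA
  df
}

# Apply the helper to each column pair in turn.
result <- mydf
for (pair in list(c("V1", "V2"), c("V3", "V4"))) {
  result <- blank_dupes(result, pair)
}
print(result)
```

Because the two pairs do not share columns, the order in which they are processed should not matter here. If the pairs overlapped, the NAs written by an earlier pass would be compared in later passes, so the flags would need to be computed from the original data first.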
