Reputation: 119
I want to find duplicates horizontally and keeping the uniques. Please help me with this. I am sharing a sample dataset. Hope this helps.
X <- c(1,2,3,4,5)
Y <- c("India","India","Philippines","Netherlands","France")
Z <- c("India","India","Netherlands","France","France")
S <- c("India","France","Netherlands","France","India")
TableTest <- data.frame(X,Y,Z,S)
TableTest
Input dataset
X Y Z S
1 1 India India India
2 2 India India France
3 3 Philippines Netherlands Netherlands
4 4 Netherlands France France
5 5 France France India
Expected Output
X Y Z S
1 1 India NA NA
2 2 India France NA
3 3 Philippines Netherlands NA
4 4 Netherlands France NA
5 5 France India NA
Please help.
Upvotes: 0
Views: 70
Reputation: 102625
Another base R option
TableTest[-1] <- do.call(rbind,lapply(apply(TableTest[-1],1,unique),`length<-`,ncol(TableTest)-1))
or a simpler version (thanks for advice by @Onyambu in the comments)
TableTest[-1] <- t(apply(TableTest[-1], 1, function(x)`length<-`(unique(x),ncol(TableTest[-1]))))
which gives
> TableTest
X Y Z S
1 1 India <NA> <NA>
2 2 India France <NA>
3 3 Philippines Netherlands <NA>
4 4 Netherlands France <NA>
5 5 France India <NA>
Upvotes: 2
Reputation: 4233
My solution:
TableTest[2:4] <- as.data.frame(t(apply(TableTest[2:4], 1, function(x) {
xo <- ifelse(!duplicated(x), x, NA_character_)
if (any(is.na(xo))) xo <- xo[!is.na(xo)]
length(xo) <- ncol(TableTest) - 1
xo
})))
Output
> TableTest
X Y Z S
1 1 India <NA> <NA>
2 2 India France <NA>
3 3 Philippines Netherlands <NA>
4 4 Netherlands France <NA>
5 5 France India <NA>
Upvotes: 1
Reputation: 196
I don't think you can do it by only using data.frames, because you're moving values across columns. But here's one way to do it using matrices:
X <- c(1,2,3,4,5)
Y <- c("India","India","Philippines","Netherlands","France")
Z <- c("India","India","Netherlands","France","France")
S <- c("India","France","Netherlands","France","India")
output <- apply(cbind(Y,Z,S), 1, function(row) {
rm_dup <- unique(row)
return(c(rm_dup, rep(NA_character_,
3 - length(rm_dup))))
})
t(output)
[,1] [,2] [,3]
[1,] "India" NA NA
[2,] "India" "France" NA
[3,] "Philippines" "Netherlands" NA
[4,] "Netherlands" "France" NA
[5,] "France" "India" NA
Upvotes: 0
Reputation: 160827
TableTest[,-1] <- as.data.frame(t(apply(TableTest[,-1], 1, function(a) { a <- replace(a, duplicated(a), NA_character_); a[ order(is.na(a)) ]; })))
TableTest
# X Y Z S
# 1 1 India <NA> <NA>
# 2 2 India France <NA>
# 3 3 Philippines Netherlands <NA>
# 4 4 Netherlands France <NA>
# 5 5 France India <NA>
Upvotes: 2