satty

Reputation: 23

Comparing columns and finding the values unique to each column in R

I have a 20 x 10 matrix and want to find the values that are unique to a single column, i.e. values that do not appear in any other column. A simple example would be a matrix like:

mat <- matrix(c("a","b","c","d","s","a","d","l","s","a","m","n"),ncol=3,dimnames=list(NULL,c("a","b","c")))

Looking like:

     a   b   c  
[1,] "a" "s" "s"
[2,] "b" "a" "a"
[3,] "c" "d" "m"
[4,] "d" "l" "n"

Using unique doesn't give what I want:

unique(c(mat))
#[1] "a" "b" "c" "d" "s" "l" "m" "n"

Desired result:

     a    b    c   
[1,] "NA" "NA" "NA"
[2,] "b"  "NA" "NA"
[3,] "c"  "NA" "m" 
[4,] "NA" "l"  "n" 

Upvotes: 1

Views: 108

Answers (1)

vrajs5

Reputation: 4126

Updated answer: what you actually want is to keep the values that are not duplicated anywhere in the matrix and replace everything else with NA.

mat = matrix(c("a","b","c","d","s","a","d","l","s","a","m","n"),
             ncol=3,dimnames=list(NULL,c("a","b","c")))
mat
     a   b   c  
[1,] "a" "s" "s"
[2,] "b" "a" "a"
[3,] "c" "d" "m"
[4,] "d" "l" "n"

There are two approaches. The first finds the values that occur exactly once and sets everything else to NA:

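# Values that occur exactly once: all values minus those flagged as duplicates
notDuplicated = setdiff(c(mat),c(mat[duplicated(c(mat))]))
# Replace every other cell with NA
mat[!mat %in% notDuplicated] = NA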
mat
     a   b   c  
[1,] NA  NA  NA 
[2,] "b" NA  NA 
[3,] "c" NA  "m"
[4,] NA  "l" "n"

The second finds the duplicated values and sets them to NA directly (starting again from the original mat):

# Values that occur more than once anywhere in the matrix
Duplicated = c(mat[duplicated(c(mat))])
mat[mat %in% Duplicated] = NA
mat
     a   b   c  
[1,] NA  NA  NA 
[2,] "b" NA  NA 
[3,] "c" NA  "m"
[4,] NA  "l" "n"
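
Both approaches treat a value as duplicated even when it is repeated only within the same column. If such a value should still count as unique to that column, one possible variation is sketched below. The helper name keep_column_unique is hypothetical (it is not part of the answer above), and the sketch assumes that "unique to a column" means "present in exactly one column":

```r
# Hypothetical helper: keep values that occur in exactly one column,
# even if they are repeated within that column; everything else becomes NA.
keep_column_unique <- function(m) {
  # Distinct non-NA values in the matrix
  vals <- unique(c(m))
  vals <- vals[!is.na(vals)]
  # For each value, count how many columns contain it at least once
  n_cols <- vapply(vals,
                   function(v) sum(colSums(m == v, na.rm = TRUE) > 0),
                   numeric(1))
  # Values present in more than one column are shared, so blank them out
  shared <- vals[n_cols > 1]
  m[m %in% shared] <- NA
  m
}

mat <- matrix(c("a","b","c","d","s","a","d","l","s","a","m","n"),
              ncol = 3, dimnames = list(NULL, c("a","b","c")))
res <- keep_column_unique(mat)
res
```

On the example matrix this gives the same result as the two approaches above, because no value is repeated within a column. The results differ only when a column contains the same value more than once.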

Upvotes: 2
