Rish

Reputation: 826

pairwise analysis in R

I have a large data frame in which, for every pair of individuals (rows), I need to find the columns where both rows have the same value.

Here is an example of the dataframe:

>data
     ID pos1234 pos1345 pos1456 pos1678
 1    1       C       A       C       G
 2    2       C       G       A       G
 3    3       C       A       G       A
 4    4       C       G       C       T

I transformed the dataframe into a pairwise matrix with:

apply(data, 2, combn, m=2)


      ID  pos1234 pos1345 pos1456 pos1678
 [1,] "1" "C"     "A"     "C"     "G"
 [2,] "2" "C"     "G"     "A"     "G"
 [3,] "1" "C"     "A"     "C"     "G"
 [4,] "3" "C"     "A"     "G"     "A"
 [5,] "1" "C"     "A"     "C"     "G"
 [6,] "4" "C"     "G"     "C"     "T"
 [7,] "2" "C"     "G"     "A"     "G"
 [8,] "3" "C"     "A"     "G"     "A"
 [9,] "2" "C"     "G"     "A"     "G"
[10,] "4" "C"     "G"     "C"     "T"
[11,] "3" "C"     "A"     "G"     "A"
[12,] "4" "C"     "G"     "C"     "T"

I am now having trouble identifying the columns that contain identical letters within each pair. For example, for the pair of individuals 1 and 2, the columns with identical letters are pos1234 and pos1678.

Would it be possible to get a data frame with just the identical letters for each pair of individuals?

Thanks in advance.

Upvotes: 0

Views: 63

Answers (1)

Frank

Reputation: 66819

You can pass a function to combn:

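# lapply() always returns a list, so lengths() counts the unique values per column
# (sapply() could simplify to a matrix if no column matched for a pair)
res0 <- combn(nrow(data), 2, FUN = function(x)
  names(data[-1])[ lengths(lapply(data[x,-1], unique)) == 1 ], simplify=FALSE)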

which gives

[[1]]
[1] "pos1234" "pos1678"

[[2]]
[1] "pos1234" "pos1345"

[[3]]
[1] "pos1234" "pos1456"

[[4]]
[1] "pos1234"

[[5]]
[1] "pos1234" "pos1345"

[[6]]
[1] "pos1234"

To see which pair each of the elements [[1]]..[[6]] corresponds to, call combn again on the IDs and use the result as names:

res <- setNames(res0, combn(data$ID, 2, paste, collapse="."))

which gives

$`1.2`
[1] "pos1234" "pos1678"

$`1.3`
[1] "pos1234" "pos1345"

$`1.4`
[1] "pos1234" "pos1456"

$`2.3`
[1] "pos1234"

$`2.4`
[1] "pos1234" "pos1345"

$`3.4`
[1] "pos1234"
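
If you want an actual data frame (as asked in the question) rather than a named list, one possible sketch is to build one row per pair and shared column, including the shared letter. This assumes the letter columns are character (not factor) and reconstructs the example data from the question; the names pairs, long, ID1, ID2, column and letter are just illustrative choices.

```r
# Example data reconstructed from the question (assumed to be character columns)
data <- data.frame(
  ID      = 1:4,
  pos1234 = c("C", "C", "C", "C"),
  pos1345 = c("A", "G", "A", "G"),
  pos1456 = c("C", "A", "G", "C"),
  pos1678 = c("G", "G", "A", "T"),
  stringsAsFactors = FALSE
)

# Each column of 'pairs' holds the row indices of one pair of individuals
pairs <- combn(nrow(data), 2)

long <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(k) {
  i <- pairs[1, k]
  j <- pairs[2, k]
  a <- unlist(data[i, -1])   # letters of the first individual, named by column
  b <- unlist(data[j, -1])   # letters of the second individual
  same <- a == b             # TRUE where both individuals share the letter
  data.frame(
    ID1    = rep(data$ID[i], sum(same)),
    ID2    = rep(data$ID[j], sum(same)),
    column = names(data)[-1][same],
    letter = unname(a[same]),
    stringsAsFactors = FALSE
  )
}))

print(long)
```

Pairs with no shared column simply contribute zero rows. On a very large data frame the number of pairs grows quadratically, so this may be slow; it is meant as a starting point, not a tuned solution.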

Upvotes: 1
