Reputation: 403
I have a list like this:
list1 = list(data.frame("Gene" = c("A","B","C","D","E"), "Sample" = "S1"),
             data.frame("Gene" = c("B","C","D","F","G"), "Sample" = "S2"),
             data.frame("Gene" = c("A","C","D","E","F"), "Sample" = "S3"))
names(list1) = c("S1","S2","S3")
I would like to report which Sample values are present for each Gene across the entire list1. For example:
$A
"S1","S3"
$B
"S1","S2"
$C
"S1","S2","S3"
$D
"S1","S2","S3"
$E
"S1","S3"
$F
"S2","S3"
$G
"S2"
There are no duplicated Gene values within any single data frame, but some Gene values are shared between data frames. For each Gene, I want to find out which Samples it is present in. Could someone help? Thank you.
Upvotes: 1
Views: 40
Reputation: 887203
We could use split from base R after rbinding the list elements:
with(do.call(rbind, list1), split(Sample, Gene))
#$A
#[1] "S1" "S3"
#$B
#[1] "S1" "S2"
#$C
#[1] "S1" "S2" "S3"
#$D
#[1] "S1" "S2" "S3"
#$E
#[1] "S1" "S3"
#$F
#[1] "S2" "S3"
#$G
#[1] "S2"
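To make the two steps in that one-liner easier to follow, here is a minimal self-contained sketch that builds the intermediate data frame explicitly. The names combined and samples_by_gene are made up for illustration, and the as.character() calls are an added assumption to cover R versions before 4.0, where data.frame() creates factor columns by default.

```r
# Example data copied from the question.
list1 <- list(
  S1 = data.frame(Gene = c("A", "B", "C", "D", "E"), Sample = "S1"),
  S2 = data.frame(Gene = c("B", "C", "D", "F", "G"), Sample = "S2"),
  S3 = data.frame(Gene = c("A", "C", "D", "E", "F"), Sample = "S3")
)

# Step 1: stack the three data frames into a single 15-row data frame.
# 'combined' is an illustrative name, not part of the original answer.
combined <- do.call(rbind, list1)

# Guard against factor columns (the default before R 4.0).
combined$Gene <- as.character(combined$Gene)
combined$Sample <- as.character(combined$Sample)

# Step 2: split the Sample column into one character vector per Gene.
samples_by_gene <- split(combined$Sample, combined$Gene)

# Look up the samples for a single gene.
samples_by_gene[["C"]]
```

Because the result is a named list, an individual gene can be looked up with samples_by_gene[["C"]] or samples_by_gene$C.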
Upvotes: 0
Reputation: 20463
If you would prefer the output in more of a tibble or data.frame format, you can use:
library(tidyverse)
bind_rows(list1) %>%
  group_by(Gene) %>%
  summarise(Samples = toString(Sample))
#> # A tibble: 7 x 2
#> Gene Samples
#> <chr> <chr>
#> 1 A S1, S3
#> 2 B S1, S2
#> 3 C S1, S2, S3
#> 4 D S1, S2, S3
#> 5 E S1, S3
#> 6 F S2, S3
#> 7 G S2
Or you could nest them for further processing:
bind_rows(list1) %>%
  group_by(Gene) %>%
  nest()
#> # A tibble: 7 x 2
#> Gene data
#> <chr> <list>
#> 1 A <tibble [2 x 1]>
#> 2 B <tibble [2 x 1]>
#> 3 C <tibble [3 x 1]>
#> 4 D <tibble [3 x 1]>
#> 5 E <tibble [2 x 1]>
#> 6 F <tibble [2 x 1]>
#> 7 G <tibble [1 x 1]>
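If loading the tidyverse is not an option, the comma-separated table can also be approximated in base R. This is only a sketch of an alternative, not the method used in this answer: it swaps in aggregate() with toString(), and the names combined and samples_df are hypothetical.

```r
# Example data copied from the question.
list1 <- list(
  S1 = data.frame(Gene = c("A", "B", "C", "D", "E"), Sample = "S1"),
  S2 = data.frame(Gene = c("B", "C", "D", "F", "G"), Sample = "S2"),
  S3 = data.frame(Gene = c("A", "C", "D", "E", "F"), Sample = "S3")
)

# Stack the list elements into one data frame (base R counterpart of bind_rows()).
combined <- do.call(rbind, list1)

# One row per Gene; toString() collapses each group's samples
# into a single comma-separated string such as "S1, S3".
samples_df <- aggregate(Sample ~ Gene, data = combined, FUN = toString)

samples_df
```

Note that the collapsed column keeps the name Sample here rather than Samples, and the result is a plain data.frame instead of a tibble.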
Upvotes: 0
Reputation: 79238
You can first use do.call(rbind, ...) to combine the list into one data frame, then unstack that data frame:
unstack(do.call(rbind, list1), Sample ~ Gene)
$A
[1] "S1" "S3"
$B
[1] "S1" "S2"
$C
[1] "S1" "S2" "S3"
$D
[1] "S1" "S2" "S3"
$E
[1] "S1" "S3"
$F
[1] "S2" "S3"
$G
[1] "S2"
Upvotes: 2