Reputation: 377
I have a list of gene IDs along with their sequences in R.
$2435
[1]"ATGCGGGCGGGGGTCGTCGA"
$2435
[1]"ATGCGGCGCGCGCGCTATATACGC"
$2435
[1]"ATGCGGCGCCTCTCATCGCGGGGG"
I want to combine the sequences with the same gene IDs in that list in R.
$2435
[1]"ATGCGGGCGGGGGTCGTCGAATGCGGCGCGCGCGCTATATACGCATGCGGCGCCTCTCATCGCGGGGG"
Upvotes: 2
Views: 88
Reputation: 12819
Bonus:
For a dataframe output, use this:
aggregate(unlist(A), by=list(id=names(A)), paste, collapse="")
Where A
is you list.
Using @Ananda's A
, I get this:
id x
1 10 FFFFGGGGHHHHIIII
2 12 AAAABBBBCCCCDDDDXXXXXXXXXXXXXXXXXXXXXXX
3 34 GGGG
Upvotes: 2
Reputation: 48211
l <- list("A" = "ABC", "B" = "XYX", "A" = "DEF", "C" = "YZY", "A" = "GHI")
tapply(l, names(l), paste, collapse = "", simplify = FALSE)
# $A
# [1] "ABCDEFGHI"
#
# $B
# [1] "XYX"
#
# $C
# [1] "YZY"
Upvotes: 2
Reputation: 193517
Use lapply
after matching the names with unique
. Here's some sample data:
A <- list("12" = "AAAABBBBCCCCDDDD",
"34" = "GGGG",
"12" = "XXXXXXXXXXXXXXXXXXXXXXX",
"10" = "FFFFGGGG",
"10" = "HHHHIIII")
A
# $`12`
# [1] "AAAABBBBCCCCDDDD"
#
# $`34`
# [1] "GGGG"
#
# $`12`
# [1] "XXXXXXXXXXXXXXXXXXXXXXX"
#
# $`10`
# [1] "FFFFGGGG"
#
# $`10`
# [1] "HHHHIIII"
Subset the related names
and paste
them together.
lapply(unique(names(A)), function(x) paste(A[names(A) %in% x], collapse = ""))
# [[1]]
# [1] "AAAABBBBCCCCDDDDXXXXXXXXXXXXXXXXXXXXXXX"
#
# [[2]]
# [1] "GGGG"
#
# [[3]]
# [1] "FFFFGGGGHHHHIIII"
Upvotes: 2