Reputation: 3574
Suppose I have a string x like so:
x <- "CTTTANNNNNNNYG"
I would like to replace each letter in x with a different string that may not be of the same length.
a <- c("A","C","G","T","W","S","M","K","R","Y","B","D","H","V","N")
b <- c("A","C","G","T","(A|T)","(C|G)","(A|C)","(G|T)","(A|G)","(C|T)","(C|G|T)","(A|G|T)","(A|C|T)","(A|C|G)","(A|C|G|T)")
If I wanted to replace the letters in vector a with the corresponding ones in vector b, I would want to manipulate string x into:
"CTTTA(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(C|T)G"
I've tried using mapply(gsub, a,b,x) and str_replace() to no avail. Any help would be appreciated.
Upvotes: 1
Views: 52
Reputation: 24480
Since the replacements are "fixed" and each involves just one letter, you can achieve the same result without using either regex or any additional packages. For instance:
vapply(strsplit(x,"",fixed=TRUE),function(z) paste(setNames(b,a)[z],collapse=""),"")
#[1] "CTTTA(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(C|T)G"
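The one-liner above can be unpacked into separate steps, which may make the lookup easier to follow. This is only a sketch of the same idea; the names lookup, chars and translate are illustrative and not part of the original answer. It assumes every character in the input appears in a; anything else would come out as the text "NA".

```r
a <- c("A","C","G","T","W","S","M","K","R","Y","B","D","H","V","N")
b <- c("A","C","G","T","(A|T)","(C|G)","(A|C)","(G|T)","(A|G)","(C|T)",
       "(C|G|T)","(A|G|T)","(A|C|T)","(A|C|G)","(A|C|G|T)")

# Named vector: names are the single letters, values are their replacements
lookup <- setNames(b, a)

# Hypothetical helper: translate every element of a character vector
translate <- function(x) {
  # Split each string into its single characters
  chars <- strsplit(x, "", fixed = TRUE)
  # Look each character up by name, then glue the pieces back together
  vapply(chars, function(z) paste(lookup[z], collapse = ""), character(1))
}

translate(c("CTTTANNNNNNNYG", "WS"))
```

Because strsplit and vapply both work element by element, the same helper should handle a whole vector of strings in one call.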
Upvotes: 4
Reputation: 206187
If you wanted to do this with base functions, you need to basically do each of the replacements sequentially (gsub isn't vectorized in this way). Here's one way to do that:
Reduce(
  function(x, replace) {
    gsub(replace$pattern, replace$value, x)
  },
  Map(function(a, b) list(pattern = a, value = b), a, b),
  init = x
)
# [1] "CTTTA(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(C|T)G"
We use Map to make pairs of match/replace values and then sequentially apply them to the string with Reduce.
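For comparison, the same sequential idea can be written as a plain loop. This is a minimal sketch rather than the answer's own code; result is an illustrative name, and fixed = TRUE is an added assumption that the patterns should be treated as literal characters.

```r
a <- c("A","C","G","T","W","S","M","K","R","Y","B","D","H","V","N")
b <- c("A","C","G","T","(A|T)","(C|G)","(A|C)","(G|T)","(A|G)","(C|T)",
       "(C|G|T)","(A|G|T)","(A|C|T)","(A|C|G)","(A|C|G|T)")
x <- "CTTTANNNNNNNYG"

result <- x
for (i in seq_along(a)) {
  # Apply one pattern/replacement pair at a time, feeding each
  # result into the next step (the same thing Reduce does above)
  result <- gsub(a[i], b[i], result, fixed = TRUE)
}
result
```

Sequential replacement is safe here because the replacement strings only contain A, C, G and T, which map to themselves. If a later pattern could match text inserted by an earlier replacement, the result would depend on the order of the pairs.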
Upvotes: 2
Reputation: 886948
We can use mgsub from the qdap package:
library(qdap)
mgsub(a, b, x)
#[1] "CTTTA(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(A|C|G|T)(C|T)G"
Upvotes: 4