Reputation: 679
I have the following string:
string <- c("ABDSFGHIJLKOP")
and list of substrings:
sub <- c("ABDSF", "SFGH", "GHIJLKOP")
I would like to wrap each sub match in < and >, thus getting:
<ABD><SF><GH><GHIJKOP>
I tried the following code, which loops over the list of patterns, but as soon as ABDSF is matched, SFGH is no longer recognised because of the inserted < > characters. Does anybody have a better idea?
library(stringr)
library(dplyr)
library(magrittr)
string <- c("ABDSFGHIJLKOP")
sub <- c("ABDSF", "SFGH", "GHIJLKOP")
for (s in sub){
string %<>% str_replace_all(., s, paste0('<', s,'>'))
}
print(string)
Result: [1] "<ABDSF><GHIJLKOP>"
EDIT: The problem with the above code is that once the < > characters are inserted after the first match, the second string SFGH is no longer recognised, because the string is now:
<ABDSF>GHIJLKOP.
So I am looking for a way to match the substrings ignoring the <> characters.
Upvotes: 0
Views: 211
Reputation: 269644
Place [<>]* between successive characters of each element of subs and then perform the substitutions with those patterns. No packages are used.
# test input
string <- "ABDSFGHIJLKOP"
subs <- c("ABDSF", "SFGH", "GHIJLKOP")
# put [<>]* after every character that is followed by another character
pats <- paste0("(", gsub("(.)(?=.)", "\\1[<>]*", subs, perl = TRUE), ")")
s <- string
for(p in pats) s <- gsub(p, "<\\1>", s)
s
## [1] "<ABD<SF><GH>IJLKOP>"
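To make the patterns concrete, here is a small sketch that prints the first element of pats when [<>]* is placed after every character except the last. It rebuilds pats from scratch, so it runs on its own; the optional [<>]* between characters is what lets a later pattern match across brackets inserted by an earlier one.

```r
subs <- c("ABDSF", "SFGH", "GHIJLKOP")

# Append "[<>]*" to every character that is followed by another character,
# then wrap the whole pattern in a capture group.
pats <- paste0("(", gsub("(.)(?=.)", "\\1[<>]*", subs, perl = TRUE), ")")

print(pats[1])
## [1] "(A[<>]*B[<>]*D[<>]*S[<>]*F)"
```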
Regarding the comment below: if I understand correctly, we could add the lookbehind (?<=[EF]) to the pattern, giving:
pats <- paste0("(", gsub("(?<=[EF])(.)(?=.)", "\\1[<>]*", subs, perl = TRUE), ")")
s <- string
for(p in pats) s <- gsub(p, "<\\1>", s)
s
## [1] "<ABDSF><GHIJLKOP>"
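For illustration, printing the patterns built with the lookbehind shows why this variant gives a different result: [<>]* is now only inserted after a character that directly follows an E or F, so only the G in "SFGH" is affected. With the bracket already sitting between F and G after the first substitution, the second pattern no longer matches.

```r
subs <- c("ABDSF", "SFGH", "GHIJLKOP")

# Only characters preceded by E or F (and followed by another character)
# get "[<>]*" appended.
pats <- paste0("(", gsub("(?<=[EF])(.)(?=.)", "\\1[<>]*", subs, perl = TRUE), ")")

print(pats)
## [1] "(ABDSF)"     "(SFG[<>]*H)" "(GHIJLKOP)"
```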
Upvotes: 3
Reputation: 764
#R version 3.3.2
library(stringr)
library(magrittr)
string <- c("ABDSFGHIJLKOP")
sub <- c("ABDSF", "SFGH", "GHIJLKOP")
result <- c("")
for (s in sub){
temp <- str_extract(string, s)
# str_extract() returns NA (not NULL) when there is no match
if (!is.na(temp)) {
temp<- paste("<",temp,">",sep = "")
result <- paste(result,temp,sep = "")
}
}
print(result)
Result :
[1] "<ABDSF><SFGH><GHIJLKOP>"
Tested in Rextester
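A related sketch, offered as an assumption about what is wanted rather than as part of the answer above: match every substring against the unmodified string, record where each match starts and ends, and only then insert the brackets. This keeps unmatched characters and the original positions. The function name mark_matches is hypothetical, and the substrings are treated as literal text (fixed = TRUE), not as regular expressions.

```r
# Hypothetical helper: wrap every literal occurrence of each element of
# `subs` in < and >, using positions taken from the original string.
mark_matches <- function(string, subs) {
  chars <- strsplit(string, "", fixed = TRUE)[[1]]
  n_open  <- integer(length(chars))  # how many "<" go before each character
  n_close <- integer(length(chars))  # how many ">" go after each character

  for (s in subs) {
    hits <- gregexpr(s, string, fixed = TRUE)[[1]]
    if (hits[1] == -1) next          # no match for this substring
    starts <- as.integer(hits)
    ends   <- starts + attr(hits, "match.length") - 1
    n_open[starts] <- n_open[starts] + 1
    n_close[ends]  <- n_close[ends] + 1
  }

  # strrep() is vectorised over `times`, so a count of 0 yields "".
  paste0(strrep("<", n_open), chars, strrep(">", n_close), collapse = "")
}

mark_matches("ABDSFGHIJLKOP", c("ABDSF", "SFGH", "GHIJLKOP"))
## [1] "<ABD<SF><GH>IJLKOP>"
```

Because the positions come from the original string, the order in which the substrings are processed does not affect the result.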
Upvotes: 0