Reputation: 7989
In a string
string="aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"
I would like to remove every part that runs from START to STOP (markers included), yielding
"aaaaaaaaacccccccceeeeeee"
If I try
gsub("START(.*)STOP","",string)
I get
"aaaaaaaaaeeeeeee"
instead.
What would be the correct way to do this, allowing for multiple occurrences of START and STOP?
Upvotes: 3
Views: 125
Reputation: 3622
Not nearly as elegant as Ananda's answer, but here is another way using the stringr and plyr packages.
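library(stringr)
library(plyr)
# Locate every marker; ldply() turns the list of match matrices into a data frame
starts <- ldply(str_locate_all(string, 'START'))[, 1]
ends <- ldply(str_locate_all(string, 'STOP'))[, 2]
# Assumes each START has a matching STOP after it.
# Cut right to left so the earlier positions stay valid.
for (i in rev(seq_along(starts))) {
  str_sub(string, starts[i], ends[i]) <- ''
}
string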
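For comparison, a location-based approach can also be sketched in base R without extra packages. This is only an illustration, assuming the markers are properly paired; the variable names m, removed, kept and result are made up for the example.

```r
string <- "aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"

# Find every START...STOP span, using a lazy quantifier so each pair matches separately
m <- gregexpr("START.*?STOP", string)

# The spans that will be dropped
removed <- regmatches(string, m)[[1]]

# invert = TRUE returns the pieces between the matches; glue them back together
kept <- regmatches(string, m, invert = TRUE)[[1]]
result <- paste(kept, collapse = "")
print(result)
```

Keeping removed around makes it easy to inspect what was cut before trusting the result.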
Upvotes: 0
Reputation: 193527
Add a ? after the .* to make the match lazy, so it stops at the nearest STOP rather than the last one.
gsub("START.*?STOP", "", string)
# [1] "aaaaaaaaacccccccceeeeeee"
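To see the difference side by side, here is a small sketch comparing the greedy and lazy patterns on the string from the question. The variable names greedy, lazy and lazy_perl are just for illustration; perl = TRUE is shown as an optional alternative engine, not something the question requires.

```r
string <- "aaaaaaaaaSTARTbbbbbbbbbbSTOPccccccccSTARTddddddddddSTOPeeeeeee"

# Greedy: .* runs on to the last STOP, so everything from the first START
# to the final STOP is removed as a single match.
greedy <- gsub("START.*STOP", "", string)

# Lazy: .*? stops at the nearest STOP, so each START...STOP pair is its own match.
lazy <- gsub("START.*?STOP", "", string)

# The same lazy pattern using the PCRE engine
lazy_perl <- gsub("START.*?STOP", "", string, perl = TRUE)

print(greedy)
print(lazy)
```

The capture group in the original attempt is not needed, since nothing refers back to it in the replacement.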
Upvotes: 3