Reputation: 303
I am trying to isolate a portion of a string in R. The strings have the form ABC_constantStuff_ABC_randomStuff, and ABC is what I am trying to extract. ABC is unknown and can be 1-3 characters long. I've been trying grep and gsub but am unsure how to specify the regular expression (pattern) in the following:
str <- 'GDP\" title=\"GDP - News\"></a>"'
symbol <- gsub(pattern,'',str)
Here GDP is unknown and can be 1-3 characters long, and \" title=\" is constant in every string. I would like to remove \" title=\"GDP - News\"></a>" so that only GDP remains.
Thank you in advance for your help.
Upvotes: 3
Views: 3866
Reputation: 368241
A simple one is
R> gsub("^([A-Z]*)_.*", "\\1", "ABC_constantStuff_ABC_randomStuff")
[1] "ABC"
R>
which captures the run of uppercase letters before the first _ and replaces the whole string with it.
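Since the question says the prefix is 1-3 characters long, the pattern could be tightened with a bounded quantifier. A minimal sketch, assuming the prefix consists of uppercase letters only (the names x and prefix are illustrative, not from the question):

```r
x <- "ABC_constantStuff_ABC_randomStuff"
# {1,3} limits the capture to one to three uppercase letters before the first "_"
prefix <- sub("^([A-Z]{1,3})_.*$", "\\1", x)
prefix
```

Note that sub returns the input unchanged when the pattern does not match, so a longer or lowercase prefix would come back as the full string.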
Another one, assuming _ is your separator, is
R> strsplit( "ABC_constantStuff_ABC_randomStuff", "_")[[1]][c(1,3)]
[1] "ABC" "ABC"
R>
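For the second form in the question, the same capture-group idea may carry over by anchoring on the constant " title=" text instead of _. This is only a sketch, assuming the symbol is 1-3 uppercase letters at the start of the string (s and symbol are illustrative names):

```r
# Single quotes let the double quotes appear literally in the string
s <- 'GDP" title="GDP - News"></a>"'
# Capture 1-3 uppercase letters that are followed by the constant '" title="' text
symbol <- sub('^([A-Z]{1,3})" title=".*$', "\\1", s)
symbol
```

If the symbol can contain other characters, the character class [A-Z] would need to be widened accordingly.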
Upvotes: 4