Reputation: 61
I would like to capture the characters between the 1st and 2nd occurrences of '_' in this string:
C2_Sperd20A_XXX_20170301_20170331
That is:
Sperd20A
Thank you
Upvotes: 0
Views: 2042
Reputation: 887118
We can use sub. The pattern matches zero or more characters that are not a _ ([^_]*) from the start (^) of the string, followed by a _, then one or more characters that are not a _ ([^_]+), captured as a group ((...)), followed by a _ and any remaining characters. The whole match is then replaced with the backreference (\\1) to the captured group.
sub("^[^_]*_([^_]+)_.*", "\\1", str1)
#[1] "Sperd20A"
Or between the 2nd and 3rd _
sub("^([^_]*_){2}([^_]+).*", "\\2", str1)
#[1] "XXX"
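The two sub calls above differ only in how many leading fields are skipped, so they can be folded into one helper. This is a minimal sketch, not part of the original answer: the function name field_between is hypothetical, and it assumes perl = TRUE so that the non-capturing group (?:...) is available. Because sub returns its input unchanged when the pattern does not match, the sketch checks with grepl first and returns NA in that case.

```r
# Hypothetical helper: return the text between the n-th and (n+1)-th "_".
field_between <- function(x, n) {
  # Skip n "_"-terminated fields, capture the next field,
  # and require a closing "_" after it.
  pattern <- sprintf("^(?:[^_]*_){%d}([^_]*)_.*$", n)
  matched <- grepl(pattern, x, perl = TRUE)
  # sub() leaves non-matching input untouched, so return NA there instead.
  ifelse(matched, sub(pattern, "\\1", x, perl = TRUE), NA_character_)
}

str1 <- "C2_Sperd20A_XXX_20170301_20170331"
field_between(str1, 1)
#[1] "Sperd20A"
field_between(str1, 2)
#[1] "XXX"
```

With n = 4 there is no closing _ after the last field, so the call returns NA rather than the whole string.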
Or another option is strsplit
strsplit(str1, "_")[[1]][2]
#[1] "Sperd20A"
If it is between 2nd and 3rd _
strsplit(str1, "_")[[1]][3]
#[1] "XXX"
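strsplit returns a list with one element per input string, so the [[1]] indexing above only covers a single string. A sketch for a character vector follows; the second input string is invented purely for illustration, and fixed = TRUE is an optional choice that treats "_" as a literal delimiter rather than a regular expression.

```r
strs <- c("C2_Sperd20A_XXX_20170301_20170331",
          "D7_Abc123_YYY_20170401_20170430")  # second string is made up

# One character vector of pieces per input string.
parts <- strsplit(strs, "_", fixed = TRUE)

# Take the 2nd piece of each string; an out-of-range index yields NA.
second <- vapply(parts, function(p) p[2], character(1))
second
#[1] "Sperd20A" "Abc123"
```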
###data
str1 <- "C2_Sperd20A_XXX_20170301_20170331"
Upvotes: 8
Reputation: 3053
A good option is to use the stringr package:
library(stringr)
s <- "C2_Sperd20A_XXX_20170301_20170331"
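# (?<=_) is a lookbehind: the match must be preceded by "_"
# (?=_) is a lookahead: the match must be followed by "_"
# .*? is lazy, so it stops at the first "_"; str_extract returns only the first match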
str_extract(string = s, pattern = "(?<=_)(.*?)(?=_)")
#[1] "Sperd20A"
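The same lookaround pattern also works without stringr, as long as perl = TRUE is set, since base R's default regex engine does not support lookbehind. A sketch using regmatches with regexpr for the first match and gregexpr for all matches (the variable names pat and all_fields are just for illustration):

```r
s <- "C2_Sperd20A_XXX_20170301_20170331"
pat <- "(?<=_).*?(?=_)"

# First match only, comparable to str_extract().
regmatches(s, regexpr(pat, s, perl = TRUE))
#[1] "Sperd20A"

# Every field that has a "_" on both sides; the last field is
# excluded because no "_" follows it.
all_fields <- regmatches(s, gregexpr(pat, s, perl = TRUE))[[1]]
all_fields
#[1] "Sperd20A" "XXX"      "20170301"
```

Indexing all_fields[2] then gives the text between the 2nd and 3rd _.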
Upvotes: 1