Reputation: 1162
I have a lot of strings that all look similar, e.g.:
x1= "Aaaa_11111_AA_Whatiwant.txt"
x2= "Bbbb_11111_BBBB_Whatiwanttoo.txt"
x3= "Ccc_22222_CC_Whatiwa.txt"
I would like to extract Whatiwant, Whatiwanttoo, and Whatiwa in R.
I started with substring(x1,15,23), but I don't know how to generalize it. How can I always extract the part between the last _ and the .txt?
Thank you!
Upvotes: 0
Views: 422
Reputation: 170
You can also use the stringr library, with functions like str_extract (among many other possibilities), in case you would rather not get deep into regular expressions. It is extremely easy to use:
x1= "Aaaa_11111_AA_Whatiwant.txt"
x2= "Bbbb_11111_BBBB_Whatiwanttoo.txt"
x3= "Ccc_22222_CC_Whatiwa.txt"
library(stringr)
patron <- "(What)[a-z]+"
str_extract(x1, patron)
## [1] "Whatiwant"
str_extract(x2, patron)
## [1] "Whatiwanttoo"
str_extract(x3, patron)
## [1] "Whatiwa"
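Note that the pattern above relies on every target starting with What. A minimal sketch of a variant that keys on position instead (the last underscore-free run before the extension), assuming every string ends in .txt. It uses base R's regexpr and regmatches with a Perl-style lookahead so it runs without extra packages; the same pattern should also work with str_extract. The names x, m and last_part are only for illustration:

```r
x <- c("Aaaa_11111_AA_Whatiwant.txt",
       "Bbbb_11111_BBBB_Whatiwanttoo.txt",
       "Ccc_22222_CC_Whatiwa.txt")

# [^_]+ matches a run of characters without underscores; the lookahead
# (?=\\.txt$) requires ".txt" at the end of the string to follow,
# without making it part of the match.
m <- regexpr("[^_]+(?=\\.txt$)", x, perl = TRUE)

# regmatches() returns only the matched pieces; strings that do not
# match are dropped from the result.
last_part <- regmatches(x, m)
print(last_part)
```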
Upvotes: 0
Reputation: 21443
You can use regexp capture groups:
gsub(".*_([^_]*)\\.txt","\\1",x1)
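Since the question mentions many strings, here is a hedged sketch of the same capture-group idea applied to a whole vector at once, next to an alternative that avoids capture groups. It assumes every string ends in .txt and contains at least one underscore; the names x, from_regex and from_split are made up for the example:

```r
x <- c("Aaaa_11111_AA_Whatiwant.txt",
       "Bbbb_11111_BBBB_Whatiwanttoo.txt",
       "Ccc_22222_CC_Whatiwa.txt")

# sub() is vectorized. The greedy ".*_" consumes everything up to the
# last "_", the group ([^_]*) captures what follows, and "\\1" keeps
# only that group. "$" anchors ".txt" to the end of the string.
from_regex <- sub(".*_([^_]*)\\.txt$", "\\1", x)

# Alternative without capture groups: drop the extension, split on "_",
# and keep the last piece of each string.
stems <- tools::file_path_sans_ext(x)
pieces <- strsplit(stems, "_", fixed = TRUE)
from_split <- vapply(pieces, function(p) p[length(p)], character(1))

print(from_regex)
print(from_split)
```

Both approaches give the same result on these three inputs. A string that does not match the regex is returned unchanged by sub(), so it may be worth checking for that case on real data.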
Upvotes: 2