Reputation: 107
I have some strings in a vector like:
x <- c("ROH_Pete_NA_1_2017.zip",
"ROH_Annette_SA_2_2016.zip",
"ROH_Steve_MF_4_2015.zip")
I need to extract the names from these strings (Pete, Annette, Steve).
I would like to do this in a loop with str_extract().
All strings start with ROH_,
but the names differ in length, and so do the parts that follow them.
I would prefer str_extract(), but I'm also happy with other solutions.
Thank you for your help.
Upvotes: 3
Views: 3106
Reputation: 17648
Try the stringi package:
library(stringi)
stri_split_fixed(x, "_", simplify = TRUE)[, 2]
[1] "Pete" "Annette" "Steve"
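For comparison, the same split-and-pick idea can be sketched in base R without extra packages. This is only an illustration and not part of the answer above; it assumes every string contains at least two underscore-separated fields, and the names parts and name_from_split are made up for the example.

```r
# Sample vector from the question
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# Split each string on a literal underscore; the result is a list
# with one character vector of fields per input string
parts <- strsplit(x, "_", fixed = TRUE)

# Pick the second field of each element; vapply() enforces that
# exactly one character value comes back per string
name_from_split <- vapply(parts, function(p) p[2], character(1))
name_from_split
```

Because strsplit() is vectorised over x, no explicit loop is needed.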
Upvotes: 1
Reputation: 81713
Here is a solution with str_extract:
library(stringr)
str_extract(x, "(?<=_).+?(?=_)")
# [1] "Pete" "Annette" "Steve"
You can also use gsub in base R:
gsub("^.+?_|_.+$", "", x)
# [1] "Pete" "Annette" "Steve"
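The pattern (?<=_).+?(?=_) relies on lookarounds: (?<=_) requires an underscore just before the match, (?=_) requires one just after it, and the lazy .+? stops at the first underscore it can. As a rough sketch, assuming the same vector x, the pattern also works in base R when Perl-style regular expressions are switched on; the name name_from_lookaround is only illustrative.

```r
# Sample vector from the question
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# regexpr() locates the first match in each string; perl = TRUE is
# needed because lookbehind/lookahead are PCRE features
first_match <- regexpr("(?<=_).+?(?=_)", x, perl = TRUE)

# regmatches() pulls out the matched substrings
name_from_lookaround <- regmatches(x, first_match)
name_from_lookaround
```

Both regexpr() and str_extract() are vectorised, so the whole vector is processed in one call and no loop is required.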
Upvotes: 6
Reputation: 11490
You can use base function sub.
sub("ROH_([[:alpha:]]+)_.*", "\\1", x, perl = TRUE)
[1] "Pete" "Annette" "Steve"
Upvotes: 0
Reputation: 18425
You are probably better off with str_match, as this allows capture groups.
So you can add the _ on either side for context but only return the bit you are interested in. The (\\w+?) is the capture group, and str_match returns this as the second column, hence the [,2] (the first column is what str_extract would return).
library(stringr)
str_match(x,"ROH_(\\w+?)_")[,2]
[1] "Pete" "Annette" "Steve"
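The capture-group idea can also be sketched in base R with regexec() and regmatches(). This is an assumption-laden illustration rather than part of the answer above: it swaps (\\w+?) for ([^_]+), which likewise stops at the next underscore, and the names matches and name_from_group are made up for the example.

```r
# Sample vector from the question
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# regexec() records the full match and each capture group;
# regmatches() returns them as a list with one vector per string:
# element 1 is the full match (e.g. "ROH_Pete_"), element 2 is group 1
matches <- regmatches(x, regexec("ROH_([^_]+)_", x))

# Keep only the first capture group from each string
name_from_group <- vapply(matches, function(m) m[2], character(1))
name_from_group
```

Here the second element of each list entry plays the role of the second column returned by str_match.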
Upvotes: 5