7660

Reputation: 107

R: use str_extract (stringr) to extract a string between "_"

I have some strings in a vector like:

x <- c("ROH_Pete_NA_1_2017.zip",
   "ROH_Annette_SA_2_2016.zip",
   "ROH_Steve_MF_4_2015.zip")

I need to extract the names (Pete, Annette, Steve) from these strings. I would like to do this in a loop, using str_extract().

All strings start with ROH_, but the names differ in length, and so do the parts that follow them.

I would like to use str_extract(), but I'm also happy with other solutions.

Thank you for your help.

Upvotes: 3

Views: 3106

Answers (4)

Roman

Reputation: 17648

Try the stringi package:

library(stringi)
# Split each string on "_" into a character matrix and take the second column
stri_split_fixed(x, "_", simplify = TRUE)[, 2]
# [1] "Pete"    "Annette" "Steve"
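
For readers without stringi, the same split-and-pick idea can be sketched in base R. This is only an illustration, not part of the original answer; the variable names `parts` and `second_field` are made up here, and it assumes every string contains at least two "_"-separated fields.

```r
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# strsplit() is vectorised, so no explicit loop is needed;
# fixed = TRUE treats "_" as a literal string, not a regex
parts <- strsplit(x, "_", fixed = TRUE)

# Take the second piece of every split string
second_field <- vapply(parts, `[`, character(1), 2)
print(second_field)
# [1] "Pete"    "Annette" "Steve"
```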

Upvotes: 1

Sven Hohenstein

Reputation: 81713

Here is a solution with str_extract:

library(stringr)
str_extract(x, "(?<=_).+?(?=_)")
# [1] "Pete"    "Annette" "Steve"  

You can also use gsub in base R:

gsub("^.+?_|_.+$", "", x)
# [1] "Pete"    "Annette" "Steve"  
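
To make the lookaround pattern easier to read, here is a commented sketch of the same regex run through base R's regexpr() and regmatches() instead of stringr. It is an illustration under the assumption that every string contains a match; `first_field` is a name introduced here.

```r
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# (?<=_)  lookbehind: the match must be preceded by "_" (not included in the match)
# .+?     lazy: take as few characters as possible
# (?=_)   lookahead: the match must be followed by "_" (not included in the match)
# Lookarounds need the PCRE engine, hence perl = TRUE.
m <- regexpr("(?<=_).+?(?=_)", x, perl = TRUE)

# regexpr() reports only the first match per string, like str_extract()
first_field <- regmatches(x, m)
print(first_field)
# [1] "Pete"    "Annette" "Steve"
```

Note that regmatches() silently drops strings without a match, whereas str_extract() returns NA for them.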

Upvotes: 6

Andre Elrico

Reputation: 11490

You can use the base function sub:

sub("ROH_([[:alpha:]]+)_.*", "\\1", x, perl = TRUE)
# [1] "Pete"    "Annette" "Steve"

Upvotes: 0

Andrew Gustar

Reputation: 18425

You are probably better off with str_match, as it supports capture groups. You can include the _ on either side for context but return only the part you are interested in. The (\\w+?) is the capture group, and str_match returns it as the second column, hence the [,2] (the first column is the full match, which is what str_extract would return).

library(stringr)
str_match(x, "ROH_(\\w+?)_")[, 2]
# [1] "Pete"    "Annette" "Steve"
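
The lazy quantifier in (\\w+?) matters because \\w also matches digits and "_". The following base R sketch compares the lazy and greedy versions; `first_group` is a hypothetical helper written for this illustration (it is not a stringr function), built on regexec() and regmatches().

```r
x <- c("ROH_Pete_NA_1_2017.zip",
       "ROH_Annette_SA_2_2016.zip",
       "ROH_Steve_MF_4_2015.zip")

# Hypothetical helper: return capture group 1 of the first match in each string
first_group <- function(pattern, text) {
  # Each list element is c(full match, group 1), or character(0) if no match
  m <- regmatches(text, regexec(pattern, text, perl = TRUE))
  # v[2] is group 1; it is NA when there was no match
  vapply(m, function(v) v[2], character(1))
}

# Lazy: stops at the first "_" after the name
lazy <- first_group("ROH_(\\w+?)_", x)
print(lazy)
# [1] "Pete"    "Annette" "Steve"

# Greedy: \\w+ runs on through "_" and digits up to the last "_"
greedy <- first_group("ROH_(\\w+)_", x)
print(greedy)
# [1] "Pete_NA_1"    "Annette_SA_2" "Steve_MF_4"
```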

Upvotes: 5
