littleworth

Reputation: 5169

How to extract multiple substrings in a string using stringr regex

I have this string:

mystring <- "HMSC-bm_in_ALL_CELLTYPES.distal"

I want to extract the two substrings marked by the brackets below:

[HMSC-bm]_in_ALL_CELLTYPES.[distal]

So in the end it will yield a vector with two values: HMSC-bm and distal. How can I do it? I tried this but failed:

> stringr::str_extract(mystring,"\\([\\w-]+\\)_in_ALL_CELLTYPES\\.\\([\\w+]\\)")
[1] NA

Upvotes: 1

Views: 1160

Answers (2)

www

Reputation: 39174

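We can split the string on the literal separator "_in_ALL_CELLTYPES.", passing fixed = TRUE so that the dot is not treated as a regex wildcard:

strsplit(mystring, split = "_in_ALL_CELLTYPES.", fixed = TRUE)[[1]]
[1] "HMSC-bm" "distal"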

Upvotes: 2

neilfws

Reputation: 33802

I'd use str_match:

library(stringr)
mymatch <- str_match(mystring, "^(.*?)_.*?\\.(.*?)$")
mymatch

     [,1]                              [,2]      [,3]    
[1,] "HMSC-bm_in_ALL_CELLTYPES.distal" "HMSC-bm" "distal"

mymatch[, 2]
[1] "HMSC-bm"

mymatch[, 3]
[1] "distal"
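
For reference, the attempt in the question returns NA for two reasons: \\( and \\) match literal parentheses rather than opening a capture group, and [\\w+] matches only a single character. Also, str_extract returns the whole match, not the groups. A minimal sketch of a stricter variant follows, assuming the separator is always the literal text _in_ALL_CELLTYPES. and that the last part contains only word characters. The names pattern and parts are illustrative only.

```r
library(stringr)

mystring <- "HMSC-bm_in_ALL_CELLTYPES.distal"

# Unescaped parentheses define capture groups; the literal dot is escaped.
# Assumption: the first part is made of word characters and hyphens,
# the second part of word characters only.
pattern <- "^([\\w-]+)_in_ALL_CELLTYPES\\.(\\w+)$"

# str_match returns a matrix: column 1 is the full match,
# columns 2 and 3 are the two capture groups.
m <- str_match(mystring, pattern)

# Take row 1, columns 2 to 3 to get both pieces as one character vector.
parts <- m[1, 2:3]
parts
```

Here parts should hold the two values asked for in the question, HMSC-bm and distal. If the string does not fit the pattern, str_match returns NA in every column, so it is worth checking for NA before using the result.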

Upvotes: 3
