Reputation: 21
I have strings like these: "X96HE6.10nMBI_1_2", "X96HE6.10nMBI_2_2", "X96HE6.10nMBI_3_2". I would like to match only the numbers 1, 2 and 3 that sit between the underscores, without the underscores themselves. The best solution I could come up with is str_match(sample_names, "_+[1-3]?").
I would really appreciate the help.
Upvotes: 0
Views: 141
Reputation: 21400
The simplest method is to use sub with a backreference:
Data:
d <- c("X96HE6.10nMBI_1_2", "X96HE6.10nMBI_2_2", "X96HE6.10nMBI_3_2")
Solution:
sub(".*_(\\d)_.*", "\\1", d)
Here, (\\d) defines the capturing group for a single digit (if the number in question can be more than one digit, use \\d+), which is 'recalled' by the backreference \\1 in sub's replacement argument.
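As a small sketch of the multi-digit variant mentioned above: the vector d2 below is made up for illustration and is not from the question. It also shows that sub returns its input unchanged when the pattern does not match, which is worth checking for in real data.

```r
# Hypothetical inputs (not from the question): a two-digit field,
# and a string with no "_digits_" part at all
d2 <- c("X96HE6.10nMBI_12_2", "no_match_here")

# \\d+ captures one or more digits between the underscores
res <- sub(".*_(\\d+)_.*", "\\1", d2)

# res is c("12", "no_match_here"): the second element is returned
# untouched because the pattern did not match it
res
```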
Alternatively, use str_extract with positive lookarounds:
library(stringr)
str_extract(d, "(?<=_)\\d(?=_)")
(?<=_) is a positive lookbehind, which can be glossed as "If you see _ on the left..."
\\d is the digit to be matched
(?=_) is a positive lookahead, which can be glossed as "If you see _ on the right..."
Result:
[1] "1" "2" "3"
Upvotes: 2
Reputation: 43169
No need for any third-party package:
strings <- c("X96HE6.10nMBI_1_2", "X96HE6.10nMBI_2_2", "X96HE6.10nMBI_3_2")
pattern <- "(?<=_)(\\d+)(?=_)"
unlist(regmatches(strings, gregexpr(pattern, strings, perl = TRUE)))
Which yields:
[1] "1" "2" "3"
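One thing to keep in mind with this approach: gregexpr finds every match in each string, and unlist flattens the result, so the output only lines up with the input when each string has exactly one match. A minimal sketch with made-up inputs (s is hypothetical, not from the question) that keeps one result per string:

```r
# Hypothetical inputs with one, two and zero matches respectively
s <- c("a_1_2", "b_3_4_5", "nomatch")
pattern <- "(?<=_)(\\d+)(?=_)"

# gregexpr returns all matches per string; regmatches gives a list
all_matches <- regmatches(s, gregexpr(pattern, s, perl = TRUE))

# Take the first match per string, or NA where there is none,
# so the result stays aligned with the input vector
first_match <- vapply(
  all_matches,
  function(m) if (length(m) > 0) m[1] else NA_character_,
  character(1)
)
first_match
```

Because lookarounds do not consume the underscores, both "3" and "4" match in the second string; unlist(all_matches) would therefore have three elements but no longer correspond to s position by position.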
Upvotes: 1
Reputation: 269441
Using x from the Note at the end, read it in with read.table and pick off the second field. No packages or regular expressions are used.
read.table(text = x, sep = "_")[[2]]
## [1] 1 2 3
Note:
x <- c("X96HE6.10nMBI_1_2", "X96HE6.10nMBI_2_2", "X96HE6.10nMBI_3_2")
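Note that read.table converts the field to numbers, whereas the other answers return character strings. If strings are wanted, one possible alternative (a different technique from the one above, still without packages or regular expressions) is strsplit with fixed = TRUE:

```r
x <- c("X96HE6.10nMBI_1_2", "X96HE6.10nMBI_2_2", "X96HE6.10nMBI_3_2")

# read.table parses the second field as a number
nums <- read.table(text = x, sep = "_")[[2]]

# strsplit with fixed = TRUE splits on the literal "_" (no regex)
# and keeps the pieces as character strings; take the second piece
chars <- vapply(strsplit(x, "_", fixed = TRUE), `[`, character(1), 2)
```

Here nums holds the numeric values 1, 2, 3, while chars holds the strings "1", "2", "3".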
Upvotes: 1
Reputation: 4151
You can use lookarounds. I personally rely heavily on the stringr cheatsheet for this kind of regex, since the syntax is a bit hard to remember. See the RStudio cheatsheets page and look for stringr -> LOOK AROUNDS.
library(tidyverse)
codes <- c("X96HE6.10nMBI_1_2", "X96HE6.10nMBI_2_2", "X96HE6.10nMBI_3_2")
codes %>%
str_extract("(?<=_)[:digit:]+(?=_)")
#> [1] "1" "2" "3"
Created on 2020-06-14 by the reprex package (v0.3.0)
Upvotes: 1