Reputation: 145
I have a character vector like:
l <- c("0_Mango_10a", "0_Orange_10b", "0_Apple_11")
I need to extract Mango_10a, Orange_10b and Apple_11.
My current code is:
stringr::str_extract(l, "(?<=_)[:alnum:]+")
But I get Mango, Orange and Apple.
Could anyone help me get the desired result?
Thanks in advance!
Upvotes: 0
Views: 244
Reputation: 388982
You can remove everything up to and including the first underscore.
Using sub in base R:
l <- c("0_Mango_10a" , "0_Orange_10b", "0_Apple_11")
sub('.*?_', '', l)
#[1] "Mango_10a" "Orange_10b" "Apple_11"
Or with stringr::str_remove:
stringr::str_remove(l, '.*?_')
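To see why the lazy quantifier matters here, the following base R sketch compares it with the greedy form. The variable names lazy, greedy and negated are only illustrative, and the negated character class is an alternative I am adding, not part of the answer above.

```r
l <- c("0_Mango_10a", "0_Orange_10b", "0_Apple_11")

# Lazy: '.*?' matches as little as possible, so the match ends at the FIRST underscore
lazy <- sub(".*?_", "", l)

# Greedy: '.*' matches as much as possible, so the match ends at the LAST underscore
greedy <- sub(".*_", "", l)

# Alternative without laziness: match non-underscore characters, then one underscore
negated <- sub("^[^_]*_", "", l)

print(lazy)     # "Mango_10a" "Orange_10b" "Apple_11"
print(greedy)   # "10a" "10b" "11"
print(negated)  # "Mango_10a" "Orange_10b" "Apple_11"
```

The negated class version behaves the same as the lazy one on this input and may be easier to read for some people.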
Upvotes: 0
Reputation: 101335
Here are two base R options:
> gsub("^\\d+_", "", l)
[1] "Mango_10a" "Orange_10b" "Apple_11"
> unlist(regmatches(l, gregexpr("(?<=_).*", l, perl = TRUE)))
[1] "Mango_10a" "Orange_10b" "Apple_11"
Upvotes: 0
Reputation: 887118
Just use trimws from base R, specifying the whitespace as one or more digits (\\d+) followed by an underscore (_):
trimws(l, whitespace = "\\d+_")
[1] "Mango_10a" "Orange_10b" "Apple_11"
With stringr, str_remove can be used:
stringr::str_remove(l, "^\\d+_")
[1] "Mango_10a" "Orange_10b" "Apple_11"
In str_extract, the pattern in the question matches only alphanumeric characters and not _, so the match stops at the second underscore. If we include _ in the character class, it works:
stringr::str_extract(l, "(?<=_)[[:alnum:]_]+")
[1] "Mango_10a" "Orange_10b" "Apple_11"
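The same difference can be reproduced without stringr. This is a minimal base R sketch; first_match is a hypothetical helper written for this illustration, and it assumes every element contains a match, because regmatches drops elements that do not.

```r
l <- c("0_Mango_10a", "0_Orange_10b", "0_Apple_11")

# Hypothetical helper: return the first match of a PCRE pattern in each string
first_match <- function(x, pattern) {
  regmatches(x, regexpr(pattern, x, perl = TRUE))
}

# Alphanumerics only: the match stops at the second underscore
without_underscore <- first_match(l, "(?<=_)[[:alnum:]]+")

# Underscore added to the class: the match runs to the end of the string
with_underscore <- first_match(l, "(?<=_)[[:alnum:]_]+")

print(without_underscore)  # "Mango" "Orange" "Apple"
print(with_underscore)     # "Mango_10a" "Orange_10b" "Apple_11"
```

Note that with perl = TRUE the POSIX class has to be written inside a bracket expression, as [[:alnum:]], unlike the shorter [:alnum:] form that stringr accepts.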
Upvotes: 3