Reputation: 151
I would like to extract the last set of digits from a string without chaining removals like this:
"sdkjfn45sdjk54()ad"
str_remove("sdkjfn45sdjk54()ad","[:alpha:]+$")
[1] "sdkjfn45sdjk54()"
str_remove(str_remove("sdkjfn45sdjk54()ad","[:alpha:]+$"), "\\(")
[1] "sdkjfn45sdjk54)"
str_remove(str_remove(str_remove("sdkjfn45sdjk54()ad","[:alpha:]+$"), "\\("), "\\)")
[1] "sdkjfn45sdjk54"
str_extract(str_remove(str_remove(str_remove("sdkjfn45sdjk54()ad","[:alpha:]+$"), "\\("), "\\)"), "\\d+$")
[1] "54"
I would like to avoid this because the patterns are uncertain. I am aware that stringi has a function for extracting the last match (stri_extract_last_regex), but I need to stick to base R or stringr.
Thanks!
Upvotes: 0
Views: 190
Reputation: 388982
You can use a regex with a negative lookahead.
string <- "sdkjfn45sdjk54()ad"
stringr::str_extract(string, '(\\d+)(?!.*\\d)')
#[1] "54"
Using the same regex in base R:
regmatches(string, gregexpr('(\\d+)(?!.*\\d)', string, perl = TRUE))[[1]]
This extracts the run of digits that is not followed by any other digit anywhere later in the string, which is the last set of digits.
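If you have several strings and some may contain no digits at all, the lookahead idea can be wrapped as below. This is only a sketch in base R: `last_digits` is a name made up for illustration, and returning NA for strings without digits is an assumption about what you want.

```r
# Hypothetical helper: last run of digits in each string, NA when there is none.
last_digits <- function(x) {
  # a run of digits that is not followed by any further digit
  m <- regexpr("\\d+(?!.*\\d)", x, perl = TRUE)
  hit <- !is.na(m) & m > -1L          # regexpr() returns -1 where nothing matched
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)        # regmatches() returns only the actual matches
  out
}

last_digits(c("sdkjfn45sdjk54()ad", "abc", "7up"))
# [1] "54" NA   "7"
```

Note that `.` does not match a newline by default, so multi-line strings would need extra care.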
Upvotes: 2
Reputation: 160447
Use str_extract_all and grab just the last one in each vector.
library(stringr)
quux <- str_extract_all(c("a", "sdkjfn45sdjk54()ad"), "[0-9]+")
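# index each element by its own length; elements with no match become NA
sapply(quux, function(x) if (length(x)) x[length(x)] else NA_character_)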
# [1] NA "54"
I use sapply because I'm guessing that you have more than one string. str_extract_all will return a list, where each element is zero or more strings extracted from the source. Since we're only interested in one of those, we can use sapply.
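Since base R is also acceptable to you, the same extract-everything-then-take-the-last idea might look like this without stringr. Treat it as a sketch: the variable names are mine, and I'm assuming NA is the desired result when nothing matches.

```r
x <- c("a", "sdkjfn45sdjk54()ad")

# gregexpr() finds every match; regmatches() then returns a list with one
# character vector per input string (character(0) where nothing matched)
all_runs <- regmatches(x, gregexpr("[0-9]+", x))

# keep the last element of each vector, or NA if the vector is empty
last_run <- vapply(
  all_runs,
  function(m) if (length(m)) m[length(m)] else NA_character_,
  character(1)
)
last_run
# [1] NA   "54"
```

vapply is used here instead of sapply only so that the result is guaranteed to be a character vector.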
One might be tempted to use sapply(., tail, 1), but if zero matches are found, that element will be character(0), not an empty string or NA, and sapply can then no longer simplify the result to a character vector. I'm inferring that NA would be a good return value when the pattern is not found.
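To make the tail pitfall concrete, here is a small sketch using a hand-built list in place of real str_extract_all output (the list contents and variable names are just for illustration):

```r
# stand-in for extraction results: no match for the first string, two for the second
runs <- list(character(0), c("45", "54"))

# tail() returns character(0) for the empty element, so the pieces have
# different lengths and sapply() falls back to returning a list
res <- sapply(runs, tail, 1)
is.list(res)
# [1] TRUE

# one possible workaround: prepend NA so tail() always has something to return
fixed <- vapply(runs, function(m) tail(c(NA_character_, m), 1), character(1))
fixed
# [1] NA   "54"
```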
Upvotes: 1