Reputation: 690
My code parses some text, and I'm stuck because my regex sometimes captures two things instead of one due to wonky data.
temp <- "abc abcdef"
library(stringr)
str_extract_all(temp,"ab.+")
[[1]]
[1] "abc abcdef"
str_extract_all(temp,"ab.+")[[1]][2]
[1] NA
Above is a simplified example of what I am working with. When I rapply this function, I may get 1, 2, or 3 matches. The last match is the most important for my use case, but I am not sure how to reference it.
Upvotes: 0
Views: 1090
Reputation: 5958
You can index the last element of the match vector, for example:
m <- str_extract_all(temp, "ab.+")[[1]]  # character vector of matches for the single input string
m[length(m)]                             # last match
Upvotes: 1
Reputation: 47320
Not very elegant but it gets the job done:
. <- str_extract_all(temp,".*?(?=(ab)|$)")[[1]]
paste0("a",.[[length(.)-1]])
# [1] "abcdef"
Or maybe you wanted something like this, if your output can only be a single word?
. <- str_extract_all(temp,"\\bab.+?\\b")[[1]]
dplyr::last(.)
#[1] "abcdef"
Upvotes: 1
Reputation: 48211
As I understand it, you mean something like
txt <- "bag of flour"
str_extract_all(txt, "\\b[a-z]+\\b")
# [[1]]
# [1] "bag" "of" "flour"
and want to refer to "flour". In that case you may use
tail(str_extract_all(txt, "\\b[a-z]+\\b")[[1]], 1)
# [1] "flour"
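Since the question mentions applying this over several inputs, where each input may yield a different number of matches, here is a minimal sketch of a vectorized version. The helper name last_match and the sample vector are hypothetical, not from the original post, and returning NA for inputs with no match is an assumed convention.

```r
library(stringr)

# Hypothetical helper (not part of stringr): last regex match for each input string.
last_match <- function(x, pattern) {
  # str_extract_all() returns a list with one character vector per input string
  matches <- str_extract_all(x, pattern)
  vapply(
    matches,
    # Assumption: return NA when a string has no match at all
    function(m) if (length(m) > 0) m[length(m)] else NA_character_,
    character(1)
  )
}

txt <- c("bag of flour", "abc abcdef", "123")
last_match(txt, "\\b[a-z]+\\b")
# [1] "flour"  "abcdef" NA
```

The explicit length check matters because tail() on an empty match vector returns character(0) rather than NA, which would make vapply() fail.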
Upvotes: 2