Reputation:
I want to extract the second-to-last word of a string. I know how to do it in Python, but I can't get it to work in R:
> string <- "this is a sentence"
> pattern <- "\b([\w]+)[\s]+([\w]+)[\W]*?$"
Error: '\w' is an unrecognized escape in character string starting "\b([\w"
> match <- regexec(pattern, string)
> words <- regmatches(string, match)
> words
[[1]]
character(0)
Upvotes: 2
Views: 5369
Reputation: 322
Python non-regex version:
spl = t.split(" ")
if len(spl) > 1:
    s = spl[-2]  # second-to-last word; requires at least two words
Upvotes: 0
Reputation: 49448
sub('.*?(\\w+)\\W+\\w+\\W*?$', '\\1', string)
#[1] "a"
This reads as follows: lazily skip any leading characters, then match the sequence of some word characters (captured), some non-word characters, some more word characters, optional trailing non-word characters, and the end of the string. The whole match is then replaced by the captured group, which is the first run of word characters in that sequence, i.e. the second-to-last word.
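The regexec approach from the question can probably be made to work along the same lines. The error there comes from R string literals, where every backslash in a regex has to be doubled. Below is a minimal sketch under that assumption. The pattern is a simplified version of the asker's (character-class brackets dropped), and the names m, groups and second_last are only illustrative.

```r
string <- "this is a sentence"

# Double every backslash: "\\w" in R source is the regex \w
pattern <- "\\b(\\w+)\\s+(\\w+)\\W*$"

m <- regexec(pattern, string)
groups <- regmatches(string, m)[[1]]

# groups[1] is the full match, groups[2] and groups[3] are the capture groups
second_last <- groups[2]
second_last
#[1] "a"
```

If the string has fewer than two words, regmatches returns character(0) for that element, so groups[2] would be NA.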
Upvotes: 6
Reputation: 331
Non-regex solution:
string <- "this is a sentence"
words <- strsplit(string, " ")[[1]]
words[length(words) - 1]
#[1] "a"
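Splitting on a single space gives empty strings when words are separated by several spaces, and indexing silently returns character(0) when there is only one word. A hedged sketch of a more defensive variant follows. The helper name second_last_word is hypothetical, and returning NA for short input is just one possible convention.

```r
# Hypothetical helper: second-to-last whitespace-separated word, or NA
second_last_word <- function(x) {
  # trimws avoids a leading empty string; "\\s+" treats runs of whitespace as one separator
  words <- strsplit(trimws(x), "\\s+")[[1]]
  if (length(words) < 2) {
    return(NA_character_)
  }
  words[length(words) - 1]
}

second_last_word("this is   a sentence")
#[1] "a"
second_last_word("word")
#[1] NA
```

Unlike the regex answer, this keeps any punctuation attached to the word, since it splits only on whitespace.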
Upvotes: 5