Reputation: 43
Suppose I have the following string:
"palavras a serem encontradas fazer-se encontrar-se, enganar-se"
How can I extract the words "fazer-se", "encontrar-se" and "enganar-se"?
I'm trying to use stringr like this:
library(stringr)
sentence <- "palavras a serem encontradas fazer-se encontrar-se, enganar-se"
str_extract_all(sentence, "se$")
I'd like this output:
[1] "fazer-se" "encontrar-se" "enganar-se"
Upvotes: 2
Views: 213
Reputation: 388907
In base R, we can use gregexpr and regmatches:
regmatches(sentence, gregexpr('\\w+-se', sentence))[[1]]
#[1] "fazer-se" "encontrar-se" "enganar-se"
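As a slightly stricter variant, here is a minimal sketch that wraps the same base R calls in a helper and adds a trailing \\b, so that a word such as "fazer-sempre" is not cut down to "fazer-se". The function name extract_se_words is hypothetical, and perl = TRUE is an assumption made only to get PCRE semantics for \\w and \\b.

```r
sentence <- "palavras a serem encontradas fazer-se encontrar-se, enganar-se"

# Hypothetical helper: return every word in x that ends in "-se".
# \\w+  one or more word characters (the verb stem)
# -se   the literal hyphen and suffix
# \\b   a word boundary, so "-se" must end the word
extract_se_words <- function(x) {
  regmatches(x, gregexpr("\\w+-se\\b", x, perl = TRUE))[[1]]
}

extract_se_words(sentence)
#[1] "fazer-se"     "encontrar-se" "enganar-se"

# Without the boundary, "fazer-sempre" would yield "fazer-se"; with it, nothing matches
extract_se_words("fazer-sempre")
#character(0)
```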
Upvotes: 0
Reputation: 887048
We can anchor on a word boundary (\\b) instead of the end of the string ($). With $ there is only one possible match, at the very end of the string, and se$ returns only the two characters "se". We also need the non-whitespace characters that come before the se substring, so use \\S+, i.e. one or more non-whitespace characters:
library(stringr)
str_extract_all(sentence, "\\S+se\\b")[[1]]
#[1] "fazer-se" "encontrar-se" "enganar-se"
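To make the difference between the patterns concrete, here is a small sketch in base R (so it runs without stringr). The helper name all_matches and the second test string are made up for illustration. It also shows one caveat: \\S+se\\b accepts any word ending in "se", so adding the hyphen to the pattern may be safer on other text.

```r
sentence <- "palavras a serem encontradas fazer-se encontrar-se, enganar-se"

# Hypothetical helper: all matches of a pattern in a single string
all_matches <- function(pattern, x) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

# $ anchors to the end of the string: one match, and only the two letters
all_matches("se$", sentence)
#[1] "se"

# \\b anchors to the end of each word; \\S+ pulls in the preceding characters
all_matches("\\S+se\\b", sentence)
#[1] "fazer-se"     "encontrar-se" "enganar-se"

# Caveat: any word ending in "se" matches, hyphen or not
all_matches("\\S+se\\b", "uma frase fazer-se")
#[1] "frase"    "fazer-se"

# Requiring the hyphen keeps only the "-se" forms
all_matches("\\S+-se\\b", "uma frase fazer-se")
#[1] "fazer-se"
```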
Upvotes: 2