Reputation: 199
I would like to detect one or more repeated characters within a class of characters, but not a combination of unique characters within the class. In the example below, we're looking for instances of p's, t's, or k's before an r. All three words satisfy the regular expression below, but I would like to exclude cases like bektri where we have two different consonants before r.
library(stringr)
example <- c("betri", "bettri", "bektri")
str_detect(example, "[ptk]r")
So betri and bettri are good, but bektri is bad. Any tips?
Upvotes: 2
Views: 715
Reputation: 999
You can use a negative lookbehind, (?<!...), to exclude matches in which the [ptk]r sequence is preceded by k.
library(stringr)
example <- c("betri", "bettri", "bektri")
str_detect(example, "(?<!k)[ptk]r")
[1] TRUE TRUE FALSE
Edit:
I misread your post: you need to exclude matches where two different consonants come before r.
In that case I would use the following regex: (?<![^aeuioy])([^aeuioy])\\1?r
It matches a single or doubled consonant before r (here a "consonant" is any character other than a, e, u, i, o, y), provided that no other consonant comes directly before it. This works whether the consonant is at the beginning of the word or in the middle of it.
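A quick check of that pattern against the original words, sketched with base R's grepl(perl = TRUE) instead of str_detect so it runs without extra packages. The words "betttri" and "tri" and the \\1* variant are assumptions added for illustration; they are not part of the answer above.

```r
example <- c("betri", "bettri", "bektri", "betttri", "tri")

# Pattern from the edit: a single or doubled non-vowel before r,
# not directly preceded by another non-vowel.
pat_double <- "(?<![^aeuioy])([^aeuioy])\\1?r"
res_double <- grepl(pat_double, example, perl = TRUE)
print(res_double)
# TRUE TRUE FALSE FALSE TRUE

# Assumed variant: \\1* instead of \\1? also accepts runs longer than two.
pat_run <- "(?<![^aeuioy])([^aeuioy])\\1*r"
res_run <- grepl(pat_run, example, perl = TRUE)
print(res_run)
# TRUE TRUE FALSE TRUE TRUE
```

With \\1? a triple like "betttri" is rejected, because the second and third t are each preceded by a consonant. If "one or more repeats" should cover longer runs, the \\1* form may be closer to what was asked.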
Upvotes: 1
Reputation: 3729
How about this?
library(stringr)
example <- c("betri", "bettri", "bektri")
str_detect(example, "([ptk])(\\1+)r|([^ptk])([ptk])r")
#> [1] TRUE TRUE FALSE
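([ptk])(\\1+)r matches a p, t, or k followed by one or more repetitions of that same letter and then an r;
(\\1+) matches one or more repetitions of the character captured by the preceding group, ([ptk]);
([^ptk])([ptk])r matches a single p, t, or k before an r when it is preceded by a character that is not p, t, or k.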
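To see which branch is responsible for each match, the two alternatives can be run separately. This is a small sketch using base R's grepl(perl = TRUE) as a stand-in for str_detect; the variable names and the extra word "tri" are made up for the illustration.

```r
example <- c("betri", "bettri", "bektri", "tri")

# Branch 1: the same p/t/k repeated (two or more in a row) before r.
repeated <- grepl("([ptk])(\\1+)r", example, perl = TRUE)

# Branch 2: a single p/t/k before r, preceded by a character outside [ptk].
single <- grepl("([^ptk])([ptk])r", example, perl = TRUE)

# Either branch matching corresponds to the combined pattern with |.
either <- repeated | single
print(data.frame(word = example, repeated, single, either))
```

Note that branch 2 consumes one character before the p, t, or k, so a word that starts with the cluster (such as "tri") is not matched by either branch.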
You could also generalize to include any consonant that follows that pattern:
library(stringr)
example <- c("betri", "bettri", "bektri", "aepro", "aepo", "aeppro")
str_detect(example, "([b-df-hj-np-tv-z])(\\1+)r|([^b-df-hj-np-tv-z])([b-df-hj-np-tv-z])r")
#> [1] TRUE TRUE FALSE TRUE FALSE TRUE
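Since the same class is written out three times, it may be easier to build the pattern from one string. The helper below is hypothetical (build_pattern and its arguments are names invented here), and it uses base R's grepl(perl = TRUE) rather than str_detect so that it runs without extra packages.

```r
# Hypothetical helper: build the two-branch pattern from the body of a
# bracket class (e.g. "ptk") and the character that must follow it.
build_pattern <- function(class_body, follower = "r") {
  sprintf("([%s])(\\1+)%s|([^%s])([%s])%s",
          class_body, follower, class_body, class_body, follower)
}

consonants <- "b-df-hj-np-tv-z"
example <- c("betri", "bettri", "bektri", "aepro", "aepo", "aeppro")

res <- grepl(build_pattern(consonants), example, perl = TRUE)
print(res)
# TRUE TRUE FALSE TRUE FALSE TRUE
```

Calling build_pattern("ptk") gives back the original [ptk] pattern, so the narrower and the generalized versions stay in sync.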
Upvotes: 1