Reputation: 51
Could anybody explain why "aba12" shows up when I have specified {2}?
strings=c("Ab12","aba12","BA12","A 12b","B!","d", " ab")
grep("^[[:alpha:]]{2}", strings, value=TRUE)
Upvotes: 3
Views: 105
Reputation: 66819
You can use ...
grep("^[[:alpha:]]{2}[^[:alpha:]]", strings, value=TRUE)
# [1] "Ab12" "BA12"
[...] enumerates accepted characters and [^...] negates it. Further, from @Mako212:
^[[:alpha:]]{2} [...] tells the regex engine to match the beginning of the string, then exactly two ASCII A-Z/a-z characters. It asserts nothing about the remainder of the string. Regex will process the remainder of the string, but there are no remaining criteria to match.
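To see this in practice, here is a small sketch (my own illustration, not part of the quoted comment) that pulls out only the part of each string the original pattern actually consumed. The variable names `m` and `matched` are just placeholders I picked.

```r
strings <- c("Ab12", "aba12", "BA12", "A 12b", "B!", "d", " ab")

# regexpr() reports where the pattern matched in each element;
# regmatches() then extracts just the matched substrings.
m <- regexpr("^[[:alpha:]]{2}", strings)
matched <- regmatches(strings, m)
matched
# [1] "Ab" "ab" "BA"

# grep() only asks "did it match anywhere?", so "aba12" is kept:
# its first two letters satisfy the pattern and nothing constrains the third.
kept <- grep("^[[:alpha:]]{2}", strings, value = TRUE)
kept
# [1] "Ab12"  "aba12" "BA12"
```

So {2} limits how many characters the pattern consumes, not how many letters the string may start with.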
My answer above expects a non-alpha character following the initial two. From MrFlick's comment:
If you also want to match "AB", then use
grep("^[[:alpha:]]{2}([^[:alpha:]]|$)", strings, value=TRUE)
to match a non-alpha character or end of string.
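As a quick check of the pattern above, here is a sketch with "AB" appended to the vector (that extra element is my addition for illustration, it is not in the question's data). The `perl = TRUE` variant with a negative lookahead is an alternative I am suggesting, assuming PCRE is acceptable for your use case; it should give the same result here.

```r
# Question's data plus "AB" (added only for this illustration)
strings2 <- c("Ab12", "aba12", "BA12", "A 12b", "B!", "d", " ab", "AB")

# Two letters, then either a non-letter or the end of the string
res_alt <- grep("^[[:alpha:]]{2}([^[:alpha:]]|$)", strings2, value = TRUE)
res_alt
# [1] "Ab12" "BA12" "AB"

# Alternative (an assumption, not from the answer): a PCRE negative lookahead
# saying "the next character must not be a letter", without consuming it
res_look <- grep("^[[:alpha:]]{2}(?![[:alpha:]])", strings2,
                 value = TRUE, perl = TRUE)
res_look
# [1] "Ab12" "BA12" "AB"
```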
Upvotes: 3