Reputation: 22254
I'd like to match the lowercase version of an uppercase character captured in a backreference. For example, say I want to match a string whose 1st character is any uppercase letter and whose 4th character is the same letter in lowercase. If I use grep with this regex:
grep -E "([A-Z])[a-z]{2}\1[a-z]"
it matches "EssEx" and "SusSe", for instance. I'd like to match "Essex" and "Susse" instead. Is it possible to modify the above regular expression to achieve this?
Upvotes: 4
Views: 312
Reputation: 8413
This is one of the cases where inline modifiers come in handy. Here is a solution that uses a case-sensitive negative lookahead to check that the 4th character is not exactly the same (uppercase) character, followed by a case-insensitive backreference to match the corresponding lowercase letter:
([A-Z])[a-z]{2}(?-i)(?!\1)(?i)\1[a-z]
Note that the (?-i) most likely isn't needed, but it's there for clarity. Inline modifiers are not supported by all regex flavours. PCRE supports them, so you will have to use -P with grep.
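For anyone who wants to try the idea outside grep, here is a minimal sketch in Python. It is not the exact pattern above: recent versions of Python's re module reject a bare (?i) in the middle of a pattern, so this variant assumes the scoped form (?i:\1) instead, which makes only the backreference case-insensitive. The function name has_lowercase_echo is made up for illustration.

```python
import re

# Variant of the pattern above using a scoped inline flag:
#   (?!\1)   case-sensitive lookahead: the next char must NOT be the captured uppercase letter
#   (?i:\1)  case-insensitive backreference: matches the captured letter in either case
# Together they only accept the lowercase form of the captured letter.
PATTERN = re.compile(r"([A-Z])[a-z]{2}(?!\1)(?i:\1)[a-z]")


def has_lowercase_echo(text: str) -> bool:
    """Return True if text contains an uppercase letter whose lowercase
    form appears three characters later (hypothetical helper name)."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    for word in ["Essex", "Susse", "EssEx", "SusSe"]:
        print(word, has_lowercase_echo(word))
```

One difference from the unscoped version: with (?i) left switched on, the final [a-z] would also accept an uppercase letter, whereas the scoped group keeps the rest of the pattern case-sensitive.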
Upvotes: 2