Reputation: 372
I need to extract from a text all the words which match these two requirements:
So, Word and WorD are correct captures, but word and WORD aren't.
So, I can capture all the words using the regex \b([a-zA-Z]+)\b, but I don't know how to add the uppercase letters condition here.
As for requirement #1, I tried to use a positive lookahead like this:
\b(?=.*[A-Z]+)([a-zA-Z]+)\b
but now it captures all the words from a line if that line has at least one uppercase letter.
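To make the problem reproducible, here is a minimal sketch, assuming Python's re module (the question does not name a language) and a made-up sample line. The .* inside the lookahead is not confined to the current word, so any uppercase letter later on the line satisfies it:

```python
import re

# The attempted pattern: `.*` in the lookahead can scan past the current word.
attempt = re.compile(r"\b(?=.*[A-Z]+)([a-zA-Z]+)\b")

# Hypothetical sample line, chosen so the last word holds an uppercase letter.
line = "word WORD Word"

# Every word is captured, because each one is followed (or started)
# by an uppercase letter somewhere to its right on the line.
captured = attempt.findall(line)
print(captured)  # ['word', 'WORD', 'Word']
```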
Is it even possible to apply additional conditions to a capturing group? I can process this in my application's code but I'd really prefer to fit all those requirements in a single Regex.
Upvotes: 1
Views: 451
Reputation: 626690
You may use
\b(?=[A-Z]*[a-z])(?=[a-z]*[A-Z])([a-zA-Z]+)\b
See the regex demo
Actually, you do not even need the capturing group: ([a-zA-Z]+) can usually be replaced with [a-zA-Z]+, but it depends on where you are using the regex.
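As an illustration, here is a small sketch that assumes Python's re module and a made-up sample string. It compares the pattern with and without the capturing group. In this setting both give the same words, since the whole match (group 0) already is the word:

```python
import re

# Pattern from the answer, with the capturing group.
with_group = re.compile(r"\b(?=[A-Z]*[a-z])(?=[a-z]*[A-Z])([a-zA-Z]+)\b")
# Same pattern with the group removed.
without_group = re.compile(r"\b(?=[A-Z]*[a-z])(?=[a-z]*[A-Z])[a-zA-Z]+\b")

# Hypothetical sample text built from the words in the question.
text = "Word WorD word WORD"

# Group 1 of the first pattern vs. the whole match of the second.
group_hits = [m.group(1) for m in with_group.finditer(text)]
whole_hits = [m.group(0) for m in without_group.finditer(text)]

print(group_hits)  # ['Word', 'WorD']
print(whole_hits)  # ['Word', 'WorD']
```

The group still matters where a tool or API hands back only captured groups rather than the whole match.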
Details
\b - a word boundary
(?=[A-Z]*[a-z]) - a positive lookahead that requires a lowercase letter after 0+ uppercase ones
(?=[a-z]*[A-Z]) - a positive lookahead that requires an uppercase letter after 0+ lowercase ones
([a-zA-Z]+) - Group 1: 1 or more letters
\b - a word boundary.
Upvotes: 1
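The two lookaheads can also be checked one at a time. The following is a rough sketch, assuming Python's re module; the names has_lower, has_upper and full are made up for this illustration, and the words are the ones from the question:

```python
import re

# Each lookahead on its own, anchored at the start of the word by match().
has_lower = re.compile(r"(?=[A-Z]*[a-z])")  # a lowercase letter after 0+ uppercase ones
has_upper = re.compile(r"(?=[a-z]*[A-Z])")  # an uppercase letter after 0+ lowercase ones
# The complete pattern from the answer.
full = re.compile(r"\b(?=[A-Z]*[a-z])(?=[a-z]*[A-Z])([a-zA-Z]+)\b")

results = {}
for word in ["Word", "WorD", "word", "WORD"]:
    results[word] = (
        has_lower.match(word) is not None,  # first lookahead passes?
        has_upper.match(word) is not None,  # second lookahead passes?
        full.fullmatch(word) is not None,   # whole pattern matches the word?
    )
    print(word, results[word])
```

Only Word and WorD pass both lookaheads; word fails the uppercase check and WORD fails the lowercase check, so the full pattern rejects them.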