Reputation: 245
I understand how to match a single String against multiple regex patterns using the pipe symbol as explained in some of the answers to this question: Match a string against multiple regex patterns
My question is about what happens when I have the following String:
this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo
And I use the following regex to extract text:
[A-Z][a-z]*|(?<=_)[A-Za-z-]*
I am expecting to get the following matches:
is
An
Example
What
Is
An
Example
Too
But what I actually get is:
isAnExample
What
Is
An
Example
Too
Basically, the engine joins is, An and Example into a single match because the whole run satisfies the underscore pattern, but I want them treated as separate words (non-greedy?) because the other pattern would match An and Example on their own.
Upvotes: 0
Views: 1120
Reputation: 730
You probably meant the regex to be
[A-Z][a-z]*|(?<=_)[a-z-]*
The first alternative matches a capitalized word (an uppercase letter followed by lowercase letters); the second matches a lowercase word preceded by an underscore.
The part of your posted regex (?<=_)[A-Za-z-]*
matches both lowercase and uppercase letters after the underscore, i.e. it does not stop when it meets an uppercase letter, which should in fact be the start of another word.
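To make the difference concrete, here is a minimal sketch that runs both patterns over the string from the question. It assumes Python's re module purely for illustration (the question does not say which engine is used); the variable names are mine.

```python
import re

text = "this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo"

# Pattern from the question: the lookbehind branch also accepts uppercase
# letters, so it consumes "isAnExample" as one match.
original = re.compile(r"[A-Z][a-z]*|(?<=_)[A-Za-z-]*")

# Corrected pattern: the lookbehind branch stops at the first uppercase
# letter, leaving "An" and "Example" to the first branch.
corrected = re.compile(r"[A-Z][a-z]*|(?<=_)[a-z-]*")

original_matches = original.findall(text)
corrected_matches = corrected.findall(text)

print(original_matches)
# ['isAnExample', 'What', 'Is', 'An', 'Example', 'Too']
print(corrected_matches)
# ['is', 'An', 'Example', 'What', 'Is', 'An', 'Example', 'Too']
```

Note that making the quantifier lazy would not help here: the engine tries alternatives at each position from left to right, and once the underscore branch has started a match, the other branch is not consulted until that match ends.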
Upvotes: 2
Reputation: 784878
You can use this alternation regex to capture either lowercase text that is preceded by _
or a capitalized word (an uppercase letter followed by lowercase letters):
((?<=_)[a-z][a-z-]*|[A-Z][a-z]*)
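A quick sketch of this pattern in action, assuming Python's re module for illustration only. The second input, "x_9", is a made-up edge case that is not in the question: because this pattern requires at least one lowercase letter after the underscore, it cannot produce an empty match there, whereas a branch written as (?<=_)[a-z-]* can.

```python
import re

text = "this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo"

# Pattern from this answer: the underscore branch needs at least one
# lowercase letter, so it never matches the empty string.
pattern = re.compile(r"((?<=_)[a-z][a-z-]*|[A-Z][a-z]*)")

# finditer + group(0) returns the full match regardless of capture groups.
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)
# ['is', 'An', 'Example', 'What', 'Is', 'An', 'Example', 'Too']

# Hypothetical edge case: an underscore followed by a digit.
loose = re.compile(r"[A-Z][a-z]*|(?<=_)[a-z-]*")
loose_edge = [m.group(0) for m in loose.finditer("x_9")]
strict_edge = [m.group(0) for m in pattern.finditer("x_9")]
print(loose_edge)   # [''] : the starred branch matches emptily after "_"
print(strict_edge)  # []
```

For the string in the question both variants give the same list, so the extra [a-z] only matters if the input can contain an underscore that is not followed by a lowercase letter or hyphen.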
Upvotes: 0