Reputation: 2443
I am using the following regex in a Java calculator program to tokenize input:
((?<=[(^+/*-])|(?=[(^+/*-]))
I was previously using this regex (note that the caret is moved to the end):
((?<=[(+/*-^])|(?=[(+/*-^]))
This one caused problems because multi-digit inputs were split into individual characters; e.g., "11" would split into "1", "1".
I know that the caret is a special character at the front of a character class, but why does it cause the regex to work improperly when placed at the end?
Upvotes: 2
Views: 33
Reputation: 95958
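In [(+/*-^], the sequence *-^ matches any character in the range from * to ^, and that's your problem. In ASCII that range runs from 0x2A to 0x5E, which includes the digits 0-9, so every digit is treated as a split point.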
But when you write [(^+/*-], it matches one of (, ^, +, /, *, or - (the hyphen is literal because it comes last in the class).
Clearer example:
[12a-z] will match 1, 2, or a character between a and z.
[12az-] matches 1, 2, a, z, or -.
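To see the difference on the tokenizer itself, here is a minimal sketch that runs both patterns from the question against "11+2". The class name, constant names, and the tokenize helper are made up for illustration, and the third pattern (with the hyphen escaped) is an assumed alternative that is not in the question. The outputs in the comments assume Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class CaretRangeDemo {
    // Hypothetical names; the first two patterns are copied from the question.
    static final String WORKING = "((?<=[(^+/*-])|(?=[(^+/*-]))";
    static final String BROKEN  = "((?<=[(+/*-^])|(?=[(+/*-^]))";
    // Assumed alternative: an escaped hyphen stays literal anywhere in the class.
    static final String ESCAPED = "((?<=[(+/*\\-^])|(?=[(+/*\\-^]))";

    // Splits the input at every position matched by the zero-width regex.
    static String[] tokenize(String regex, String input) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String input = "11+2";

        // Hyphen is last, so it is literal: prints [11, +, 2]
        System.out.println(Arrays.toString(tokenize(WORKING, input)));

        // *-^ is a range that contains the digits: prints [1, 1, +, 2]
        System.out.println(Arrays.toString(tokenize(BROKEN, input)));

        // Escaped hyphen, caret not first: prints [11, +, 2]
        System.out.println(Arrays.toString(tokenize(ESCAPED, input)));

        // The range on its own matches a digit: prints true
        System.out.println(Pattern.matches("[*-^]", "1"));
    }
}
```

If you would rather not depend on the hyphen's position, escaping it as in the third pattern is one way to keep the caret at the end of the class.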
Upvotes: 4