ThisIsNoZaku

Reputation: 2443

What causes these two regexes to work differently?

I am using the following regex in a Java calculator program to tokenize input:

((?<=[(^+/*-])|(?=[(^+/*-]))

I was previously using this regex (note that the caret is moved to the end):

((?<=[(+/*-^])|(?=[(+/*-^]))

This one caused problems because multi-digit inputs were cut up into individual characters. For example, "11" would split into "1", "1".
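A minimal reproduction of the difference might look like the sketch below. It assumes the tokenizer is a plain String.split call on the regex (the original program is not shown), and the class name, constant names, and tokenize helper are hypothetical. On Java 8 or later, a zero-width match at the start of the input does not produce a leading empty token.

```java
import java.util.Arrays;

public class SplitRepro {
    // Caret placed right after the opening parenthesis (the working version).
    static final String CARET_EARLY = "((?<=[(^+/*-])|(?=[(^+/*-]))";
    // Caret placed at the end of the class (the problematic version).
    static final String CARET_LAST = "((?<=[(+/*-^])|(?=[(+/*-^]))";

    // Assumption: tokenizing is done with String.split on the regex.
    static String[] tokenize(String regex, String input) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // The working regex keeps "11" together.
        System.out.println(Arrays.toString(tokenize(CARET_EARLY, "11+2")));
        // The problematic regex splits between the two digits as well.
        System.out.println(Arrays.toString(tokenize(CARET_LAST, "11+2")));
    }
}
```

Running both patterns over the same input makes it easy to see that only the position of the caret relative to the hyphen changed.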

I know that the caret is a special character at the front of a character class, but why does it cause the regex to work improperly when placed at the end?

Upvotes: 2

Views: 33

Answers (1)

Maroun

Reputation: 95958

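In [(+/*-^], the *-^ part is a range that matches every character from * (U+002A) to ^ (U+005E). That range includes the digits 0-9, so the regex also splits before and after every digit. That's your problem.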

But when you write [(^+/*-], the - is the last character in the class, so it is a literal, and the class matches one of (, ^, +, /, * or -.

Clearer example:

  • [12a-z] will match 1, 2 or a character between a and z

  • [12az-] matches 1, 2, a, z or -
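To check which characters a class really accepts, a small sketch like the following can help. The class name, constant names, and inClass helper are hypothetical. It also shows one possible fix that does not depend on ordering: escaping the hyphen so it is always treated as a literal.

```java
import java.util.regex.Pattern;

public class RangeCheck {
    // Unescaped hyphen between '*' and '^' forms a range (U+002A to U+005E).
    static final Pattern RANGE = Pattern.compile("[(+/*-^]");
    // Escaping the hyphen keeps it literal wherever it sits in the class.
    static final Pattern ESCAPED = Pattern.compile("[(+/*\\-^]");

    // True if the single character c is accepted by the character class p.
    static boolean inClass(Pattern p, char c) {
        return p.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        for (char c : new char[] {'1', '9', 'A', '-', '^', 'a'}) {
            System.out.println(c + " range=" + inClass(RANGE, c)
                    + " escaped=" + inClass(ESCAPED, c));
        }
    }
}
```

With the unescaped hyphen, digits and uppercase letters fall inside the range; with the escaped hyphen, only the listed operator characters match.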

Upvotes: 4
