Миша Попов
Миша Попов

Reputation: 137

Build regular expression

I need to build a regular expression for strings in which the +/- characters cannot stand side by side (must be separated by some other). I got this option: (a*(+|-)a)* , where a is any character, * is the Kleene closure, () is for clarity. But this expression does not recognize lines of the form: "+", "-","+ a-" etc. Maybe someone will be able to move me from the dead point. I need regularity to build a finite automaton.

Upvotes: 2

Views: 356

Answers (1)

Nikolay Handzhiyski
Nikolay Handzhiyski

Reputation: 1579

That might do:

^(\+(?![\+-])|-(?![\+-])|[^\+-])*?$

It bounds everything to the single line: ^ to $. The lazy quantifier ()*? ensures that no more then one line is going to be recognized.

The 3 concatenations inside the parentheses are as follows:

  • \+(?![\+-]) if the character is + the next one must not be + or -
  • -(?![\+-]) if the character is - the next one must not be + or -
  • the previous two have the second character only looked-ahead, and could be combined into one concatenation: [\+-](?![\+-]).
  • [^\+-] any character that is not + and -

However, you must know that a regex is more powerful than a regular expression. You need a regular grammar more than a regular expression:

S = +T
S = -T
S = @S
S = ε

T = @S
T = ε

This grammar is right-regular, and @ is any character that is not + nor -. ε is epsilon = nothing.

Here is the deterministic finite automaton (where P=+ M=- @=not + and not -):

  /---- @ ----\
  |           |
  |           v
(S = start/final)--- P,M -->(T = final)---- P,M --->(error)
      ^                          |         
      |                          |         
      \----------- @ ------------/         

Upvotes: 3

Related Questions