Ian
Ian

Reputation: 34549

Regex - what does the infinite error mean?

I'm sitting in on some technical interviews at the moment and we ask a question about checking that curly brackets are balanced (same number of opens/closes and that closes never proceed their matching open) within a string - asking people to write a small function to verify this.

A few candidates have considered trying to use Regex to solve this, then given up pretty quickly. I decided I wanted to give it a try, to see if it was possible. I'm currently using the following test strings:

Pass

{(function(r){ return r; })()} {}{}{}{} {{{{}}}}

Fail

}{ {{}}} {{{}}

I thought the following regex would work [^{}]*({[^{}]*})*[^{}]*. The idea was to match non-curly bracket characters , then match { then non-curly bracket characters, then }, repeating the bracketed match, and then finish with any non-curly bracket characters.

I seem to be getting an infinite error when using regexr.com though and I don't understand why:

enter image description here

Can anyone explain what is causing this exactly?

Upvotes: 3

Views: 489

Answers (2)

Eiji
Eiji

Reputation: 440

Update

This is regex you need (one level):

\{{0,1}[^\{\}]*\{[^\{\}]*\}[^\{\}]*\}{0,1}

I cannot write a nested rule, but you can say how many levels you need.

Explanation:

| - means alternative (a|b - matches "a" or "b" literally)
^ - means start of line (^a - matches any string that begin with "a" literally, "a taxi", "apple" ...)
$ - means end of line (a$ - matches any string that end with "a" literally, "umbrella")
\{ - matches character: "{" literally
\( - matches character: "(" literally

Upvotes: 0

Kevin Burdett
Kevin Burdett

Reputation: 2982

You are getting an infinite error because your regex can match any text. Since all of your groups are marked with *, they are all considered optional (* matches zero or more occurrences). This means that the engine can find zero occurrences of any group in your pattern and still consider the text a match.

Consider marking at least one group with +, which means "one or more" rather than "zero or more". Try this pattern:

[^{}]*(\{[^{}]*\})+[^{}]*

This way, the engine has some kind of restriction that it must match for your pattern to be accepted.

NOTE: it is also wise to escape { and } when not in a character block ([]). I have done this in the pattern above. Regexr.com doesn't seem to care, but some engines might produce a parse error without them.

Upvotes: 2

Related Questions