Reputation: 3411
I need to implement a Python regex to capture two statements:
Case 2 is straightforward: r'[^\{\};]+'
For case 1, I'm not sure. Nested brackets are not allowed, but "hello world {it is} {me}" should be OK. The closest I have right now is r'.*?\{.*?\}.*?', but it matches "an {apple}" and not "an {apple} boy". How do I correct this?
These should be OK:
{a} {boy} lives here
{a} boy {lives} here
a {boy} lives here
a boy lives here
These are not OK:
{{a} boy lives here}
a boy {{lives}} here
a boy {lives} here {
a boy lives here }
Upvotes: 1
Views: 50
Reputation: 5055
Edit: as originally posed, the question looked like a balanced-parentheses problem, which is not solvable with regular expressions (see the explanation below). After the edit, it turned out to be about a single level of brackets only, i.e. without nesting. That is possible, as can be seen in anubhava's answer.
You can't.
Regular expressions (as defined in computer science) are unable to perform such a task because they essentially lack the "memory" required to do it.
Think of a regular expression as a state machine with a finite number of states. Each character you see in the input moves you from one state to another.
As soon as you provide a long enough string of open parentheses, i.e. one longer than the number of states, you must land in a state that you have already visited, losing count of how many open parentheses you have encountered.
The model of computation you are looking for here would be (at least) a context-free grammar. The machine model that works with such grammars is called a pushdown automaton (by analogy, for regular expressions it is a finite automaton).
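To make the "memory" point concrete, here is a minimal Python sketch (not from the original answer; the function name is_balanced is made up for illustration). It tracks nesting depth with an unbounded counter, which is the part a finite state machine cannot do. With a single bracket type, the counter stands in for the stack of a pushdown automaton.

```python
def is_balanced(text: str) -> bool:
    """Return True if every '{' in text is closed by a later '}'."""
    depth = 0  # unbounded counter: the "memory" a finite automaton lacks
    for ch in text:
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth < 0:
                # A '}' appeared with no open '{' to close.
                return False
    # Every opened bracket must have been closed by the end.
    return depth == 0
```

Note that this checks general balance, so it accepts nested input such as "a boy {{lives}} here". That is the harder problem this answer originally addressed, not the single-level case from the edited question.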
Or maybe you can?
Some regex flavours don't follow the computer-science definition and have additional features such as recursive expressions. Those are able to match the parentheses and check whether all of them close in the right order.
An example of that can be seen here.
Upvotes: 1
Reputation: 785266
If your brackets are not nested and not escaped, then you can use this regex to validate your input:
^(?:[^{}]*{[^}]*})*[^{}]*$
This regex also matches an empty string. If you want to avoid that, add a (?!$) negative lookahead to disallow an empty match:
^(?!$)(?:[^{}]*{[^}]*})*[^{}]*$
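As a quick check, the pattern can be run against the examples from the question. This is a sketch, not part of the original answer: the names PATTERN, STRICT and is_valid are invented here, and the braces are escaped only for readability. STRICT is a hypothetical tightening in which the inner class also excludes {, since the pattern as written lets a stray { through inside a bracket pair.

```python
import re

# The answer's pattern, with literal braces escaped for readability.
PATTERN = re.compile(r'^(?!$)(?:[^{}]*\{[^}]*\})*[^{}]*$')

# Hypothetical stricter variant: the text inside {...} may not contain '{'.
STRICT = re.compile(r'^(?!$)(?:[^{}]*\{[^{}]*\})*[^{}]*$')


def is_valid(text: str, pattern: re.Pattern = PATTERN) -> bool:
    """Return True if text matches the given validation pattern."""
    return pattern.match(text) is not None


ok = [
    "{a} {boy} lives here",
    "{a} boy {lives} here",
    "a {boy} lives here",
    "a boy lives here",
]
not_ok = [
    "{{a} boy lives here}",
    "a boy {{lives}} here",
    "a boy {lives} here {",
    "a boy lives here }",
]

for line in ok + not_ok:
    print(f"{line!r}: {is_valid(line)}")
```

All eight examples from the question come out as expected with either pattern. The two differ on input like "a {{b} c": the original pattern accepts it, because [^}]* can consume the second {, while the stricter variant rejects it.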
RegEx Details:
^ : Start
(?: : Start a non-capture group
[^{}]* : Match 0 or more of any character that is not { or }
{[^}]*} : Match a {...} substring
)* : End non-capture group. * lets this group repeat 0 or more times.
[^{}]* : Match 0 or more of any character that is not { or }
$ : End
Upvotes: 1