Reputation: 55
I have a boolean expression string, that I would like to take apart:
condition = "a and (b or (c and d))"
Or let's say:
I want to be able to access the string contents between two parenthesis.
I want following outcome:
"(b or (c and d))"
"(c and d)"
I've tried the following with regular expressions (not really working)
x = re.match(".*(\(.*\))", condition)
print x.group(1)
Question:
What is the nicest way to take a boolean expression string apart?
Upvotes: 4
Views: 203
Reputation: 11941
If your requirements are fairly simple, you don't really need a parser. Matching parentheses can easily be achieved using a stack.
You could do something like the following:
condition = "a and (b or (c and d))"
stack = []
for c in condition:
if c != ')':
stack.append(c)
else:
d = c
contents = []
while d != '(':
contents.insert(0, d)
d = stack.pop()
contents.insert(0, d)
s = ''.join(contents)
print(s)
stack.append(s)
produces:
(c and d)
(b or (c and d))
Upvotes: 2
Reputation: 7343
Build a parser:
Condition ::= Term Condition'
Condition' ::= epsilon | OR Term Condition'
Term ::= Factor Term'
Term' ::= epsilon | AND Factor Term'
Factor ::= [ NOT ] Primary
Primary ::= Literal | '(' Condition ')'
Literal ::= Id
Upvotes: 0
Reputation: 14553
Like everyone said, you need a parser.
If you don't want to install one, you can start from this simple top-down parser (take the last code sample here)
Remove everything not related to your need (+, -, *, /, is, lambda, if, else, ...). Just keep parenthesis, and
, or
.
You will get a binary tree structure generated from your expression.
The tokenizer use the build-in tokenize
(import tokenize
), which is a lexical scanner for Python source code but works just fine for simple cases like yours.
Upvotes: 2
Reputation: 599856
This is the sort of thing you can't do with a simple regex. You need to actually parse the text. pyparsing is apparently excellent for doing that.
Upvotes: 8