Reputation: 1097
I'm working with multiple expressions that look like C=>E, A+B+C=>D, A+B<=>C and (F|G)+H=>E. I am trying to use re.split() to split on => or <=>. I also want to split on the three operators +, | and ^ while not touching what's inside brackets.
As a first attempt, I tried this:
re.split(r"<=>|=>", "A+B+C=>D")
The problem is that it drops the delimiter, splitting a line like A+B+C=>D into
["A+B+C", "D"]
whereas I'm trying to achieve
["A+B+C", "=>", "D"]
There is also a problem with the operators. When I try to split (A+B)|C=>D like this:
re.split(r"\+|=>|<=>|\^|\|", "(A+B)|C=>D")
I get
["(A", "B)", "C", "D"]
whereas I'm trying to achieve
["(A+B)", "|", "C", "=>", "D"]
I'm not very good with regex, so I need help with a regular expression robust enough to do this in one go. If that's not possible with regex, I'd at least like a better way of doing it.
Upvotes: 1
Views: 1149
Reputation: 626929
You may use
re.findall(r'\([^()]*\)|<?=>|[-+/*|^]|\w+', s)
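As a quick check, here is a minimal sketch that runs this pattern over the expressions from the question. The names `TOKEN_RE` and `tokenize` are made up for illustration and are not part of `re`. Note that whitespace in the input would simply be skipped, since no branch of the pattern matches it.

```python
import re

# Pattern from the answer: parenthesized group, arrow, operator, or word.
TOKEN_RE = re.compile(r'\([^()]*\)|<?=>|[-+/*|^]|\w+')

def tokenize(s):
    """Return the tokens of s in order (hypothetical helper name)."""
    return TOKEN_RE.findall(s)

for expr in ["C=>E", "A+B+C=>D", "A+B<=>C", "(F|G)+H=>E", "(A+B)|C=>D"]:
    print(expr, "->", tokenize(expr))
```

With this approach every operator becomes its own token, so A+B+C=>D yields seven items rather than three, while a parenthesized group such as (A+B) stays in one piece.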
Details
\([^()]*\) - a parenthesized substring (with no nested parentheses)
| - or
<?=> - a <=> or =>
| - or
[-+/*|^] - one of the chars defined in the character class (to match any char that is neither a word char nor whitespace, you may replace it with [^\w\s])
| - or
\w+ - one or more word chars (you may make it more specific as needed: [A-Z]+ will match one or more uppercase letters, [a-zA-Z]+ will match one or more letters)
Upvotes: 1
Reputation: 81614
All you need is a capture group:
import re
# The parentheses form a capturing group, so re.split() keeps the delimiter.
print(re.split(r"(\^|=>)", "A+B+C=>D"))
# ['A+B+C', '=>', 'D']
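To cover both arrows from the question, the same capture-group idea might be written as in the sketch below. The helper name `split_on_arrows` is hypothetical, and the pattern is an assumption about what the question needs rather than part of the answer above.

```python
import re

def split_on_arrows(expr):
    """Split expr on '<=>' or '=>', keeping the arrow (hypothetical helper)."""
    # Because the alternation sits inside a capturing group, re.split()
    # returns each matched arrow as its own list item instead of dropping it.
    return re.split(r"(<=>|=>)", expr)

print(split_on_arrows("A+B+C=>D"))  # ['A+B+C', '=>', 'D']
print(split_on_arrows("A+B<=>C"))   # ['A+B', '<=>', 'C']
```

Operators on either side of the arrow, such as +, are left untouched by this split.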
Upvotes: 1