maudulus

Reputation: 11035

Splitting a math expression string into tokens in Python

I have many Python strings such as "A7*4", "Z3+8", and "B6 / 11", and I want to split each one into a list of tokens, e.g. ["A7", "*", "4"], ["B6", "/", "11"]. I have tried several split methods, but I think I need to split only where there is a math operator (/, *, +, -) and also strip out the whitespace.

Currently I am using re.split(r'(\D)', "B6 / 11"), which returns ['', 'B', '6', ' ', '', '/', '', ' ', '11']. Instead, I want to get back ["B6", "/", "11"].

Upvotes: 7

Views: 6362

Answers (2)

Matthias

Reputation: 13222

There is a way to solve this without regular expressions by using the Python tokenizer. I used a more complex formula to show the capabilities of this solution.

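from io import StringIO
import tokenize

formula = "(A7*4) - (Z3+8) -  ( B6 / 11)"
# generate_tokens() expects a readline-style callable, so wrap the string in StringIO.
# The trailing NEWLINE and ENDMARKER tokens have empty strings and are filtered out.
print([token.string for token in tokenize.generate_tokens(StringIO(formula).readline) if token.string])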

Result:

['(', 'A7', '*', '4', ')', '-', '(', 'Z3', '+', '8', ')', '-', '(', 'B6', '/', '11', ')']
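If only names, numbers, and operators are wanted, the token type can be checked explicitly instead of filtering on empty strings. Here is a minimal sketch; the helper name tokenize_formula and the KEEP set are my own and not part of the tokenize module. Note that the input has to follow Python's lexical rules, so something like "011" would be rejected.

```python
from io import StringIO
import tokenize

# Token types to keep; everything else (NEWLINE, ENDMARKER, ...) is skipped.
KEEP = {tokenize.NAME, tokenize.NUMBER, tokenize.OP}

def tokenize_formula(formula):
    """Return the names, numbers, and operators found in a formula string."""
    tokens = tokenize.generate_tokens(StringIO(formula).readline)
    return [tok.string for tok in tokens if tok.type in KEEP]

print(tokenize_formula("B6 / 11"))  # ['B6', '/', '11']
```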

Upvotes: 7

user2555451

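You should split on the character set [-+*/] after removing the spaces from the string. Put the hyphen first (or escape it): written as [+-/*], the "+-/" part is read as a range from "+" to "/", which also matches "," and "." and would break values such as "3.5".

>>> import re
>>> def mysplit(mystr):
...     # The capturing group keeps the operators in the result list.
...     return re.split(r"([-+*/])", mystr.replace(" ", ""))
...
>>> mysplit("A7*4")
['A7', '*', '4']
>>> mysplit("Z3+8")
['Z3', '+', '8']
>>> mysplit("B6 / 11")
['B6', '/', '11']
>>>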
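An alternative to splitting is to match the tokens directly with re.findall. This skips whitespace without a separate replace step and avoids the empty string that re.split returns when the expression starts with an operator. The sketch below assumes operands consist only of ASCII letters and digits; TOKEN_RE and find_tokens are names I made up for illustration. Be aware that any character matching neither alternative is silently dropped.

```python
import re

# One alternative per token kind: a run of letters/digits, or a single operator.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+|[-+*/]")

def find_tokens(expr):
    """Return operands and operators in order, ignoring whitespace."""
    return TOKEN_RE.findall(expr)

print(find_tokens("B6 / 11"))  # ['B6', '/', '11']
print(find_tokens("-A7"))      # ['-', 'A7']
```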

Upvotes: 15
