Reputation: 13
How do I separate something like this:
; Remove this line
(?A or :B
(G + D))
I want to remove the lines that start with ';' and split the rest into tokens on whitespace (dropping the whitespace itself), while also treating '(' and ')' as delimiters but keeping them as tokens, using regex in Python.
The end result should be something like:
['(', '?A', 'or', ':B', '(', 'G', '+', 'D', ')', ')']
But I can't eliminate the ';' line, nor can I split '(' and ')' out as their own tokens.
So far I have this:
re.split('[;.*]*[^()\[\]:?a-zA-Z0-9-]+', text)
Upvotes: 1
Views: 154
Reputation: 626738
You may use
import re
rx = r'^;.*|([()])|\s+'
s = """; Remove this line
(?A or :B
(G + D))"""
print(list(filter(None, re.split(rx, s, flags=re.M))))
# => ['(', '?A', 'or', ':B', '(', 'G', '+', 'D', ')', ')']
Details
^;.* - start of a line (flags=re.M makes ^ match at the start of each line, too), then ; and any 0 or more chars other than line break chars
| - or
([()]) - capturing group 1 (captured text is kept in the list that re.split returns): a ( or ) char
| - or
\s+ - 1+ whitespace chars (not captured, hence these matches are left out).
Upvotes: 1
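To make the role of the capturing group concrete, here is a minimal sketch that runs the same split twice, once with the parentheses around [()] and once without. The helper name tokenize is hypothetical and is introduced only for this illustration; the input and the first pattern are the ones from the answer.

```python
import re

text = """; Remove this line
(?A or :B
(G + D))"""

def tokenize(pattern, s):
    # re.split returns None for a capturing group that did not take part
    # in a match, and '' between adjacent delimiters; filter(None, ...)
    # drops both kinds of entries.
    return list(filter(None, re.split(pattern, s, flags=re.M)))

# With the capturing group: '(' and ')' are kept as tokens.
with_group = tokenize(r'^;.*|([()])|\s+', text)

# Same alternatives without the group: '(' and ')' are consumed as delimiters.
without_group = tokenize(r'^;.*|[()]|\s+', text)

print(with_group)
print(without_group)
```

The only difference between the two patterns is the pair of parentheses around [()], which is what makes re.split return the matched delimiter instead of discarding it.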