Reputation: 1231
I want to parse some PHP code. I've written a regex which should split PHP code into atoms ( https://regex101.com/r/P074q8/1 ), but when I run it, Python does not split the source code the way the regex101 website does.
Why does my regex work on regex101.com but not in the actual Python script?
main.py
import re


class PHPParser:
    def __init__(self, filename):
        # read php file
        with open(filename, 'r') as f:
            self._source = f.read()

        syntax = [
            r'/\*.*?\*/',
            r'".*?"',
            r'\'.*?\'',
            r'\$[\w\d_]+',  # variable name
            r'\w+',  # function name
            r'return',
            r'<\?php',
            r'=>',
            r'\?>',
            r'\[',
            r'\]',
            r',',
            r';',
            r'\(',
            r'\)',
            r'\.',
            r'\n',
            r'\s',
            r'=',
            r'\W',
        ]
        s = r'(' + r'|'.join(syntax) + r')'
        print(s)
        tokens = re.split(s, self._source, re.DOTALL | re.M | re.I | re.UNICODE)
        print(tokens)


if __name__ == '__main__':
    p = PHPParser('./vendor/yiisoft/yii2/base/Widget.php')
Upvotes: 0
Views: 627
Reputation: 1236
You can try this:

    tokens = re.findall(s, self._source, re.DOTALL | re.M | re.I | re.UNICODE)

Here I simply replaced the split() function with findall(). On regex101.com you were looking at the strings that the regex matches, but in your Python script you were splitting the source by those matches instead.
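To make the difference concrete, here is a minimal sketch comparing the two calls. The `source` string and the shortened `pattern` are hypothetical stand-ins for the real PHP file and the full regex from the question, chosen only so the result is easy to inspect by hand.

```python
import re

# Hypothetical stand-in for the PHP file (assumption, not the real Widget.php).
source = "<?php /* a\nb */ $x = 'hi'; ?>"

# Abbreviated version of the pattern built in the question.
pattern = r"(/\*.*?\*/|'.*?'|\$\w+|<\?php|\?>|;|=|\s|\w+|\W)"
flags = re.DOTALL | re.M | re.I | re.UNICODE

# findall returns every match in order, which is what regex101 lists.
tokens = re.findall(pattern, source, flags=flags)

# split with a capturing group also returns the matches, but interleaved
# with the (here empty) text found between them.
pieces = re.split(pattern, source, flags=flags)

# The third positional parameter of re.split is maxsplit, not flags.
# Writing re.split(pattern, source, flags) is therefore equivalent to this:
no_flags = re.split(pattern, source, maxsplit=int(flags))

print(tokens)
print(pieces)
print(no_flags)
```

Note that in `re.findall` the third positional parameter is `flags`, while in `re.split` it is `maxsplit`. So in the original `re.split(...)` call the flag value is taken as a split limit and `re.DOTALL` is never applied, which is why the multi-line comment no longer comes out as a single token there. Passing `flags=` by keyword avoids the mix-up in both functions.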
Upvotes: 1