Reputation: 197
I have some JavaScript code stored as a string. The target is a string like this:
productPage.loadProductData("138674", "initial", "1");
How can I extract '138674'?
I'm using this line:
from re import search as re_search, sub as re_sub, compile as re_compile
print re_search(r'productPage.loadProductData("?P<pid>\d+","?P<x>\w+","?P<n>\d+");', open_link).groupdict()["pid"]
Upvotes: 0
Views: 96
Reputation: 174756
In Python, (?P<name>regex) is called a named capturing group. You forgot the opening and closing parentheses around each named capturing group. You also need to escape ( and ) in your regular expression to match literal ( and ) symbols.
>>> import re
>>> s = 'productPage.loadProductData("138674","initial","1");'
>>> print re.search(r'productPage.loadProductData\("(?P<pid>\d+)","(?P<x>\w+)","(?P<n>\d+)"\);', s).group("pid")
138674
Or, using groupdict():
>>> print re.search(r'productPage.loadProductData\("(?P<pid>\d+)","(?P<x>\w+)","(?P<n>\d+)"\);', s).groupdict()["pid"]
138674
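Note that the target string in the question has a space after each comma, which the pattern above would not match as written. A minimal sketch that tolerates optional whitespace, assuming Python 3 (the helper name extract_pid is hypothetical and only for illustration):

```python
import re

# Named groups as in the answer above; \s* allows optional whitespace
# around the arguments, and the dot in the method call is escaped.
PATTERN = re.compile(
    r'productPage\.loadProductData\(\s*"(?P<pid>\d+)"\s*,'
    r'\s*"(?P<x>\w+)"\s*,\s*"(?P<n>\d+)"\s*\)'
)

def extract_pid(text):
    """Return the first argument of loadProductData, or None if absent."""
    match = PATTERN.search(text)
    return match.group("pid") if match else None

print(extract_pid('productPage.loadProductData("138674", "initial", "1");'))
```

Returning None when nothing matches avoids the AttributeError that calling .group() on a failed search would raise.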
Upvotes: 1
Reputation: 172
Why use a regular expression on code instead of a specialized library for parsing code, such as Esprima?
Esprima parses the code into a syntax tree that can be serialized as JSON, so you can extract the names of the functions, the arguments passed to them, and so on.
Upvotes: 0
Reputation: 474003
Aside from the regular expression-based approach, you can solve it with the slimit JavaScript parser:
from slimit.ast import String
from slimit.parser import Parser
from slimit.visitors import nodevisitor
data = 'productPage.loadProductData("138674","initial","1");'
parser = Parser()
tree = parser.parse(data)
print next(node.value for node in nodevisitor.visit(tree) if isinstance(node, String))
This would output the first String node found in the JavaScript code in the data variable.
Upvotes: 0