Reputation: 357
Banging my head here..
I am trying to parse the HTML source for the entire contents of the JavaScript variable 'ListData' with a regex. The value starts with the declaration var ListData = and ends with };.
I found a solution which is similar:
Fetch data of variables inside script tag in Python or Content added from js
But I am unable to get it to match the entire regex.
Code:
import re

# Need the ListData object
pat = re.compile('var ListData = (.*?);')
string = """QuickLaunchMenu == null) QuickLaunchMenu = $create(UI.AspMenu,
null, null, null, $get('QuickLaunchMenu')); } ExecuteOrDelayUntilScriptLoaded(QuickLaunchMenu, 'Core.js');
var ListData = { "Row" :
[{
"ID": "159",
"PermMask": "0x1b03cc312ef",
"FSObjType": "0",
"ContentType": "Item"
};
moretext;
moretext"""
#Returns NoneType instead of match object
print(type(pat.search(string)))
Not sure what is going wrong here. Any help would be appreciated.
Upvotes: 1
Views: 1947
Reputation: 626929
In your regex, the (.*?); part matches any 0+ chars other than line break chars up to the first semicolon. If there is no ; on the same line as the declaration, you will have no match.
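A minimal sketch of that behaviour follows. The two input strings are shortened, hypothetical stand-ins for the page source in the question and are not taken from the real page.

```python
import re

# Hypothetical, shortened stand-in for the multi-line page source.
sample = 'var ListData = { "Row" :\n[{\n"ID": "159"\n}]\n};\nmoretext;'

pat = re.compile(r'var ListData = (.*?);')

# The declaration line contains no ';' and '.' cannot cross the newline,
# so the search finds nothing.
no_match = pat.search(sample)
print(no_match)

# When the ';' sits on the same line, the same pattern matches.
one_line = 'var ListData = { "Row" : [] };'
m = pat.search(one_line)
print(m.group(1))
```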
Based on the fact that your expected match ends with the first }; at the end of a line, you may use
'(?sm)var ListData = (.*?)};$'
Here,
(?sm) - enables the re.S (it makes . match any char) and re.M (this makes $ match the end of a line, not just the end of the whole string, and makes ^ match start-of-line positions) modes
var ListData = - the literal declaration text
(.*?) - Group 1: any 0+ chars, as few as possible, up to the first...
};$ - }; at the end of a line
Upvotes: 3
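As a rough check, the suggested pattern can be run against the string posted in the question. This is only a sketch: the variable names match and body are chosen here for illustration. Note that Group 1 stops before the closing }, because that brace is consumed by the };$ part of the pattern.

```python
import re

# The string from the question, kept as posted.
string = """QuickLaunchMenu == null) QuickLaunchMenu = $create(UI.AspMenu,
null, null, null, $get('QuickLaunchMenu')); } ExecuteOrDelayUntilScriptLoaded(QuickLaunchMenu, 'Core.js');
var ListData = { "Row" :
[{
"ID": "159",
"PermMask": "0x1b03cc312ef",
"FSObjType": "0",
"ContentType": "Item"
};
moretext;
moretext"""

# (?s) lets '.' span line breaks; (?m) lets '$' match at the end of each line.
pat = re.compile(r'(?sm)var ListData = (.*?)};$')

match = pat.search(string)
if match:
    # Group 1 holds everything between the declaration and the closing '};'.
    body = match.group(1)
    print(body)
else:
    print("ListData declaration not found")
```

If the closing brace is needed as well, it can be appended to body, or the } can be moved inside the capturing group.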