wonderstruck80

Reputation: 357

Get JavaScript variable with Python

Banging my head here..

I am trying to parse an HTML source with a regex to get the entire contents of the JavaScript variable 'ListData', which starts with the declaration var ListData = and ends with };.

I found a solution which is similar:

Fetch data of variables inside script tag in Python or Content added from js

But I am unable to get my regex to match the entire object.

Code:

import re

# Need the ListData object
pat = re.compile('var ListData = (.*?);')

string = """QuickLaunchMenu == null) QuickLaunchMenu = $create(UI.AspMenu, 
null, null, null, $get('QuickLaunchMenu')); } ExecuteOrDelayUntilScriptLoaded(QuickLaunchMenu, 'Core.js');
var ListData = { "Row" : 
[{
"ID": "159",
"PermMask": "0x1b03cc312ef",
"FSObjType": "0",
"ContentType": "Item"
};
moretext;
moretext"""

# Prints <class 'NoneType'> instead of a match object
print(type(pat.search(string)))

Not sure what is going wrong here. Any help would be appreciated.

Upvotes: 1

Views: 1947

Answers (1)

Wiktor Stribiżew

Reputation: 626929

In your regex, the (.*?); part matches any 0+ chars other than line break chars up to the first ;. Since there is no ; on the same line after var ListData = in your input, there is no match.
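To see the effect in isolation, here is a minimal sketch using a shortened, made-up version of the input (the `text` value is an assumption for illustration, not the real page source). It compares the original pattern with and without re.S:

```python
import re

# Shortened stand-in for the page source (illustrative only)
text = 'var ListData = { "Row" :\n[{\n"ID": "159"\n};\nmoretext;'

# Without re.S, "." cannot cross the newline, and the first line has no ";"
no_flag = re.search(r'var ListData = (.*?);', text)
print(no_flag)

# With re.S, "." also matches line breaks, so the lazy group runs to the first ";"
with_flag = re.search(r'var ListData = (.*?);', text, re.S)
print(repr(with_flag.group(1)))
```

The first search finds nothing, while the second one captures everything up to the first ; in the text, wherever that ; happens to be.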

Based on the fact that your expected match ends with the first }; at the end of a line, you may use

'(?sm)var ListData = (.*?)};$'

Here,

  • (?sm) - enables re.S (it makes . match any char, including line breaks) and re.M (it makes $ match at the end of each line, not just at the end of the whole string, and makes ^ match at the start of each line)
  • var ListData = - matches this literal text
  • (.*?) - Group 1: any 0+ chars, as few as possible, up to the first...
  • };$ - }; at the end of a line
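As a rough sketch, this is how the pattern above behaves on a trimmed copy of the sample from the question (the `html` and `body` names are just illustrative). Note that Group 1 stops before the closing }, so the sketch appends it back. The sample is also truncated and not valid JSON as posted, so it is only printed, not parsed:

```python
import re

# Trimmed copy of the sample string from the question
html = """ExecuteOrDelayUntilScriptLoaded(QuickLaunchMenu, 'Core.js');
var ListData = { "Row" :
[{
"ID": "159",
"PermMask": "0x1b03cc312ef",
"FSObjType": "0",
"ContentType": "Item"
};
moretext;
moretext"""

# (?sm): "." matches line breaks, "$" matches at the end of every line
pat = re.compile(r'(?sm)var ListData = (.*?)};$')

match = pat.search(html)
if match:
    # Group 1 excludes the "}" consumed by the "};$" part, so add it back
    body = match.group(1) + "}"
    print(body)
else:
    print("ListData not found")
```

If the real page may have different spacing around =, replacing the literal spaces with \s* would be a reasonable adjustment.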

Upvotes: 3
