Reputation: 39
I am trying to extract the following data from a site using BeautifulSoup, but I am unable to locate it with soup, so I am looking for some guidance.
When I view the page source, it contains the following text:
var SYNCHRONIZER_TOKEN_NAME = 'SynchronizerToken';
var SYNCHRONIZER_TOKEN_VALUE = 'dd3a0c31e365c458d2d3e68e3c98f772bd2103eccf381';
The code I am using now is
SynchronizerToken = soup.find_all("VAR SYNCHRONIZER_TOKEN_VALUE")
Advice is appreciated, thanks again!
Upvotes: 1
Views: 246
Reputation: 5157
You can use the following regex pattern to find the wanted value:
SYNCHRONIZER_TOKEN_VALUE = \'(.*?)\'
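As a rough sketch of how that pattern might be applied to the raw page source with Python's `re` module: the helper name `extract_token` and the sample string are made up for illustration, and the `\s*` around the `=` is an added assumption to tolerate varying whitespace.

```python
import re

# Pattern from the answer above; the \s* parts are an assumption added here
# so that differing whitespace around '=' still matches.
TOKEN_RE = re.compile(r"SYNCHRONIZER_TOKEN_VALUE\s*=\s*'(.*?)'")


def extract_token(page_source):
    """Return the quoted token value, or None if the assignment is absent.

    `extract_token` is a hypothetical helper, not part of any library.
    """
    match = TOKEN_RE.search(page_source)
    return match.group(1) if match else None


# Made-up sample input standing in for the real page source.
sample = "var SYNCHRONIZER_TOKEN_VALUE = 'abc123';"
print(extract_token(sample))
print(extract_token("<html></html>"))
```

Returning `None` when nothing matches lets the caller decide how to handle a page that does not contain the token.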
Upvotes: 0
Reputation: 369334
Use a regular expression with a capturing group:
var SYNCHRONIZER_TOKEN_VALUE = '(.+?)'
You can then get the captured text using <MatchObject>.group(1):
import re
html = '''
var SYNCHRONIZER_TOKEN_NAME = 'SynchronizerToken';
var SYNCHRONIZER_TOKEN_VALUE = 'dd3a0c31e365c458d2d3e68e3c98f772bd2103eccf38163e10ce039c2b70a61a';
'''
token = None
matched = re.search(r"var SYNCHRONIZER_TOKEN_VALUE = '(.+?)'", html)
if matched:
    token = matched.group(1)
# token => 'dd3a0c31e365c458d2d3e68e3c98f772bd2103eccf38163e10ce039c2b70a61a'
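For context on why the original attempt finds nothing: `find_all` matches tag names, and the JavaScript assignment is plain text inside a `<script>` element rather than a tag. If you would rather stay within BeautifulSoup, one possible approach is to loop over the `<script>` tags and apply the same regex to each tag's text. This is only a sketch: the HTML below and the token value `example-token-value` are invented, and it assumes the assignment sits in an inline script rather than an external file.

```python
import re
from bs4 import BeautifulSoup

# Invented stand-in for the real page; the token value is a placeholder.
html = """
<html><head><script type="text/javascript">
var SYNCHRONIZER_TOKEN_NAME = 'SynchronizerToken';
var SYNCHRONIZER_TOKEN_VALUE = 'example-token-value';
</script></head><body></body></html>
"""

soup = BeautifulSoup(html, "html.parser")
pattern = re.compile(r"var SYNCHRONIZER_TOKEN_VALUE = '(.+?)'")

token = None
for script in soup.find_all("script"):
    # .string is None for scripts with no inline body (e.g. src="..." only)
    matched = pattern.search(script.string or "")
    if matched:
        token = matched.group(1)
        break

print(token)
```

Searching only inside `<script>` tags avoids accidental matches elsewhere in the page, though running the regex over the whole response text works just as well when the pattern is specific enough.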
Upvotes: 1