Sid

Reputation: 4055

Requests doesn't get the text from a web page?

I am trying to get the value of VIX from a webpage.

The code I am using:

import requests
from bs4 import BeautifulSoup

raw_page = requests.get("https://www.nseindia.com/live_market/dynaContent/live_watch/vix_home_page.htm").text
soup = BeautifulSoup(raw_page, "lxml")
vix = soup.find("span",{"id":"vixIdxData"})
print(vix.text)

This gives me:

' '

If I print vix itself, I get:

<span id="vixIdxData" style=" font-size: 1.8em;font-weight: bold;line-height: 20px;"></span>

On the site, the element contains text:

<span id="vixIdxData" style=" font-size: 1.8em;font-weight: bold;line-height: 20px;">15.785</span>

The 15.785 value is what I want to get by using requests.

Upvotes: 1

Views: 2502

Answers (2)

SergiyKolesnikov

Reputation: 7815

When you open the page in a web browser, the text (e.g., 15.785) is inserted into the span element by the getIndiaVixData.js script.

When you get the page using requests in Python, only the HTML code is retrieved and no JavaScript processing is done. So, the span element stays empty.

It is impossible to get that data by solely parsing the HTML code of the page using requests.
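The point can be illustrated offline with a minimal sketch. The HTML string below is a hypothetical stand-in for what the server sends (it is not the real page source), and `span_text` is a helper name made up for this example. It uses Python's built-in `html.parser` backend so that `lxml` is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the HTML the server sends: the span exists but is
# empty, and a script is expected to fill it in later inside a browser.
STATIC_HTML = """
<html><body>
  <span id="vixIdxData" style="font-weight: bold;"></span>
  <script src="getIndiaVixData.js"></script>
</body></html>
"""


def span_text(html: str, element_id: str) -> str:
    """Return the stripped text of the span with the given id, or '' if absent."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", {"id": element_id})
    return span.get_text(strip=True) if span is not None else ""


# BeautifulSoup only parses markup; it never runs the referenced script,
# so the span is still empty here.
print(repr(span_text(STATIC_HTML, "vixIdxData")))
```

Parsing the same markup after a browser has run the script would show the value, which is why the browser's view and the `requests` result differ.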

Upvotes: 0

Keyur Potdar

Reputation: 7238

The data you're looking for is not available in the page source, and requests.get(...) gets you only the page source, without the elements that are added dynamically through JavaScript. But you can still get it using the requests module.

In the Network tab, inside the developer tools, you can see a file named VixDetails.json. A request is being sent to https://www.nseindia.com/live_market/dynaContent/live_watch/VixDetails.json, which returns the data in the form of JSON.


You can access it using the .json() method of the response object that requests returns.

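import requests

# Request the JSON endpoint directly instead of the HTML page
r = requests.get('https://www.nseindia.com/live_market/dynaContent/live_watch/VixDetails.json')
data = r.json()  # decode the JSON response body
vix_price = data['currentVixSnapShot'][0]['CURRENT_PRICE']
print(vix_price)
# 15.7000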
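A slightly more defensive version of the same idea might look like the sketch below. The function names `extract_vix` and `fetch_vix`, the 10-second timeout, and the sample payload are assumptions for illustration. Only the key path `currentVixSnapShot[0].CURRENT_PRICE` comes from the answer above, and the real response may contain other fields or change over time.

```python
def extract_vix(payload: dict) -> float:
    """Pull the current VIX price out of a decoded VixDetails.json payload."""
    try:
        return float(payload["currentVixSnapShot"][0]["CURRENT_PRICE"])
    except (KeyError, IndexError, TypeError, ValueError) as exc:
        raise ValueError("unexpected VixDetails.json structure") from exc


def fetch_vix(url: str, timeout: float = 10.0) -> float:
    """Download the JSON document at url and return the VIX price."""
    import requests  # imported here so extract_vix works without the dependency

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return extract_vix(response.json())


# Hypothetical sample shaped like the key path used in the answer.
sample = {"currentVixSnapShot": [{"CURRENT_PRICE": "15.7000"}]}
print(extract_vix(sample))  # 15.7
```

Keeping the parsing step separate from the network call makes it possible to check the key path against a saved response without hitting the site. In real use, `fetch_vix` would be called with the VixDetails.json URL shown above.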

Upvotes: 1
