Mohammad Reza

Reputation: 21

Parse page with BeautifulSoup

I'm trying to parse this webpage and extract some of its information:

http://www.tsetmc.com/Loader.aspx?ParTree=151311&i=778253364357513

import requests
from bs4 import BeautifulSoup

# Download the static HTML of the page
page = requests.get("http://www.tsetmc.com/Loader.aspx?ParTree=151311&i=778253364357513")

# Parse it and look up the main container by its id
soup = BeautifulSoup(page.content, 'html.parser')
all_information = soup.find(id="MainContent")

print(all_information)

It seems all the information inside the tags is hidden. When I run the code, this is what is returned:

<div class="tabcontent content" id="MainContent">
<div id="TopBox"></div>
<div id="ThemePlace" style="text-align:center">
<div class="box1 olive tbl z2_4 h250" id="Section_relco" style="display:none"></div>
<div class="box1 silver tbl z2_4 h250" id="Section_history" style="display:none"></div>
<div class="box1 silver tbl z2_4 h250" id="Section_tcsconfirmedorders" style="display:none"></div>
</div>
</div>

Why is the information not there, and how can I find and/or access it?

Upvotes: 1

Views: 158

Answers (1)

Rusty Robot

Reputation: 1845

The information I assume you are looking for is not in the response to your request. The webpage makes additional requests after it has initially loaded, and the content is filled in from those. There are a few ways you can get that information.

You can try Selenium. It is a Python package that drives a real web browser, so the page can run its scripts and load all the information before you try to scrape it.
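A minimal sketch of the Selenium route, in case it helps. It assumes Selenium 4 and a local Chrome install. The helper names `fetch_rendered_html` and `extract_main_content` are made up for this example, and the wait condition (any child element appearing inside `TopBox`) is only a guess at how to tell that the page has finished filling itself in; adjust it to whatever element you actually need.

```python
from bs4 import BeautifulSoup


def extract_main_content(html):
    """Return the element with id="MainContent" from an HTML string, or None."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find(id="MainContent")


def fetch_rendered_html(url, timeout=15):
    """Open the page in a browser, wait for scripts to add content, return the HTML."""
    # Imported lazily so the parsing helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome is installed locally
    try:
        driver.get(url)
        # Assumption: the page's scripts eventually insert elements into #TopBox,
        # which is empty in the static HTML shown in the question.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "#TopBox *"))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

You would then call `extract_main_content(fetch_rendered_html(url))` with the page URL from the question and continue with BeautifulSoup as before.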

Another way is to reverse engineer the website and find out where it is getting the information you need.

Have a look at this link: http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=778253364357513&c=57+

Your page requests it every few seconds, and it appears to contain all the pricing information you are looking for. It may be easier to request that URL directly to get your information.
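As a rough sketch, requesting that URL directly could look like the following. The function name `fetch_raw_data` and the timeout value are just illustrative choices. The format of the response is not documented here, so the sketch only returns the raw text; inspect it first before deciding how to split it up.

```python
import requests

# URL taken from the link above; the "i" value is the same id as in the page URL.
DATA_URL = "http://www.tsetmc.com/tsev2/data/instinfofast.aspx?i=778253364357513&c=57+"


def fetch_raw_data(url=DATA_URL, timeout=10):
    """Request the data URL and return the response body as text."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of silently returning an error page.
    response.raise_for_status()
    return response.text
```

Calling `print(fetch_raw_data())` should show what the page itself receives, with no browser involved. If the site rejects plain scripted requests, you may need to add headers, but that is something to check rather than assume.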

Upvotes: 1
