Reputation: 18705
I'm working on a Python program that scrapes (public) data from web pages. The problem arises when I want to get the source code of a page that is only accessible via a button and is built on ASP.NET. I can't just parse an href from the page as usual.
So my question is: is there a simple way to get the source code of such an ASP.NET page?
To explain it clearly, here is one ASP.NET-based web page as an example. In this case I want to get the source code of the page that is displayed when I click on "Radiátor topení (1)" in the middle of the page. You can see the parent page, which contains the button whose click I want to simulate, here.
I checked the source code of this (parent) page and looked for a URL near the "Radiátor topení (1)" text, but I only found this:
<td class="CatalogCell"><a onclick=" return PathClick('3761801;176564;356239;922141;922488;922507;922508')"><H2 class="CatalogH">Radiátor topení (1)</H2></a></td>
and I'm afraid this won't help me.
I'm looking for the simplest way, because I'm not an expert in ASP.NET or JavaScript. Thanks for any advice!
Upvotes: 2
Views: 670
Reputation: 82
The following Python program fetches and prints the HTML source of a given link.
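# Python 2 (urllib2); on Python 3 the same calls live in urllib.request.
import urllib2
from bs4 import BeautifulSoup

link = "http://www.example.com"
# Some servers reject requests without a browser-like User-Agent.
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib2.Request(link, headers=hdr)
# Open the Request object (not the bare URL) so the header is actually sent.
page = urllib2.urlopen(req)
soup = BeautifulSoup(page, 'html.parser')
print soup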
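Fetching a plain URL like this does not cover the link in the question, because it has no href and only calls PathClick(...) in its onclick handler. On ASP.NET Web Forms pages such a handler typically fills in a form field and submits the page's form back to the server, so one possible approach is to rebuild that POST yourself. The following is only a rough sketch for Python 3 using the standard library. The class PageScanner, the functions build_postback_fields and post_form, and above all the field name "PathField" are hypothetical: what PathClick really sends is not visible in the question, so check the request in your browser's developer tools (Network tab) and adjust the field name and URL accordingly.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


class PageScanner(HTMLParser):
    """Collects hidden form inputs and PathClick('...') arguments."""

    PATH_CLICK = re.compile(r"PathClick\('([^']*)'\)")

    def __init__(self):
        super().__init__()
        self.hidden = {}       # name -> value of each <input type="hidden">
        self.paths = {}        # link text -> argument passed to PathClick
        self._pending = None   # PathClick argument of the <a> being read
        self._text = []        # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            # Web Forms keeps its state (__VIEWSTATE, __EVENTVALIDATION, ...)
            # in hidden inputs that must be sent back with the POST.
            if attrs.get("name"):
                self.hidden[attrs["name"]] = attrs.get("value") or ""
        elif tag == "a":
            match = self.PATH_CLICK.search(attrs.get("onclick") or "")
            if match:
                self._pending = match.group(1)
                self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._pending is not None:
            self.paths["".join(self._text).strip()] = self._pending
            self._pending = None


def build_postback_fields(html, label, path_field="PathField"):
    """Return the form fields to POST for the link with the given text.

    path_field is an ASSUMED name for the field that PathClick fills in.
    """
    scanner = PageScanner()
    scanner.feed(html)
    fields = dict(scanner.hidden)
    fields[path_field] = scanner.paths[label]
    return fields


def post_form(url, fields):
    """POST the fields as a form submission and return the response text."""
    data = urlencode(fields).encode("utf-8")
    req = Request(url, data=data, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req) as response:
        return response.read().decode("utf-8", errors="replace")
```

Usage would be roughly: download the parent page, call build_postback_fields(parent_html, "Radiátor topení (1)"), then pass the result to post_form together with the form's action URL. If the site builds the page with more JavaScript than a simple form submit, this will not be enough, and a browser automation tool such as Selenium may be the simpler route.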
Upvotes: 1