Reputation: 3461
I submit a query on a web page. The query takes several seconds to complete, and only then does the page display an HTML table that I would like to get the information from. Let's say the query takes a maximum of 4 seconds to load. I would prefer to get the data as soon as it is loaded, but it would be acceptable to wait 4 seconds and then get the data from the table.
The issue is that when I make my urlopen request, the page hasn't finished loading yet. I tried loading the page, then issuing a sleep command, then loading it again, but that does not work either.
My code is
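import urllib.request
import time

urlname = 'http://bookscouter.com/prices.php?isbn=9781111835811'
uf = urllib.request.urlopen(urlname)
time.sleep(3)  # wait, hoping the query has finished by then
text = uf.read().decode('UTF-8')  # read the bytes first, then decode them
print(text)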
The webpage I am looking at is http://bookscouter.com/prices.php?isbn=9781111835811 (feel free to ignore the interesting textbook haha)
I am using Python 3.x on a Raspberry Pi.
Upvotes: 1
Views: 507
Reputation: 184250
The prices you want are not in the page you're retrieving, so no amount of waiting will make them appear. Instead, they are fetched by JavaScript in that page after it has loaded. The urllib module is not a browser, so it won't run that script for you. You'll want to figure out the URL of the AJAX request (a quick look at the page source gives a pretty big hint) and retrieve that instead. The response is probably JSON, so you can just use Python's json module to parse it.
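As a rough sketch of that approach: once you have found the AJAX URL, fetching and parsing it could look something like the code below. The helper names (parse_json_bytes, fetch_json) and the sample payload with its "prices", "vendor", and "price" fields are made up for illustration; the real URL and field names have to be read from the page source and the actual response.

```python
import json
import urllib.request


def parse_json_bytes(raw, encoding="utf-8"):
    """Decode raw response bytes and parse them as JSON."""
    return json.loads(raw.decode(encoding))


def fetch_json(url, timeout=10):
    """Fetch a URL and return its body parsed as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_json_bytes(response.read())


# Hypothetical payload standing in for the AJAX response; the real
# structure and field names are assumptions until you inspect it.
sample = b'{"prices": [{"vendor": "ExampleBuyer", "price": 12.5}]}'
data = parse_json_bytes(sample)
for offer in data["prices"]:
    print(offer["vendor"], offer["price"])
```

With the real endpoint in hand, you would call fetch_json(ajax_url) in place of the sample bytes. No sleep is needed, because the request returns only once the server has sent the data.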
Upvotes: 4