Reputation: 463
I'm trying to get the current "5 minute trend price" from my electricity provider's website using Python 2.7 and BeautifulSoup4.
The XPath is: xpath = "//html/body/div[2]/div/div/div[3]/p[1]"
or
<div class="instant prices">
<p class="price">
"5.2" # this is what I'm ultimately after
<small>¢</small>
<strong> per kWh </strong>
</p>
I've tried a myriad of different ways of getting the "5.2" value and have successfully been able to drill down to the "instant prices" object, but can't get anything from it.
My current code looks like this:
import urllib2
from bs4 import BeautifulSoup
url = "https://rrtp.comed.com/live-prices/"
soup = BeautifulSoup(urllib2.urlopen(url).read())
#print soup
instantPrices = soup.findAll('div', 'instant prices')
print instantPrices
...and the output is:
[<div class="instant prices">
</div>]
[]
No matter what, it appears that the "instant prices" object is empty even though I can clearly see it when inspecting the element in Chrome. Any help would be hugely appreciated! Thank you!
Upvotes: 2
Views: 398
Reputation: 40993
Unfortunately, this data is generated via JavaScript when the browser renders the page. That's why it isn't there when you download the source with urllib2: the server sends an empty "instant prices" div and a script fills it in afterwards. What you can do is query the backend directly:
>>> import urllib2
>>> import re
>>> url = "https://rrtp.comed.com/rrtp/ServletFeed?type=instant"
>>> s = urllib2.urlopen(url).read()
>>> s
"<p class='price'>4.5<small>¢</small><strong> per kWh </strong></p><p>5-minute Trend Price 7:40 PM CT</p>\r\n"
>>> float(re.findall(r"\d+\.\d+", s)[0])
4.5
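If you want something a bit more defensive than grabbing the first number in the response, here is a minimal sketch that anchors the match to the price paragraph and returns None when the fragment doesn't look as expected. It runs offline against the sample response shown above. The helper name extract_price and the pattern are my own illustration, and I'm assuming the feed keeps the same <p class='price'> markup and may occasionally return a whole number or a negative price.

```python
# -*- coding: utf-8 -*-
import re

# Sample fragment copied from the response shown above; the live feed
# may return a different price or slightly different markup.
SAMPLE = ("<p class='price'>4.5<small>¢</small><strong> per kWh </strong></p>"
          "<p>5-minute Trend Price 7:40 PM CT</p>\r\n")

# Assumption: the price is the first number right after the opening
# <p class='price'> tag. Allows an optional minus sign and optional decimals.
PRICE_RE = re.compile(r"<p class='price'>\s*(-?\d+(?:\.\d+)?)")


def extract_price(fragment):
    """Hypothetical helper: return cents per kWh as a float, or None."""
    match = PRICE_RE.search(fragment)
    return float(match.group(1)) if match else None


print(extract_price(SAMPLE))
```

To use it with the live feed, pass it the string s you read with urllib2.urlopen(url).read() and check for None before using the value.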
Upvotes: 2