Reputation: 1510
I'm trying to scrape the weather data from this website:
http://www.fastweather.com/yesterday.php?city=St.+Louis_MO
The problem I have run into is Yesterday's Precipitation. When viewed in the developer tools, I see the following:
<strong>Yesterday's Precipitation</strong>
was 0.13 inches
But when I fetch the page from Python, using either Requests or the urllib module, I see this:
<strong>Yesterday\'s Precipitation</strong>
was T inches
I use NoScript in my browser, and I disallowed all JavaScript from running, but the 0.13 still appears. Where is this number coming from, and how do I obtain it with Python?
I'm on a Unix system, and this script will run daily. I would like to avoid Selenium, if possible.
Even if there are other websites to use, I would like to know why that mysterious T exists.
Here's my relevant code:
import requests

webpage = requests.get("http://www.fastweather.com/yesterday.php?city=St.+Louis_MO")
if webpage.status_code == 200:
    content = str(webpage.content)
I have also tried this:
with requests.Session() as session:
    webpage = session.get("http://www.fastweather.com/yesterday.php?city=St.+Louis_MO")
    content = webpage.text
And this:
import urllib.request
webpage = urllib.request.urlopen("http://www.fastweather.com/yesterday.php?city=St.+Louis_MO")
content = webpage.read()
(There may be minor mistakes in the above code since I can't remember exactly how each method works.)
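A side note on the backslash in `Yesterday\'s`: it is most likely an artifact of `str(webpage.content)` rather than something the server sent. Calling `str()` on a `bytes` object returns its repr, which escapes the apostrophe when the data also contains double quotes. The sketch below uses a made-up byte string in place of the real response body, and assumes UTF-8 for decoding; it does not explain the `T` value.

```python
# Hypothetical sample standing in for the downloaded page body (bytes).
raw = b'<div id="content"><strong>Yesterday\'s Precipitation</strong> was T inches</div>'

# str() on bytes gives the repr: it starts with b' and escapes the apostrophe.
as_repr = str(raw)

# Decoding gives the actual text (UTF-8 is an assumption here).
as_text = raw.decode("utf-8")

print(as_repr)
print(as_text)
```

With Requests, `webpage.text` already returns decoded text, so the escaping does not appear there.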
Upvotes: 0
Views: 94
Reputation: 52665
You can try the code below to get the required output:
import requests
from lxml import html
response = requests.get('http://www.fastweather.com/yesterday.php?city=St.+Louis_MO')
source = html.fromstring(response.text)
text_node = source.xpath('//div[@id="content"]//strong[.="Yesterday\'s Precipitation"]/following-sibling::text()[1]')[0]
print(text_node.strip()) # 'was 0.13 inches'
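If installing lxml is not an option, roughly the same idea (take the first text that follows the matching `<strong>` label) can be sketched with the standard library's `html.parser`. This is a swapped-in technique, not the XPath approach above, and it is only a minimal sketch: the class `TextAfterLabel`, the helper `text_after_label`, and the inline `sample` markup are all hypothetical and assume the label and value sit next to each other as in the question.

```python
from html.parser import HTMLParser


class TextAfterLabel(HTMLParser):
    """Collect the first non-empty text that follows <strong>label</strong>."""

    def __init__(self, label):
        super().__init__()
        self.label = label
        self.in_strong = False   # currently inside a <strong> element
        self.label_seen = False  # the wanted label has been encountered
        self.result = None       # text found after the label, if any

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self.in_strong = True

    def handle_endtag(self, tag):
        if tag == "strong":
            self.in_strong = False

    def handle_data(self, data):
        if self.in_strong:
            if data.strip() == self.label:
                self.label_seen = True
        elif self.label_seen and self.result is None and data.strip():
            self.result = data.strip()


def text_after_label(html_text, label):
    """Return the text following the labelled <strong>, or None if absent."""
    parser = TextAfterLabel(label)
    parser.feed(html_text)
    parser.close()
    return parser.result


# Made-up markup for illustration only; the real page layout may differ.
sample = (
    '<div id="content">'
    "<strong>Some Other Field</strong> was n/a<br>"
    "<strong>Yesterday's Precipitation</strong> was 0.13 inches<br>"
    "</div>"
)

print(text_after_label(sample, "Yesterday's Precipitation"))
```

To use it on the live page, pass `response.text` from the Requests call instead of `sample`.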
Upvotes: 2