Reputation: 45
I'm practicing web scraping by pulling basic weather data, such as the daily high/low temperature, from https://www.wunderground.com/ (searched with a random zip code).
I've tried several variations of my code, but it keeps returning an empty list where the temperature should be. I honestly don't know enough to pinpoint where I'm going wrong. Can anyone point me in the right direction?
import requests
from bs4 import BeautifulSoup
response = requests.get('https://www.wunderground.com/cgi-bin/findweather/getForecast?query=76502')
response_data = BeautifulSoup(response.content, 'html.parser')
results = response_data.select("strong.high")
I've also tried doing the following along with various other variations:
results = response_data.find_all('strong', class_ = 'high')
results = response_data.select('div.small_6 columns > strong.high' )
Upvotes: 1
Views: 1997
Reputation: 6508
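The data you want to parse is generated dynamically by JavaScript, and requests can't handle that: it only downloads the initial HTML and never runs the page's scripts, so the temperature elements aren't in the response. You should use selenium together with PhantomJS or any other driver. Below is an example using selenium and Chromedriver: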
from selenium import webdriver
from bs4 import BeautifulSoup
url = 'https://www.wunderground.com/cgi-bin/findweather/getForecast?query=76502'
driver = webdriver.Chrome()
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
After inspecting the elements, the lowest, highest and current temperatures can be found using:
high = soup.find('strong', {'class':'high'}).text
low = soup.find('strong', {'class':'low'}).text
now = soup.find('span', {'data-variable':'temperature'}).find('span').text
>>> low, high, now
('25', '37', '36.5')
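One caveat with the snippet above: `page_source` can be read before the JavaScript has finished filling in the temperatures, and `soup.find(...)` returns `None` when an element is missing, so `.text` would raise an `AttributeError`. Below is a rough sketch of one way to guard against both, not a definitive implementation. The helper names `extract_temperatures` and `fetch_rendered_html` and the 10-second timeout are my own assumptions, and the selectors are simply the ones used above, so they may need adjusting if the page markup changes.

```python
from bs4 import BeautifulSoup


def extract_temperatures(html):
    """Return (low, high, now) as strings; a value is None if its element is missing."""
    soup = BeautifulSoup(html, 'html.parser')

    def text_or_none(tag):
        # find() returns None when nothing matches, so guard before reading text
        return tag.get_text(strip=True) if tag is not None else None

    high = text_or_none(soup.find('strong', {'class': 'high'}))
    low = text_or_none(soup.find('strong', {'class': 'low'}))
    outer = soup.find('span', {'data-variable': 'temperature'})
    now = text_or_none(outer.find('span')) if outer is not None else None
    return low, high, now


def fetch_rendered_html(url, timeout=10):
    """Hypothetical helper: load the page in Chrome and wait for the high temperature."""
    # Imported here so extract_temperatures() can be used without selenium installed
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until the JavaScript has inserted the element (raises TimeoutException otherwise)
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, 'strong.high'))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out
        driver.quit()
```

Usage would then be `low, high, now = extract_temperatures(fetch_rendered_html(url))`. Passing the HTML returned by requests to `extract_temperatures` should give `None` for the values that the page only fills in via JavaScript, which matches the empty list seen in the question.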
Upvotes: 5