rezale

Reputation: 45

Webscraping with BeautifulSoup, getting empty list

I'm practicing web scraping by getting basic weather data, such as the daily high/low temperature, from https://www.wunderground.com/ (using a randomly chosen zip code).

I've tried several variations of my code, but it keeps returning an empty list where the temperature should be. I honestly don't know enough to pinpoint where I'm going wrong. Can anyone point me in the right direction?

import requests
from bs4 import BeautifulSoup

response = requests.get('https://www.wunderground.com/cgi-bin/findweather/getForecast?query=76502')
response_data = BeautifulSoup(response.content, 'html.parser')
results = response_data.select("strong.high")

I've also tried the following, along with other variations:

results = response_data.find_all('strong', class_ = 'high')
results = response_data.select('div.small_6 columns > strong.high' )

Upvotes: 1

Views: 1997

Answers (1)

Vinícius Figueiredo

Reputation: 6508

The data you want to parse is generated dynamically by JavaScript. requests does not execute JavaScript, so those elements are missing from the HTML it returns. You should use selenium together with PhantomJS or any other driver. Below is an example using selenium and Chromedriver:

from selenium import webdriver
from bs4 import BeautifulSoup

url='https://www.wunderground.com/cgi-bin/findweather/getForecast?query=76502'
driver = webdriver.Chrome()
driver.get(url)
html = driver.page_source

soup = BeautifulSoup(html, 'html.parser')
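
To make the difference between the raw response and the rendered page concrete, here is a small offline sketch. The two HTML strings are hypothetical stand-ins that I made up for illustration, not the actual wunderground.com markup, and select_high is just a helper name chosen here. The first string imitates what requests would receive (an empty placeholder plus a script), the second imitates what the browser's DOM might look like once the script has run:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-ins for illustration only, NOT the real site markup.
# Roughly what requests receives: an empty placeholder and a script tag.
STATIC_HTML = """
<div id="forecast"></div>
<script>/* JavaScript inserts the temperatures here at run time */</script>
"""

# Roughly what the DOM could contain after the script has run in a browser.
RENDERED_HTML = """
<div id="forecast">
  <strong class="high">37</strong>
  <strong class="low">25</strong>
</div>
"""


def select_high(html):
    """Return the text of every <strong class="high"> element in html."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.select("strong.high")]


static_result = select_high(STATIC_HTML)      # placeholder only, nothing to match
rendered_result = select_high(RENDERED_HTML)  # element exists after rendering

print(static_result)    # []
print(rendered_result)  # ['37']
```

The selector is identical in both calls. Only the HTML handed to BeautifulSoup differs, which is why swapping response.content for driver.page_source is enough.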

Inspecting the elements, the lowest, the highest and the current temperature can be found using:

high = soup.find('strong', {'class':'high'}).text
low = soup.find('strong', {'class':'low'}).text
now = soup.find('span', {'data-variable':'temperature'}).find('span').text

>>> low, high, now
('25', '37', '36.5')
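
One caveat with the lookups above: soup.find returns None when nothing matches, so calling .text on the result raises AttributeError if the page has not finished rendering or the markup changes. Below is a minimal defensive sketch. The names extract_temperatures, text_or_none and SAMPLE_HTML are my own, and the sample markup is a hypothetical fragment modeled on the selectors used above, not copied from the site:

```python
from bs4 import BeautifulSoup


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def extract_temperatures(html):
    """Look up high, low and current temperature; missing values become None."""
    soup = BeautifulSoup(html, "html.parser")
    high = soup.find("strong", {"class": "high"})
    low = soup.find("strong", {"class": "low"})
    # The current temperature sits in a span nested inside the outer span.
    outer = soup.find("span", {"data-variable": "temperature"})
    now = outer.find("span") if outer is not None else None
    return {
        "high": text_or_none(high),
        "low": text_or_none(low),
        "now": text_or_none(now),
    }


# Hypothetical fragment for illustration, shaped to match the selectors above.
SAMPLE_HTML = """
<strong class="high">37</strong>
<strong class="low">25</strong>
<span data-variable="temperature"><span>36.5</span></span>
"""

temps = extract_temperatures(SAMPLE_HTML)
empty = extract_temperatures("<html></html>")

print(temps)  # {'high': '37', 'low': '25', 'now': '36.5'}
print(empty)  # {'high': None, 'low': None, 'now': None}
```

With the selenium example, you would call extract_temperatures(driver.page_source) and check for None values before using the result.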

Upvotes: 5
