Reputation: 35
Python3 - Beautiful Soup 4
I'm trying to parse the weather graph out of the website: https://www.wunderground.com/forecast/us/ny/new-york-city
But when I grab the weather graph html but beautiful soup seems to grab all around it.
I am new to Beautiful Soup. I think it is not able to grab this because either it is not able to parse the tag thing they have going on or because the javascript that populates the graph hasn't loaded or is not parsable by BS (at least the way I'm using it).
As far as my code goes, it's extremely basic
import requests, bs4
url = 'https://www.wunderground.com/forecast/us/ny/new-york-city'
requrl = requests.get(url, headers={'user-agent': 'Mozilla/5.0'})
requrl.raise_for_status()
bs = bs4.BeautifulSoup(requrl.text, features="html.parser")
a = str(bs)
x = 'weather-graph'
print(a[a.find('x'):])
#Also tried a.find('weather-graph') which returns -1
I have verified that each piece of the code works in other scenarios. The last line should find that string and print out everything after that.
I tried making x many different pieces of the html in and around the graph but got nothing of substance.
Upvotes: 3
Views: 115
Reputation: 84465
There is an API you can use. Same as the page does. Don't know if key expires. You may need to do some ordering on output but you can do that by datetime field
import requests
r = requests.get('https://api.weather.com/v1/geocode/40.765/-73.981/forecast/hourly/240hour.json?apiKey=6532d6454b8aa370768e63d6ba5a832e&units=e').json()
for i in r['forecasts']:
print(i)
If unsure I will happily update to show you how to build dataframe and order.
Upvotes: 1