JesseC

Reputation: 35

Beautiful Soup can't find this html

Python3 - Beautiful Soup 4

I'm trying to parse the weather graph out of the website: https://www.wunderground.com/forecast/us/ny/new-york-city

But when I try to grab the weather graph HTML, Beautiful Soup seems to grab everything around it and not the graph itself.

I am new to Beautiful Soup. I think it cannot grab the graph either because it cannot parse the unusual tags the page uses, or because the JavaScript that populates the graph hasn't loaded or isn't parsable by BS (at least the way I'm using it).
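One way to tell the two possibilities apart is to check whether a parser copes with a hyphenated custom tag at all. The sketch below uses Python's built-in `html.parser` (the same backend selected by `features="html.parser"`) on two hypothetical snippets; the markup and the `TagCollector` and `collect_tags` names are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name of every start tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def collect_tags(html):
    """Parse an HTML string and return the start-tag names in order."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical static markup: the custom element is already in the source.
static_html = "<div><weather-graph></weather-graph></div>"
# Hypothetical markup where a script would add the element later, in a browser.
script_html = "<div id='app'></div><script>/* builds the graph at runtime */</script>"

print(collect_tags(static_html))
print(collect_tags(script_html))
```

The custom tag is reported like any other tag when it is present in the text, so an element that never shows up is more likely missing from the downloaded HTML than unparsable. `requests` does not execute JavaScript, so anything a script adds in the browser is not in `requrl.text`.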

As far as my code goes, it's extremely basic:

import requests, bs4

url = 'https://www.wunderground.com/forecast/us/ny/new-york-city'
requrl = requests.get(url, headers={'user-agent': 'Mozilla/5.0'})
requrl.raise_for_status()
bs = bs4.BeautifulSoup(requrl.text, features="html.parser")
a = str(bs)
x = 'weather-graph'
# Search for the value of x, not the literal string 'x'
print(a[a.find(x):])
# Also tried a.find('weather-graph') which returns -1

I have verified that each piece of the code works in other scenarios. The last line should find that string and print out everything after that.
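A note on the find-and-slice idea: `str.find` returns -1 when the text is absent, and slicing from -1 yields only the last character instead of raising an error, so a miss can look like nearly empty output. A minimal sketch of a safer helper follows; `text_from` and the sample string are hypothetical.

```python
def text_from(haystack, needle):
    """Return haystack from the first occurrence of needle, or None if absent."""
    index = haystack.find(needle)
    if index == -1:
        # Slicing with -1 would silently return the last character instead.
        return None
    return haystack[index:]


# Hypothetical sample text standing in for the downloaded page source.
sample = "<div><weather-graph>chart</weather-graph></div>"

print(text_from(sample, "weather-graph"))
print(text_from(sample, "missing-marker"))
```

Returning `None` makes the "not found" case explicit, so it cannot be mistaken for a short match.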

I tried setting x to many different pieces of the HTML in and around the graph but got nothing of substance.

Upvotes: 3

Views: 115

Answers (1)

QHarr

Reputation: 84465

There is an API you can use, the same one the page uses. I don't know whether the key expires. You may need to sort the output, which you can do on the datetime field.

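import requests

# Request the hourly forecast JSON from the same API the page calls
r = requests.get('https://api.weather.com/v1/geocode/40.765/-73.981/forecast/hourly/240hour.json?apiKey=6532d6454b8aa370768e63d6ba5a832e&units=e').json()
# Print each entry of the 'forecasts' list
for i in r['forecasts']:
    print(i)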

If you are unsure, I will happily update this to show you how to build a dataframe and order it.
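As a rough sketch of the ordering step, here is one way to sort forecast entries by a datetime field using a plain list of dicts (not a dataframe). The field names `fcst_valid` (assumed to be an epoch timestamp in seconds) and `temp`, the sample values, and the `order_forecasts` helper are all assumptions for illustration; check the actual JSON keys before relying on them.

```python
from datetime import datetime, timezone


def order_forecasts(forecasts, key="fcst_valid"):
    """Return the forecast dicts sorted ascending by the given timestamp key."""
    return sorted(forecasts, key=lambda entry: entry[key])


# Hypothetical entries standing in for r['forecasts']; keys are assumed.
sample = [
    {"fcst_valid": 1700007200, "temp": 51},
    {"fcst_valid": 1700000000, "temp": 55},
    {"fcst_valid": 1700003600, "temp": 53},
]

for entry in order_forecasts(sample):
    # Convert the assumed epoch seconds to a readable UTC timestamp.
    stamp = datetime.fromtimestamp(entry["fcst_valid"], tz=timezone.utc)
    print(stamp.isoformat(), entry["temp"])
```

With the real response, the call would be `order_forecasts(r['forecasts'])`, provided the chosen key exists in every entry.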

Upvotes: 1
