Reputation: 35
I am trying to scrape a surf report website using BeautifulSoup, but the returned HTML does not match the HTML shown in a browser, so I can't scrape the data I am looking for. I am trying to select the "quiver-surf-height" class, which contains the local surf height estimate, on the following page: https://www.surfline.com/surf-report/paradise-beach/584204214e65fad6a7709cc1
import requests
from bs4 import BeautifulSoup
url = "https://www.surfline.com/surf-report/paradise-beach/584204214e65fad6a7709cc1"
res = requests.get(url)
soup = BeautifulSoup(res.text,"lxml")
print(soup.select(".quiver-surf-height"))
The print statement returns an empty list. Reading through the returned html I found a statement "Please turn JavaScript on and reload the page." I'm following the steps laid out in a class, so I'm not sure how to handle this response. Any input is appreciated!
Upvotes: 0
Views: 277
Reputation: 20052
As mentioned in the comments, the data you're after is generated dynamically. However, there's an API you can query to get what you want.
All you need is the surf spot ID and the number of days of data you want. By default it comes for the last 16 days in 1-hour intervals, but you can change these params too.
For example, this gets the last two days of surf height data at hourly intervals:
import datetime
import requests
surf_spot_id = "584204214e65fad6a7709cc1"
days = "2"
api_url = f"https://services.surfline.com/kbyg/spots/forecasts/wave?spotId={surf_spot_id}&days={days}&intervalHours=1"
# Send a browser-like User-Agent along with the request
headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
}
data = requests.get(api_url, headers=headers).json()
# Each item in data["data"]["wave"] is one interval (one hour here)
for day in data["data"]["wave"]:
    # Convert the Unix timestamp to a readable local date and time
    _time = (
        datetime
        .datetime
        .fromtimestamp(day['timestamp'])
        .strftime('%Y-%m-%d %H:%M:%S')
    )
    print(f"{_time}")
    surf = day["surf"]
    print(f"Surf: {surf['min']} - {surf['max']}")
    print(f"{surf['humanRelation']}")
Output:
2022-09-25 06:00:00
Surf: 0.9 - 1.4
Waist to shoulder
2022-09-25 07:00:00
Surf: 0.9 - 1.4
Waist to shoulder
2022-09-25 08:00:00
Surf: 0.9 - 1.4
Waist to shoulder
2022-09-25 09:00:00
Surf: 0.9 - 1.2
Waist to chest
2022-09-25 10:00:00
Surf: 0.9 - 1.2
Waist to chest
2022-09-25 11:00:00
Surf: 0.9 - 1.2
Waist to chest
2022-09-25 12:00:00
Surf: 0.9 - 1.2
Waist to chest
2022-09-25 13:00:00
Surf: 0.9 - 1.2
Waist to chest
2022-09-25 14:00:00
Surf: 0.6 - 1.1
Thigh to stomach
2022-09-25 15:00:00
Surf: 0.6 - 1.1
Thigh to stomach
2022-09-25 16:00:00
and more ...
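If you want to reuse this for other spots, the same steps can be split into two small helpers. This is only a sketch: `build_params` and `format_wave_entries` are names I made up, the `sample` payload is hand-written to mirror the response shape used in the code above (it is not real API output), and the timestamps are formatted in UTC here rather than local time so the result is the same on any machine.

```python
import datetime


def build_params(spot_id, days=2, interval_hours=1):
    """Query parameters for the wave endpoint used above (hypothetical helper)."""
    return {"spotId": spot_id, "days": days, "intervalHours": interval_hours}


def format_wave_entries(payload):
    """Turn a response shaped like data["data"]["wave"] into printable lines."""
    lines = []
    for entry in payload["data"]["wave"]:
        # Format the Unix timestamp in UTC (the code above uses local time).
        stamp = datetime.datetime.fromtimestamp(
            entry["timestamp"], tz=datetime.timezone.utc
        ).strftime("%Y-%m-%d %H:%M")
        surf = entry["surf"]
        lines.append(
            f"{stamp} | {surf['min']} - {surf['max']} | {surf['humanRelation']}"
        )
    return lines


# Hand-written sample with the assumed structure; not real API output.
sample = {
    "data": {
        "wave": [
            {
                "timestamp": 1664085600,
                "surf": {"min": 0.9, "max": 1.4, "humanRelation": "Waist to shoulder"},
            }
        ]
    }
}

for line in format_wave_entries(sample):
    print(line)
```

With a live request you would pass the dictionary through the `params` argument, for example `requests.get("https://services.surfline.com/kbyg/spots/forecasts/wave", params=build_params(surf_spot_id), headers=headers)`, and feed the `.json()` result to `format_wave_entries`. Letting `requests` build the query string avoids assembling the URL by hand.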
Upvotes: 1