Reputation: 3
Hey, so I need to scrape this website (without using Beautiful Soup) to get the current temperature, and I'm having trouble. This is what I have so far, but I keep getting either a number that isn't the temperature or -1. Any help is greatly appreciated.
def assign4(city_name):
    import urllib.request
    if city_name == "St. Catharines":
        connection = urllib.request.urlopen("https://weather.gc.ca/city/pages/on-107_metric_e.html")
        condition = str(connection.read(), "utf-8")
        connection.close()
        weather_condition = condition.find("Temperature:</dt>")
        if weather_condition != -1:
            weather_condition_end = condition.find("</dd>",weather_condition)
            if weather_condition_end != -1:
                weather_start = condition.find("metric-hide",0,weather_condition_end)
                if weather_start != -1:
                    print(f"Weather Conditions in St. Catharines is {weather_start}")
                else:
                    print("'weather_start' not working")
            else:
                print("'weather_condition_end' not working")
        else:
            print("'weather_condition' not working")

assign4("St. Catharines")
Upvotes: 0
Views: 132
Reputation: 330
You can simplify your code with lxml and requests:
import requests
from lxml import html

def assign4(city_name):
    if city_name == "St.Catharines":
        # Get the html page
        resp = requests.get("https://weather.gc.ca/city/pages/on-107_metric_e.html")
        # Build html tree
        html_tree = html.fromstring(resp.text)
        # Get temperature
        temperature = html_tree.xpath("//dd[@class='mrgn-bttm-0 wxo-metric-hide'][(parent::dl[@class='dl-horizontal wxo-conds-col2'])]//text()")[0].replace("Â", "")
        # Print temperature
        print(f"Temperature in {city_name} is {temperature}C")

assign4("St.Catharines")
Outputs:
>>> Temperature in St.Catharines is 4.8°C
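If third-party parsers are off the table, roughly the same value can be pulled out with str.find and slicing. Keep in mind that find returns a position in the string, not the text itself, so printing the result of find shows a number rather than the temperature. Below is a minimal sketch that runs against a made-up HTML fragment instead of the live page. SAMPLE_HTML and extract_temperature are illustrative names, and the real page's markup may differ from what is assumed here.

```python
# Hypothetical fragment, modelled on the class names used in the XPath above.
# The live page may be laid out differently.
SAMPLE_HTML = (
    '<dl class="dl-horizontal wxo-conds-col2">'
    '<dt>Temperature:</dt>'
    '<dd class="mrgn-bttm-0 wxo-metric-hide">4.8°<abbr title="Celsius">C</abbr></dd>'
    '</dl>'
)


def extract_temperature(page):
    """Return the text right after the temperature <dd ...> tag, or None."""
    label = page.find("Temperature:</dt>")
    if label == -1:
        return None
    # Search forward from the label, not from the start of the page.
    dd_open = page.find("<dd", label)
    if dd_open == -1:
        return None
    tag_end = page.find(">", dd_open)
    if tag_end == -1:
        return None
    value_end = page.find("<", tag_end + 1)
    if value_end == -1:
        return None
    # find() only gives positions; slicing turns them into the actual text.
    return page[tag_end + 1:value_end].strip()


print(extract_temperature(SAMPLE_HTML))
```

Each find call starts where the previous match was found, so the slice is taken from the temperature entry rather than from an earlier "metric-hide" element on the page.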
Upvotes: 0
Reputation: 2237
There should be a space between "St." and "Catharines" in the last line. That is where it goes wrong.
if city_name == "St. Catharines":
assign4("St.Catharines")
When you call the function, you are not adding the space, so the two strings never compare equal.
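One way to make the lookup less sensitive to spacing is to normalize the name before comparing. This is only a sketch: CITY_PAGES, normalize, and page_for are hypothetical names, and only the St. Catharines URL comes from this thread.

```python
# Hypothetical lookup table; only this one URL appears in the thread.
CITY_PAGES = {
    "St. Catharines": "https://weather.gc.ca/city/pages/on-107_metric_e.html",
}


def normalize(name):
    """Lower-case the name and drop all whitespace so spacing does not matter."""
    return "".join(name.split()).lower()


# Re-key the table by normalized name once, up front.
NORMALIZED_PAGES = {normalize(name): url for name, url in CITY_PAGES.items()}


def page_for(city_name):
    """Return the page URL for a city, or None if the city is unknown."""
    return NORMALIZED_PAGES.get(normalize(city_name))


# A plain comparison is exact, so the missing space makes it fail.
print("St.Catharines" == "St. Catharines")
# After normalizing, both spellings resolve to the same page.
print(page_for("St.Catharines") == page_for("St. Catharines"))
```

The first print shows False and the second shows True, which is the difference between the exact comparison in the original code and a normalized lookup.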
Upvotes: 1