user604803

Reputation: 35

Python XPath returns empty list?

I am trying to scrape information from this webpage.
Here is a screenshot in case the webpage does not load.


I am trying to print the text in the first <span> element.

I copied the XPath as provided by the Inspect Element view in Google Chrome (see screenshot above), and
//*[@id="main"]/div[1]/div/div/div[2]/p[1]/span[1]/text() was copied to my clipboard.


This is the code I tried:

from lxml import html
import requests

# get alert info
page = requests.get('https://www.msn.com/en-us/weather/weatheralerts/Beverly%20Hills,California,Unite%20d%20States/we-city?weadegreetype=F&day=1&ocid=ansmsnweather')
tree = html.fromstring(page.content)

# XPath copied from Chrome's Inspect Element view
alertInfo = tree.xpath('//*[@id="main"]/div[1]/div/div/div[2]/p[1]/span[1]/text()')

print(alertInfo)



However, all I got as an output was []. I am sure that the URL string is correct. Why did this occur?

I also tried alertInfo = tree.xpath( '//span/text()') to see if I could just pick out the element in the list, but even that returned an empty list.

Thanks.

Upvotes: 0

Views: 520

Answers (1)

akiva

Reputation: 2737

  1. The problem is not with your XPath but with the way MSN responds to requests from scripts. You can try to make the request look as if it comes from a real browser.
  2. If all you are looking for is weather reports, I strongly advise against parsing HTML pages, which breaks easily when the page structure changes. Some services offer proper APIs, for example AccuWeather or Yahoo! Weather.
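
A minimal sketch of point 1, assuming the server varies its response based on request headers such as User-Agent. That is an assumption and has not been confirmed for msn.com. The header values and the helper names `fetch_tree` and `span_texts` are illustrative only and are not part of the original answer.

```python
from lxml import html
import requests

# Illustrative browser-like headers; the exact values are an assumption
# and may need adjusting for the site in question.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/120.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_tree(url):
    """Request url with browser-like headers and parse the HTML response."""
    page = requests.get(url, headers=BROWSER_HEADERS, timeout=10)
    page.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return html.fromstring(page.content)


def span_texts(tree):
    """Return the stripped, non-empty text of every <span> in the tree."""
    return [t.strip() for t in tree.xpath("//span/text()") if t.strip()]


if __name__ == "__main__":
    # Offline check against a hypothetical snippet, so the XPath can be
    # verified separately from whatever the server actually returns.
    sample = html.fromstring("<div id='main'><p><span> Flood Warning </span></p></div>")
    print(span_texts(sample))
```

Calling `span_texts(fetch_tree(url))` with the URL from the question would then show whether the headers make a difference. If the list is still empty, the content may be filled in by JavaScript after the page loads, in which case `requests` alone will not see it.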

Upvotes: 1
