Anders Zhou

Reputation: 99

Scraping text: I'm not sure Google Chrome's Inspect Element is giving me the correct XPath. Where can I get the correct path?

I want to scrape the website fundsnetservices.com. Specifically, I want to grab the text below each program, which is about a paragraph long.

Using the Google Chrome Inspect method, I was able to pull this...

'/html/body/div[3]/div/div/div[1]/div/p[2]/text()'

... as the XPath. However, every time I print the result, I get an empty list ([]). Why might this be?

import urllib.request
from lxml import etree
response = urllib.request.urlopen('http://www.fundsnetservices.com/searchresult/30/International-Grants-&-Funders/18.html')
tree = etree.HTML(response.read().decode('utf-16'))
text = tree.xpath('/html/body/div[3]/div/div/div[1]/div/p[2]/text()')
print(text)

Upvotes: 0

Views: 135

Answers (1)

E.Wiest

Reputation: 5905

It seems your code returns whitespace-only text nodes. Use this XPath instead:

//p[@class="tdclass"]/text()[3]
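To see what the positional predicate is doing, here is a minimal sketch using lxml on a made-up snippet. The markup in SAMPLE_HTML is an assumption for illustration only and is not copied from the real page; only the tdclass class name comes from the XPath above. The idea is that text() returns every text-node child of the p element, and [3] keeps just the third one. On a real page, whitespace-only nodes can shift that index, so printing the nodes with repr() is a quick way to check which position you actually need.

```python
from lxml import etree

# Hypothetical markup (an assumption, not the real page): a <p class="tdclass">
# whose text is split into three text nodes by <br> elements.
SAMPLE_HTML = (
    '<html><body><p class="tdclass">Program name<br>'
    'Deadline: rolling<br>'
    'Description paragraph for the program.</p></body></html>'
)

tree = etree.HTML(SAMPLE_HTML)

# Every text-node child of the matching <p>, in document order.
all_text = tree.xpath('//p[@class="tdclass"]/text()')

# Only the third text-node child, as in the XPath suggested above.
third = tree.xpath('//p[@class="tdclass"]/text()[3]')

# Alternative: keep only text nodes that contain non-whitespace characters,
# then strip the surrounding whitespace in Python.
non_blank = [
    t.strip()
    for t in tree.xpath('//p[@class="tdclass"]/text()[normalize-space()]')
]

# repr() makes whitespace-only nodes such as '\n    ' visible when debugging.
for index, node in enumerate(all_text, start=1):
    print(index, repr(node))
print(third)
```

If the index turns out to differ from page to page, filtering with normalize-space() and then choosing among the remaining nodes may be more robust than hard-coding [3].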

Upvotes: 1
