Reputation: 8490
I am looking to extract a single value as text from the following webpage.
Cascade River Rustic Campground
Specifically, I'm after the "4" value after the "No. of Sites" text (see screenshot)
I've been able to isolate the xpath using Chrome, which is as follows:
//*[@id="act_1"]/div[1]/table/tbody/tr/td[2]
The following code yields an empty list:
import urllib2
from lxml import etree
url = "https://www.fs.usda.gov/recarea/superior/recreation/camping-cabins/recarea/?recid=36913&actid=29"
response = urllib2.urlopen(url)
htmlparser = etree.HTMLParser()
tree = etree.parse(response, htmlparser)
x = tree.xpath('//*[@id="act_1"]/div[1]/table/tbody/tr/td[2]')
print x
The expected output should be:
>>> print x
['4']
How can I extract the text of a single element (i.e. "4") from a web page?
Upvotes: 1
Views: 299
Reputation: 215117
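This XPath works for me. Note that there is no tbody: Chrome adds a tbody element to the DOM when it parses a table, so the path it copies includes one, but the raw HTML that lxml parses most likely doesn't contain it, and the original path matches nothing. Also use text()
to extract the text from a node: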
x = tree.xpath('//*[@id="act_1"]/div[1]/table/tr/td[2]/text()')
print(x[0].strip())
# 4
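To see the tbody difference without fetching the page, here is a minimal sketch that runs on a small inline HTML string. The markup in SAMPLE_HTML is a hypothetical stand-in, not the real page's source; it only assumes the table is written without a tbody. The last two queries are alternatives that don't depend on whether a tbody is present.

```python
from lxml import etree

# Hypothetical markup standing in for the real page (an assumption):
# the table is written without a <tbody>, as in much hand-written HTML.
SAMPLE_HTML = """
<html><body>
  <div id="act_1">
    <div>
      <table>
        <tr><td>No. of Sites</td><td> 4 </td></tr>
      </table>
    </div>
  </div>
</body></html>
"""

# Parse with lxml's HTML parser, the same parser used in the question.
tree = etree.fromstring(SAMPLE_HTML, etree.HTMLParser())

# Path copied from the browser: includes a tbody that is not in the source.
with_tbody = tree.xpath('//*[@id="act_1"]/div[1]/table/tbody/tr/td[2]/text()')

# Same path without tbody: matches the cell.
without_tbody = tree.xpath('//*[@id="act_1"]/div[1]/table/tr/td[2]/text()')

# Using // before tr matches whether or not a tbody exists.
any_depth = tree.xpath('//*[@id="act_1"]/div[1]/table//tr/td[2]/text()')

# Locating the cell by its label avoids positional paths altogether.
by_label = tree.xpath(
    '//td[normalize-space()="No. of Sites"]/following-sibling::td[1]/text()'
)

sites = without_tbody[0].strip()
print(with_tbody)  # empty list: nothing matches the tbody step
print(sites)
```

If the page layout changes often, the label-based query is probably the least fragile of the three, since it depends only on the "No. of Sites" text and not on the position of the table.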
Upvotes: 2