Borealis

Reputation: 8490

How to extract a single element from a webpage?

I am looking to extract a single value as text from the following webpage.

Cascade River Rustic Campground

Specifically, I'm after the "4" value that follows the "No. of Sites" text.

I've been able to isolate the xpath using Chrome, which is as follows:

//*[@id="act_1"]/div[1]/table/tbody/tr/td[2]

The following code yields an empty list:

import urllib2
from lxml import etree

url = "https://www.fs.usda.gov/recarea/superior/recreation/camping-cabins/recarea/?recid=36913&actid=29"

response = urllib2.urlopen(url)
htmlparser = etree.HTMLParser()
tree = etree.parse(response, htmlparser)
x = tree.xpath('//*[@id="act_1"]/div[1]/table/tbody/tr/td[2]')
print x 

The expected output should be:

>>> print x
['4']

How can I extract a single element (i.e. "4") from a web page?

Upvotes: 1

Views: 299

Answers (1)

akuiper

Reputation: 215117

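This XPath works for me. Note that there is no tbody in it: browsers such as Chrome insert a tbody element into the DOM when they build a table, so the path copied from the inspector most likely does not match the raw HTML that lxml parses. Append text() to extract the text from the node: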

x = tree.xpath('//*[@id="act_1"]/div[1]/table/tr/td[2]/text()')

print(x[0].strip())
# 4
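
To see the tbody effect without hitting the network, here is a minimal sketch. The inline HTML is a hypothetical stand-in for the relevant part of the page (I'm assuming the raw source has no tbody, which is what the empty result suggests); only the XPath expressions come from the question and this answer.

```python
from lxml import etree

# Hypothetical stand-in for the page's raw HTML: the table has no <tbody>.
html = """
<html><body>
<div id="act_1">
  <div>
    <table>
      <tr><td>No. of Sites</td><td> 4 </td></tr>
    </table>
  </div>
</div>
</body></html>
"""

tree = etree.fromstring(html, etree.HTMLParser())

# XPath copied from Chrome: includes the tbody that the browser added.
with_tbody = tree.xpath('//*[@id="act_1"]/div[1]/table/tbody/tr/td[2]/text()')

# Same path without tbody: matches the markup as lxml parsed it.
without_tbody = tree.xpath('//*[@id="act_1"]/div[1]/table/tr/td[2]/text()')

# Using // before tr matches whether or not a tbody is present.
tolerant = tree.xpath('//*[@id="act_1"]/div[1]/table//tr/td[2]/text()')

print(with_tbody)
print([t.strip() for t in without_tbody])
print([t.strip() for t in tolerant])
```

The first query comes back empty while the other two find the cell. If you are unsure whether a page's source contains tbody, the `table//tr` form is the more forgiving choice.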

Upvotes: 2
