Reputation: 23
(disclaimer: I only vaguely know python & am pretty new to coding)
I'm trying to get the text part of a link, but it doesn't have a specific class, and depending on how I word my code I get either way too many things (the xpath wasn't specific enough) or a blank [ ].
A screenshot of what I'm trying to access is: (screenshot not available)
tree holds all the HTML from the page. The code that returns a blank list is:
cardInfo=tree.xpath('div[@class="cardDetails"]/table/tbody/tr/td[2]/a/text()')
The code that returns way too much:
cardInfo=tree.xpath("a[contains(@href, 'domain_name')]/text()")
I tried going into Inspect in chrome and copying the xpath, which also gave me nothing. I've successfully gotten other things out of the page that are just plain text, not links. Super sorry if I didn't explain this well but does anyone have an idea of what I can write?
Upvotes: 2
Views: 908
Reputation: 369064
If you meant to find the text next to "Set Name:":
>>> import lxml.html
>>> tree = lxml.html.parse('http://shop.tcgplayer.com/pokemon/jungle/nidoqueen-7')
>>> tree.xpath(".//b[text()='Set Name:']/parent::td/following-sibling::td/a/text()")
['Jungle']
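Since the live page may change or be unreachable, here is a minimal offline sketch of the same query. The SAMPLE markup is hypothetical, loosely modeled on the table described in the question, so treat it as an assumption rather than the real page. It also hints at why the first attempt in the question may have come back empty: a path without a leading // only looks at direct children of the context node, and the tbody shown in Chrome's Inspect view is often added by the browser and absent from the HTML that lxml parses.

```python
import lxml.html

# Hypothetical markup (an assumption, not the real page source).
SAMPLE = """
<html><body>
<div class="cardDetails">
  <table>
    <tr><td><b>Set Name:</b></td><td><a href="/pokemon/jungle">Jungle</a></td></tr>
    <tr><td><b>Rarity:</b></td><td>Rare</td></tr>
  </table>
</div>
</body></html>
"""

tree = lxml.html.fromstring(SAMPLE)

# Anchor on the label text, then step to the neighbouring cell's link.
set_name = tree.xpath(
    ".//b[text()='Set Name:']/parent::td/following-sibling::td/a/text()"
)
print(set_name)

# The question's first attempt: no leading //, and a tbody step that
# does not exist in this parsed tree, so nothing matches.
relative = tree.xpath('div[@class="cardDetails"]/table/tbody/tr/td[2]/a/text()')
print(relative)

# Same idea with a leading // and without the tbody step.
absolute = tree.xpath('//div[@class="cardDetails"]/table/tr/td[2]/a/text()')
print(absolute)
```

Anchoring on the visible label tends to be less fragile than a positional path like td[2], which silently picks up a different cell if the table layout shifts.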
.//b[text()='Set Name:'] - finds the b tag with the text "Set Name:"
parent::td - the parent td element of that b tag
following-sibling::td - the td element that follows it
Upvotes: 2