pedrommuller

Reputation: 16066

Xpath to select text from a child node and current node at once

I'm using Scrapy and I'd like to extract the text from a list with the following HTML structure:

u'<div id="someId">'
u'<p><strong>Text1:</strong> next to text 1</p>'
u'<p><strong>Text2:</strong> next to text 2</p>'
u'<p><strong>Text3:</strong> next to text 3</p>'
u'</div>'

So I'd like to get just the text:

Text1: next to text 1

Text2: next to text 2

Text3: next to text 3

I want to do as much of the extraction as possible with XPath. I've tried some XPath predicates, but they didn't resolve my issue.

With

response.xpath('//*[@id="someId"]/p/text()').extract()

I don't get the text of the strong tag inside each p.

Any help would be much appreciated.

Upvotes: 1

Views: 1726

Answers (1)

eLRuLL

Reputation: 18799

You were close:

'//*[@id="someId"]/p//text()'

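This will get you a list with all the text inside each p tag, including the text nested in strong. The difference is that /text() selects only the text nodes that are direct children of p, while //text() selects text nodes at any depth below it.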

Upvotes: 4
