Reputation: 16066
I'm using Scrapy and I'd like to extract the text from a list with the following HTML structure:
u'<div id="someId">'
u'<p><strong>Text1:</strong> next to text 1</p>'
u'<p><strong>Text2:</strong> next to text 2</p>'
u'<p><strong>Text3:</strong> next to text 3</p>'
u'</div>'
So I'd like to get just the text:
Text1: next to text 1
Text2: next to text 2
Text3: next to text 3
I want to do as much of the extraction as possible with XPath. I've tried a few XPath predicates, but none of them solved the problem.
With
response.xpath('//*[@id="someId"]/p/text()').extract()
I don't get the text of the strong tag inside each p.
Any help would be much appreciated.
Upvotes: 1
Views: 1726
Reputation: 18799
You were close:
'//*[@id="someId"]/p//text()'
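This will get you a list of all the text nodes inside each p tag. A single slash (/text()) selects only the text nodes that are direct children of p, so the text inside strong is skipped; the double slash (//text()) selects text nodes at any depth below p. Note that each text node comes back as its own list item, so the strong text and the text that follows it are separate entries.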
Upvotes: 4