Reputation: 606
I have this code:
<div class="col-md-12">
<strong>Ingredients:</strong> TOMATOES (TOMATOES AND FIRE ROASTED TOMATOES, TOMATO JUICE, CITRIC ACID, CALCIUM CHLORIDE), WHITE WINE VINEGAR, CARROTS. <span style="font-style:italic">Date Available</span>: 07/14/2017 <span style="font-style:italic">Date Last Updated by Company</span>: 07/14/2017
</div>
I want to extract the list of ingredients using Scrapy with XPath. The only identifying construct is a div that contains
<strong>Ingredients:</strong>
but I can't figure out how to extract the ingredients based on those rules.
Upvotes: 1
Views: 941
Reputation: 89285
The text you're looking for is the first text node that follows the strong
element as a sibling, which translates to the following XPath expression:
query = "//div/strong[.='Ingredients:']/following-sibling::text()[1]"
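Without the predicate [1], the query would also return the later sibling text nodes,
i.e. the ': 07/14/2017' fragments that follow each span. The labels 'Date Available' and 'Date Last Updated by Company' sit inside span elements, so they are not text-node siblings of strong and are not matched either way.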
Upvotes: 2
Reputation: 2609
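Try this. It returns every text node that follows the strong element, so the ingredients are the first item in the resulting list: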
response.xpath('//strong[.="Ingredients:"]/following-sibling::text()').extract()
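If you need the individual ingredients rather than one string, the extracted list can be post-processed in plain Python. The sketch below is only an illustration: the nodes list is hard-coded to the shape this query should return for the sample HTML in the question, and split_top_level is a hypothetical helper, not part of Scrapy. It splits on commas that are outside parentheses so that sub-ingredients stay attached to their parent.

```python
def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


# Assumed shape of the .extract() result for the sample HTML.
nodes = [
    " TOMATOES (TOMATOES AND FIRE ROASTED TOMATOES, TOMATO JUICE, "
    "CITRIC ACID, CALCIUM CHLORIDE), WHITE WINE VINEGAR, CARROTS. ",
    ": 07/14/2017 ",
    ": 07/14/2017\n",
]

# The first node holds the ingredients; drop padding and the final period.
raw = nodes[0].strip().rstrip(".")
ingredients = split_top_level(raw)
print(ingredients)
```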
Upvotes: 0