Veign

Reputation: 606

Scrapy Xpath to get text based on a tag with text in the container

I have this HTML:

<div class="col-md-12">
    <strong>Ingredients:</strong> TOMATOES (TOMATOES AND FIRE ROASTED TOMATOES, TOMATO JUICE, CITRIC ACID, CALCIUM CHLORIDE), WHITE WINE VINEGAR, CARROTS. <span style="font-style:italic">Date Available</span>: 07/14/2017&nbsp;&nbsp; <span style="font-style:italic">Date Last Updated by Company</span>: 07/14/2017
</div>

I want to extract the list of ingredients using Scrapy with XPath. The only identifying construct is that the div contains

<strong>Ingredients:</strong>

but I can't figure out how to extract the ingredients based on those rules.

Upvotes: 1

Views: 941

Answers (2)

har07

Reputation: 89285

The text you're looking for is the first text node that follows the strong element as a sibling, which translates to the following XPath expression:

query = "//div/strong[.='Ingredients:']/following-sibling::text()[1]"

Without the predicate [1], the query would also return the later sibling text nodes, i.e. the ': 07/14/2017' fragments that follow the 'Date Available' and 'Date Last Updated by Company' spans.

Upvotes: 2

Samsul Islam

Reputation: 2609

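Try this. It returns a list of every text node that follows the strong element, so the ingredients are the first item in the list: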

response.xpath('//strong[.="Ingredients:"]/following-sibling::text()').extract()

Upvotes: 0
