Bilal Halayqa

Reputation: 972

Scrapy CSS Selector ignore tags and get text only

I have the following HTML:

<li class="last">
    <span>SKU:</span> 483151
</li>

I was able to select the text using:

SKU_SELECTOR = '.aaa .bbb .last ::text'
sku = response.css(SKU_SELECTOR).extract_first().strip()

How can I get only the number and ignore the span?

Upvotes: 1

Views: 1606

Answers (1)

Granitosaurus

Reputation: 21406

Your CSS selector has an unnecessary space before ::text.

SKU_SELECTOR = '.aaa .bbb .last ::text'
                               ^

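A space is the descendant combinator, so '.last ::text' matches the text nodes of .last and of every element beneath it (descendant-or-self), including the span. You want only the text directly under .last itself, so drop the space.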

I got it working:

>[0]: s = Selector(text='...')
>[1]: s.css('.last::text').extract()
<[1]: [u'\n    ', u' 483151\n']

Upvotes: 2
