Reputation: 97
Any idea how to extract 'TEXT TO GRAB' from this piece of markup:
<span class="navigation_page">
<span>
<a itemprop="url" href="http://www.example.com">
<span itemprop="title">LINK</span>
</a>
</span>
<span class="navigation-pipe">></span>
TEXT TO GRAB
</span>
Upvotes: 0
Views: 385
Reputation: 10666
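Not ideal, but this selects the first text node after the navigation-pipe span (the result still includes the surrounding newlines):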
text_to_grab = response.xpath('//span[@class="navigation-pipe"]/following-sibling::text()[1]').extract_first()
Upvotes: 2
Reputation: 22440
It's not an ideal solution, but it should do the trick. It takes the last direct text node of the .navigation_page span:
from scrapy import Selector
content = """
<span class="navigation_page">
<span>
<a itemprop="url" href="http://www.example.com">
<span itemprop="title">LINK</span>
</a>
</span>
<span class="navigation-pipe">></span>
TEXT TO GRAB
</span>
"""
sel = Selector(text=content)
# ::text yields only the direct text nodes of .navigation_page; the target is the last one
item = sel.css(".navigation_page::text")
print(item.extract()[-1].strip())
Or like this:
sel = Selector(text=content)
# Collapse whitespace in each direct text node; whitespace-only nodes become empty strings
item = ''.join(' '.join(text.split()) for text in sel.css("span.navigation_page::text").extract())
print(item)
Output:
TEXT TO GRAB
Upvotes: 1