Scrapy: How to get a correct selector

Question

I would like to select all of the following text, including the bold and italic parts:

Bold normal Italist

The html is:

<a><b>Bold</b> normal <i>Italist</i></a>

However, a/text() yields

normal

only. Does anyone know a fix? I'm testing a Bing crawler, and the bold text is in a different position depending on the query.

Frank Martin · Accepted Answer

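a/text() selects only the text nodes that are direct children of the a element, so text inside nested tags is skipped. Use a//text() instead to get every text node under a, at any depth.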

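from scrapy.selector import Selector

# Sample markup; the <b> and <i> tags are assumed from the bold/italic text in the question.
doc = """
<a><b>Bold</b> normal <i>Italist</i></a>
"""

sel = Selector(text=doc, type="html")

# a/text() matches only text nodes that are direct children of <a>
result = sel.xpath('//a/text()').extract()
print(result)
# >>> [' normal ']

# a//text() matches text nodes at any depth below <a>
result = ''.join(sel.xpath('//a//text()').extract())
print(result)
# >>> Bold normal Italist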
