XPath: Remove space function not working

Question

I am using Scrapy, XPath, and Python to scrape a website. The results I get contain extra whitespace. A Google search suggested that I need to use normalize-space() in my XPath. When I do, as shown below, it does not work.

item ['runs'] = stats.select((normalize-space('//tr[@class="cell1"]/td[3]/text()')[count])).extract()

I get a "global name 'normalize' is not defined" error.

Any ideas?

zhangyangyu · Accepted Answer

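normalize-space is part of XPath, not Python, so there is no such function in Python or in other libraries. Outside the quoted XPath string, Python parses normalize-space(...) as normalize minus space(...), which is why it complains that normalize is not defined. The function has to appear inside the XPath expression itself, like this (just a sample):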

stats.select('''//tr[normalize-space(td/text()) = 'User Name']''').extract()

To simply remove whitespace from a string in Python, you can use str methods. For example, strip removes leading and trailing whitespace.

>>> '\n\nsample\n'.strip()
'sample'

Something like normalize-space:

>>> ' '.join('\ns  am  \n ple\n'.split())
's am ple'
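
Applied to scraped values, the same idea might look like the sketch below. The helper name normalize_space and the sample list raw_runs are made up for illustration; raw_runs only stands in for whatever .extract() returns for the td[3] cells.

```python
def normalize_space(text):
    """Roughly mimic XPath normalize-space(): trim both ends and
    collapse each run of inner whitespace into a single space."""
    return " ".join(text.split())


# Hypothetical sample of extracted cell text (an assumption, not real site data).
raw_runs = ["\r\n\t 12 \r\n", "  7", "\n1  034\t"]

# Clean every extracted string before storing it on the item.
runs = [normalize_space(value) for value in raw_runs]
print(runs)  # ['12', '7', '1 034']
```

Note that str.split() treats a somewhat broader set of characters as whitespace than XPath's normalize-space does, so the two are close but not identical.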
