yasin kutuk

Reputation: 75

Select text between tags with XPath in Python

<td width="250">
10.03.1984 16:30
<br/>
Lütfi Kırdar, İstanbul
<br/>
<br/>
47-38, 49-58, 8-10
</td>

I want to get all the text between the "td" tags. My code is mactarih=tree.xpath("//tr//td[@width='250']//text()"). But it is wrong.

The expected result is: text=['10.03.1984 16:30','Lütfi Kırdar, İstanbul','47-38, 49-58, 8-10']

Upvotes: 2

Views: 53

Answers (1)

har07

Reputation: 89285

"My code is mactarih=tree.xpath("//tr//td[@width='250']//text()"). But it is wrong".

If it is 'wrong' in the sense that it returns empty or newline-only text nodes along with the correct ones, you can add a normalize-space() predicate to filter out the whitespace-only text nodes:

mactarih=tree.xpath("//tr//td[@width='250']//text()[normalize-space()]")

Quick test:

>>> from lxml import etree
>>> raw = '''<td width="250">
... 10.03.1984 16:30
... <br/>
... Lütfi Kırdar, İstanbul
... <br/>
... <br/>
... 47-38, 49-58, 8-10
... </td>'''
>>> root = etree.fromstring(raw)
>>> root.xpath("//td[@width='250']//text()[normalize-space()]")
['\n10.03.1984 16:30\n', u'\nL\xfctfi K\u0131rdar, \u0130stanbul\n', '\n47-38, 49-58, 8-10\n']
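Note that the predicate only drops text nodes that are entirely whitespace; the remaining strings still carry their leading and trailing newlines. To get exactly the list from the question, one option is to strip each string in Python afterwards. A minimal sketch, assuming Python 3 and the same sample markup (the variable names `texts` and `cleaned` are only illustrative):

```python
from lxml import etree

raw = '''<td width="250">
10.03.1984 16:30
<br/>
Lütfi Kırdar, İstanbul
<br/>
<br/>
47-38, 49-58, 8-10
</td>'''

root = etree.fromstring(raw)

# Keep only text nodes that contain something other than whitespace.
texts = root.xpath("//td[@width='250']//text()[normalize-space()]")

# Remove the surrounding newlines that each remaining text node still has.
cleaned = [t.strip() for t in texts]

print(cleaned)
# ['10.03.1984 16:30', 'Lütfi Kırdar, İstanbul', '47-38, 49-58, 8-10']
```

The stripping is done in Python rather than in the XPath expression because XPath 1.0, which lxml implements, cannot apply normalize-space() to every node of a node-set in a single expression.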

Upvotes: 2
