Reputation: 55
Does anyone know how to grab the text from only the first following td, and not the ones after it? I would also like to get 0 when it doesn't have a value:
<tr>
<td style="width:28%;">
2 plantas··
</td>
<td style="width:28%;">
300m² terreno
</td>
</tr>
In the markup above, my code (below) also grabs the later td elements, which are blank, but I only want the one that says "300m² terreno":
terreno=tree.xpath('//td[contains(text(),"planta")]/following-sibling::td/text()')
terreno2=[item.strip() for item in terreno]
terreno3=[]
for casa in terreno2:
    if len(casa)<1: continue
    terreno3.append(float(casa.split('m²')[0]))
And this is the output I'm getting:
['300m² terreno', '', '', '', '', '', '315m² terreno', '', '', '', '', ''....]
Here is the link to my source: https://www.avisosdeocasion.com/Resultados-Inmuebles.aspx?n=venta-casas-nuevo-leon&PlazaBusqueda=2&Plaza=2
Upvotes: 0
Views: 486
Reputation: 185053
Use this XPath:
//td[contains(text(),"planta")]/following-sibling::td[1]/text()
# the [1] predicate limits the match to the first following 'td'
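For the "make it 0 if it doesn't have a value" part, here is a minimal sketch of one way to combine the `[1]` predicate with a fallback. It assumes lxml (which the `tree.xpath` call in the question suggests). The `SAMPLE` markup, the `parse_terreno` helper, and the regex are made up for illustration; the real page may be structured differently.

```python
import re
from lxml import html

# Hypothetical markup modelled on the snippet in the question. The second
# row has a blank sibling cell so the 0 fallback is exercised.
SAMPLE = """
<table>
  <tr>
    <td style="width:28%;">2 plantas</td>
    <td style="width:28%;">300m² terreno</td>
    <td style="width:28%;"> </td>
  </tr>
  <tr>
    <td style="width:28%;">1 planta</td>
    <td style="width:28%;"> </td>
  </tr>
</table>
"""

def parse_terreno(fragment):
    """Return one area per 'planta' cell, using 0.0 when no number is found."""
    tree = html.fromstring(fragment)
    areas = []
    for cell in tree.xpath('//td[contains(text(),"planta")]'):
        # [1] keeps only the td immediately after the 'planta' cell
        texts = cell.xpath('following-sibling::td[1]/text()')
        raw = texts[0].strip() if texts else ''
        # Take the leading number (e.g. "300" from "300m² terreno")
        match = re.match(r'\d+(?:\.\d+)?', raw)
        areas.append(float(match.group()) if match else 0.0)
    return areas

areas = parse_terreno(SAMPLE)
print(areas)
```

Looping over the "planta" cells one at a time, instead of running a single XPath over the whole page, keeps the result list aligned with the listings, so a listing with no area still gets an entry (0.0) rather than being skipped.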
Upvotes: 1