Reputation:
Im struggling with scraping a few pages ... it happens when the structure of the page implies a lot of nested divs... Here is the code page:
<div>
<section class="ui-accordion-header ui-state-default ui-corner-all ui-accordion-icons" role="tab" id="ui-id-1" aria-controls="ui-id-2" aria-selected="false" aria-expanded="false" tabindex="0"><span class="ui-accordion-header-icon ui-icon ui-icon-triangle-1-e"></span>
<div class="detail-avocat">
<div class="nom-avocat">Me <span class="avocat_name">NAME </span></div>
<div class="type-avocat">Avocat postulant au Tribunal Judiciaire</div>
</div>
<div class="more-info">Plus d'informations</div>
</section>
<div class="ui-accordion-content ui-helper-reset ui-widget-content ui-corner-bottom" style="display: none;" id="ui-id-2" aria-labelledby="ui-id-1" role="tabpanel" aria-hidden="true">
<div class="details">
<div class="detail-avocat-row ">
<div class="detail-avocat-content overflow-h">
<span>Structure :</span>
<div>
<p>Cabinet individuel NAME</p>
</div>
</div>
</div>
<div class="detail-avocat-row ">
<div class="detail-avocat-content overflow-h">
<span>Adresse :</span>
<div>
<p>21 rue Belle Isle 57000 VILLE</p>
</div>
</div>
</div>
<div class="detail-avocat-row ">
<div class="detail-avocat-content overflow-h">
<span>Mail :</span>
<div>
<p>[email protected]</p>
</div>
</div>
</div>
<div class="detail-avocat-row">
<div class="detail-avocat-content overflow-h">
<span>Tél :</span>
<div>
<p>Telnum</p>
</div>
</div>
</div>
<div class="detail-avocat-row">
<div class="detail-avocat-content overflow-h">
<span>Fax :</span>
<div>
<p> </p>
</div>
</div>
</div>
<div class="contact-avocat"> <a href="mailto:[email protected]">Contacter</a> </div>
</div>
</div>
</div>
And here is my python code:
divtel = self.driver.find_elements(by=By.XPATH,
value=f'//div[@class="detail-avocat-content overflow-h"]/div/p')#div[@class="detail-avocat-content overflow-h"]')
for p in divtel:
print(p.text)
It doesnt print anything...with other similar pages it prints the text but in this case it doesnt altough there is text in the nested span and div/p . Do you know why?
How can i resolve my problem please? thank you
Upvotes: 1
Views: 79
Reputation: 3547
The method .text
works only when the webelement containing the text is visible in the webpage. If otherwise the webelement is hidden, you have to use .get_attribute('innerText')
or .get_attribute('textContent')
or .get_attribute('innerHTML')
(see here for difference between them). So for example change
print(p.text)
to
print(p.get_attribute('innerText'))
Upvotes: 0