Reputation: 63
I am scraping a website using Selenium in Python. The XPath finds the 20 elements that contain the search results. However, the content is available only for the first 6 elements, and the rest return empty strings. This is true for every page of the results.
The XPath used:
results = driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view')]")
The XPath finds 20 elements in Chrome.
Text inside the results:
[tt.text for tt in results]
Anonymized output:
['Abcddwedwada',
'Asefdasdfaca',
'Asdaafcascac',
'Asdadaacjkhi',
'Sfskjfbsfvbkd',
'Fjsbfksjnsvas',
'',
'',
'',
'',
'',
'',
'',
'',
'',
'',
'',
'',
'',
'']
I have tried extracting the id of each of the 20 elements and using driver.find_element_by_id, but I still get empty strings after the first 6 elements.
Upvotes: 3
Views: 724
Reputation: 52685
I assume the reason for this result is the following: when you open the page there are 20 entries (<li> elements in a <ul>), but the content of only 6 of them is displayed. The content of the other elements may be displayed only after scrolling down, because the content of those 14 entries is generated dynamically from XHR requests.
So you might need to scroll down to the last element in the list:
from selenium.webdriver.support.ui import WebDriverWait as wait
# Wait up to 10 seconds until the XPath matches 20 <li> elements
wait(driver, 10).until(lambda x: len(driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view') and not(text()='')]")) == 20)
results = driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view')]")
# Accessing this property scrolls the last element into view
results[-1].location_once_scrolled_into_view
# Read the text of every result after scrolling
[tt.text for tt in results]
Try it and let me know the results.
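If scrolling only to the last element leaves some of the middle entries empty, one variation is to scroll each element into view in turn and give its text a moment to appear. The following is only a sketch under that assumption: collect_texts, timeout and poll are hypothetical names that are not part of Selenium, and it relies only on driver.execute_script and the element's .text property.

```python
import time


def collect_texts(driver, elements, timeout=10.0, poll=0.2):
    """Scroll each element into view and return its text.

    Assumes the page fills in an entry's content once that entry is
    scrolled into view. An entry that is still empty after `timeout`
    seconds is returned as an empty string.
    """
    texts = []
    for el in elements:
        # Ask the browser to scroll this element into the viewport.
        driver.execute_script("arguments[0].scrollIntoView(true);", el)
        # Poll until the text shows up or the timeout expires.
        deadline = time.monotonic() + timeout
        while not el.text and time.monotonic() < deadline:
            time.sleep(poll)
        texts.append(el.text)
    return texts
```

With the results list from above, this could be called as collect_texts(driver, results).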
Upvotes: 1
Reputation: 2561
Try this:
[str(tt.text) for tt in results if str(tt.text) !='']
OR
[tt.text for tt in results if len(tt.text) > 0]
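As a quick illustration of what this filtering does, here is a small sketch that uses the anonymized output from the question as plain strings (the variable names texts and non_empty are made up for the example). It drops the blank entries and does not recover the missing text.

```python
# The 20 values from the question: 6 names followed by 14 empty strings.
texts = [
    'Abcddwedwada', 'Asefdasdfaca', 'Asdaafcascac',
    'Asdadaacjkhi', 'Sfskjfbsfvbkd', 'Fjsbfksjnsvas',
] + [''] * 14

# Keep only the non-empty strings; an empty string is falsy in Python.
non_empty = [t for t in texts if t]

print(len(texts), len(non_empty))  # prints: 20 6
```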
Upvotes: 1