mrbot

Reputation: 63

Selenium scraping returns empty strings after the first few elements

I am scraping a website using Selenium in Python. The XPath finds all 20 elements that contain the search results. However, text is available only for the first 6 elements; the rest return empty strings. This happens on every page of the results.

The xpath used:

results = driver.find_elements_by_xpath("//li[contains(@class, 'search-result search-result__occluded-item ember-view')]")

The same XPath matches 20 elements in Chrome.


Text inside the results

[tt.text for tt in results]

anonymized output:

['Abcddwedwada',
 'Asefdasdfaca',
 'Asdaafcascac',
 'Asdadaacjkhi',
 'Sfskjfbsfvbkd',
 'Fjsbfksjnsvas',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '']

I have also tried extracting the ids of the 20 elements and looking each one up with driver.find_element_by_id, but I still get empty strings after the first 6 elements.

Upvotes: 3

Views: 724

Answers (2)

Andersson

Reputation: 52685

I assume the reason is the following: when you open the page there are 20 entries (<li> elements in a <ul>), but only the content of the first 6 is rendered. The content of the other 14 entries is generated dynamically from XHR requests and may only appear after you scroll down.

So you might need to scroll down to the last element in the list:

from selenium.webdriver.support.ui import WebDriverWait as wait

xpath = "//li[contains(@class, 'search-result search-result__occluded-item ember-view')]"
results = driver.find_elements_by_xpath(xpath)
# Scroll the last entry into view first so the page requests the remaining content
results[-1].location_once_scrolled_into_view
# Then wait (up to 10 s) until all 20 entries report non-empty text
wait(driver, 10).until(lambda d: sum(1 for r in d.find_elements_by_xpath(xpath) if r.text) == 20)
results = driver.find_elements_by_xpath(xpath)
[tt.text for tt in results]
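If scrolling only to the last entry leaves some of the middle entries empty, a variation is to scroll every entry into view in turn and poll its text. The following is a minimal sketch, not a tested fix for this particular site: the helper name collect_texts and its timeout/poll parameters are made up for illustration, and it assumes each entry fills in shortly after it becomes visible. It uses the generic find_elements(by, value) call with the plain string "xpath" (the value of By.XPATH), since the find_elements_by_xpath helpers were removed in Selenium 4.3.

```python
import time

XPATH = "//li[contains(@class, 'search-result search-result__occluded-item ember-view')]"


def collect_texts(driver, xpath=XPATH, timeout=10.0, poll=0.25):
    """Scroll each matched element into view and return its text.

    Hypothetical helper: waits up to `timeout` seconds per element for
    non-empty text, and records '' if nothing shows up in that time.
    """
    texts = []
    # "xpath" is the string value of selenium's By.XPATH locator strategy
    for element in driver.find_elements("xpath", xpath):
        # Bring the element into the viewport so lazy content gets requested
        driver.execute_script(
            "arguments[0].scrollIntoView({block: 'center'});", element
        )
        deadline = time.monotonic() + timeout
        text = element.text
        # Poll until the element reports text or the per-element timeout expires
        while not text and time.monotonic() < deadline:
            time.sleep(poll)
            text = element.text
        texts.append(text)
    return texts
```

With a live session this would be called as collect_texts(driver). Any entry that is still empty after the timeout stays '' in the returned list, so its position is preserved.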

Try it and let me know the result.

Upvotes: 1

Chanda Korat

Reputation: 2561

Try this:

[str(tt.text) for tt in results if str(tt.text) !='']

OR

[tt.text for tt in results if len(tt.text) > 0]
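Note that filtering like this only drops the empty entries; it does not load them. As a small sketch (the helper name split_loaded is hypothetical), you could keep the non-empty texts and also record which positions came back empty, so those elements can be scrolled into view and read again:

```python
def split_loaded(texts):
    """Return (non-empty texts, indices of the entries that were empty)."""
    loaded = [t for t in texts if t]
    missing = [i for i, t in enumerate(texts) if not t]
    return loaded, missing


texts = ['Abcddwedwada', 'Asefdasdfaca', '', '']
loaded, missing = split_loaded(texts)
print(loaded)   # ['Abcddwedwada', 'Asefdasdfaca']
print(missing)  # [2, 3]
```

Here texts stands in for [tt.text for tt in results].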

Upvotes: 1
