Reputation: 59
Using Selenium WebDriver and Python, I am able to locate the search field and run a search that returns results. However, I want to print the results from the first 10 rows returned (excluding the title row).
The site I am using is: http://www.hoovers.com/company-information/company-search.html?term=simon (with "simon" as an example search term).
I have been searching for a while and have tried many things, including XPaths, and most of them raise errors. This is the closest I've come so far:
for row in mydriver.find_elements_by_class_name('cmp-company-directory'):
    cell = row.find_elements_by_tag_name("td")[0]
    print(cell.text)
However it only returns the first row and will not iterate through the table. Any tips? TIA!
Upvotes: 1
Views: 1305
Reputation: 193348
To print the company names excluding the title row, wait for the elements with WebDriverWait and visibility_of_all_elements_located. You can use either of the following locator strategies:
CSS_SELECTOR:
print([company_name.get_attribute("innerHTML") for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.cmp-company-directory table td.company_name>a")))])
XPATH:
print([company_name.get_attribute("innerHTML") for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='cmp-company-directory']//table//td[@class='company_name']/a")))])
To print only the first 10 company names excluding the title row, use the same WebDriverWait with visibility_of_all_elements_located and slice the returned list with [:10] to limit it to 10 elements. Again, you can use either of the following locator strategies:
CSS_SELECTOR:
print([company_name.text for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.cmp-company-directory table td.company_name>a")))[:10]])
XPATH:
print([company_name.text for company_name in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='cmp-company-directory']//table//td[@class='company_name']/a")))[:10]])
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
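For readers unsure what the wait in the snippets above does: WebDriverWait(driver, 10).until(...) calls the given condition repeatedly until it returns a truthy value or the timeout expires. The sketch below is a simplified, pure-Python illustration of that polling idea so it can run without a browser. It is not Selenium's actual implementation, and wait_until and fake_condition are hypothetical names used only for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Simplified illustration of the polling idea behind WebDriverWait.until;
    this is NOT Selenium's code, and wait_until is a hypothetical helper.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Stand-in for a page whose table rows only appear on the third poll.
calls = {"n": 0}


def fake_condition():
    calls["n"] += 1
    return ["row 1", "row 2"] if calls["n"] >= 3 else []


rows = wait_until(fake_condition, timeout=2.0, poll=0.01)
print(rows)
```

This is why the waited version is more robust than calling find_elements immediately after the search: an empty list is falsy, so the loop keeps polling until the rows have been rendered.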
Upvotes: 0
Reputation: 33384
Try the XPath below; it walks through the table cells and prints the first 10 rows.
elements = driver.find_elements_by_xpath("//div[@class='clear data-table sortable-header dashed-table-tr alternate-rows']//tr/td")
counter = 1
for element in elements:
    print(element.text)
    counter += 1
    if counter == 50:
        break
Output:
Simon Property Group, Inc.
Indianapolis, IN, United States
$5538.64M
See Details
SIMON & SCHUSTER (UK) LIMITED
London, London, England
$60.39M
See Details
SIMON JERSEY GROUP LIMITED
Accrington, Lancashire, England
See Details
Simon Worldwide, Inc.
Irvine, CA, United States
$0.0M
See Details
Simon Property Group, L.P.
Indianapolis, IN, United States
$5538.64M
See Details
Günter Simon e.K. Inh. Carmen Simon
Ravensburg, Baden-Württemberg, Germany
See Details
Simon e Simon Servicos Odontologicos Ltda
Vere, Parana, Brazil
See Details
Simon Comercial e Industrial Ltda Em Recuperacao Judicial
Aparecida De Goiania, Goias, Brazil
See Details
Simon Levelt B.V.
Haarlem, Noord-Holland, The Netherlands
See Details
SIMON SAU
Barcelona, Barcelona, Spain
$115.95M
See Details
If you want to print only the company names from the first 10 rows, try this:
elements = driver.find_elements_by_xpath("//div[@class='clear data-table sortable-header dashed-table-tr alternate-rows']//tr/td[@class='company_name']")
counter = 0
for element in elements:
    print(element.text)
    counter += 1
    if counter == 10:
        break
Output:
Simon Property Group, Inc.
SIMON & SCHUSTER (UK) LIMITED
SIMON JERSEY GROUP LIMITED
Simon Worldwide, Inc.
Simon Property Group, L.P.
Günter Simon e.K. Inh. Carmen Simon
Simon e Simon Servicos Odontologicos Ltda
Simon Comercial e Industrial Ltda Em Recuperacao Judicial
Simon Levelt B.V.
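The counter-and-break loops above can also be written with a slice. This is a minimal sketch, assuming elements is the list returned by the find_elements call. first_texts is a hypothetical helper, and the SimpleNamespace objects only stand in for WebElements so the snippet runs without a browser. Unlike the loops above, it drops cells with empty text before applying the limit, which is an assumption about what is wanted. Note also that in Selenium 4.3 and later the find_elements_by_xpath method has been removed; the equivalent call is driver.find_elements(By.XPATH, ...).

```python
from types import SimpleNamespace


def first_texts(elements, limit=10):
    """Return the .text of the first `limit` elements, skipping empty cells."""
    texts = [el.text.strip() for el in elements]
    return [t for t in texts if t][:limit]


# Stand-ins for WebElements; with a real driver you would pass the list
# returned by find_elements instead.
cells = [
    SimpleNamespace(text=t)
    for t in [
        "Simon Property Group, Inc.",
        "",
        "SIMON JERSEY GROUP LIMITED",
        "Simon Worldwide, Inc.",
    ]
]

for name in first_texts(cells, limit=2):
    print(name)
```

Slicing a list shorter than the limit is safe, so no extra length check is needed when the search returns fewer than 10 rows.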
Let me know if this works for you.
Upvotes: 1