Ken Pen

Reputation: 15

Python - Extracting Text from a <td class = "text">Need This Text</td>

I am new to Selenium and Python. My overall goal is to extract the revenue value for a company from the Hoovers website.

Current code:

company = 'Trelleborg'
page = 'https://hoovers.com/company-information/cs.html?term=' + company
driver.get(page)

r = driver.find_element_by_xpath('//td/font[@class="company_sales"]').text
print(r)

HTML for the Desired Revenue

<td class="company_name">
  <a href="/company-information/cs/company-profile.trelleborg_ab.a545a8005aced58d.html">
  Trelleborg AB</a>
</td>
<td class="company_location">Trelleborg, Skåne, Sweden</td>
<td class="company_sales">$3842.84M</td>

I would like to extract the $3842.84M text into a variable. I have tried many solutions that I found online, but I keep getting a NoSuchElementException. Any help would be appreciated!

Upvotes: 0

Views: 806

Answers (3)

undetected Selenium

Reputation: 193308

To extract and print the text $3842.84M, use WebDriverWait to wait until the target element is visible (visibility_of_element_located), then read its content. You can use the following solution:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    
    company = 'Trelleborg'
    driver = webdriver.Firefox(executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
    page = 'https://hoovers.com/company-information/cs.html?term=' + company
    driver.get(page)
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='cmp-company-directory']//tbody//td/a[contains(., '"+company +"')]//following::td[2]"))).get_attribute("innerHTML"))
    
  • Console Output:

    $3842.84M
    

Upvotes: 0

KunduK

Reputation: 33384

In this case you can find the element by class name, CSS selector, or XPath.

If you want to use XPath:

driver.find_element_by_xpath('//td[@class="company_sales"]').text

OR if you want to use a CSS selector:

driver.find_element_by_css_selector("td.company_sales").text

OR

driver.find_element_by_css_selector(".company_sales").text

OR if you want to use the class name:

driver.find_element_by_class_name("company_sales").text

Good Luck!

Upvotes: 1

Amit Darji

Reputation: 457

It looks like an issue with your XPath. The general XPath format is:

Xpath=//tagname[@attribute='value']
  • // : Selects matching nodes anywhere in the document, at any depth.
  • Tagname: Tag name of the node to match.
  • @: Selects an attribute.
  • Attribute: Attribute name of the node.
  • Value: Value of the attribute.

So the resulting XPath in your case should look like this:

//td[@class="company_sales"]
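
To see why the original expression fails, here is a minimal sketch that runs both XPaths against the row from the question without a browser. It uses Python's standard xml.etree.ElementTree (which supports only a subset of XPath) instead of Selenium. The <table><tr> wrapper and the shortened href="#" are assumptions added so that the fragment parses as a single tree.

```python
import xml.etree.ElementTree as ET

# Row from the question, wrapped in <table><tr> (an assumption) so it parses
# as one XML tree; the link's href is shortened to "#" for brevity.
ROW = """
<table><tr>
  <td class="company_name"><a href="#">Trelleborg AB</a></td>
  <td class="company_location">Trelleborg, Skåne, Sweden</td>
  <td class="company_sales">$3842.84M</td>
</tr></table>
""".strip()

root = ET.fromstring(ROW)

# Expression from the question: it asks for a <font> child of a <td>,
# but the markup contains no <font> element, so nothing matches.
wrong = root.findall(".//td/font[@class='company_sales']")

# Corrected expression: the class attribute is on the <td> itself.
cell = root.find(".//td[@class='company_sales']")
revenue = cell.text

print(len(wrong))  # 0
print(revenue)     # $3842.84M
```

In Selenium the same mismatch shows up as a NoSuchElementException, because no element in the page satisfies td/font.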

Upvotes: 0
