Zzz

Reputation: 439

Python Selenium unable to find table element by xpath

Here is what the table looks like on the web page (it's just one column):

Here is the HTML of the table I am trying to scrape:

If it matters, that table is nested within another table.

Here is my code:

    def filter_changed_records():
        # Scrape webpage for addresses from table of changed properties
        row_number = 0
        results_frame = locate_element(
            '//*[@id="oGridFrame"]'
        )
        driver.switch_to.frame(results_frame)
        while True:
            try:
                address = locate_element("id('row" + str(row_number) +
                                         "FC')/x:td")
                print(address)
                changed_addresses.append(address)
                row_number += 1
            except TimeoutException:  # raised when WebDriverWait gives up
                print("No more addresses to add.")
                break

As you can see, there is a <tr> tag with an id of row0FC. This table is dynamically generated, and each new <tr> gets an id with an increasing number: row0FC, row1FC, row2FC, etc. That is how I planned on iterating through all the entries and adding them to a list.

My locate_element function is the following:

    def locate_element(path):
        element = WebDriverWait(driver, 50).until(
            EC.presence_of_element_located((By.XPATH, path)))
        return element

It always times out after 50 seconds without finding the element. I'm unsure how to proceed. Is there a better way to locate the element?

SOLUTION BY ANDERSSON

    address = locate_element("//tr[@id='row%sFC']/td" % row_number).text
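The fix boils down to dropping the unsupported x: namespace prefix and matching the row id as an attribute instead of using the id() function. A minimal sketch of how the row numbers map onto the corrected XPath (the row0FC/row1FC id pattern is taken from the question):

```python
def row_xpath(row_number):
    # Build the corrected XPath for one row: match the <tr> by its
    # generated id (row0FC, row1FC, ...) and take its <td> child.
    return "//tr[@id='row%sFC']/td" % row_number

print(row_xpath(0))  # → //tr[@id='row0FC']/td
```

Each string this returns can be passed straight to locate_element in the loop from the question.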

Upvotes: 3

Views: 1752

Answers (2)

Andersson

Reputation: 52665

Your XPath seems to be incorrect.

Try the XPath below instead:

    address = locate_element("//tr[@id='row%sFC']/td" % row_number)

Also note that address is a WebElement. If you want its text content, you should use:

    address = locate_element("//tr[@id='row%sFC']/td" % row_number).text

Upvotes: 3

jlaur

Reputation: 740

Parsing HTML with Selenium is slow. I would use BeautifulSoup for that.

Assuming you have already loaded the page in driver, it would be something like:

    from bs4 import BeautifulSoup
    ....

    # Parse the page source Selenium already holds; no further browser round-trips
    soup = BeautifulSoup(driver.page_source, "html.parser")
    td_list = soup.find_all('td')
    for td in td_list:
        try:
            addr = td['title']
            print(addr)
        except KeyError:
            # Skip cells without a title attribute
            pass
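A self-contained sketch of the same idea, run against a stand-in for driver.page_source built from the row id pattern described in the question (the title attribute on each td is an assumption carried over from the answer's code):

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source: two generated rows in the
# row0FC/row1FC id pattern described in the question.
html = """
<table>
  <tr id="row0FC"><td title="12 Main St">12 Main St</td></tr>
  <tr id="row1FC"><td title="34 Oak Ave">34 Oak Ave</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
# Collect only the cells that actually carry a title attribute
addresses = [td['title'] for td in soup.find_all('td') if td.has_attr('title')]
print(addresses)  # → ['12 Main St', '34 Oak Ave']
```

The has_attr check replaces the try/except above with an explicit filter; both approaches skip cells that lack the attribute.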

Upvotes: -1
