Fiery Phoenix
Fiery Phoenix

Reputation: 1166

Iterating Through Table Rows in Selenium (Python)

I have a webpage with a table that only appears when I click 'Inspect Element' and is not visible through the View Source page. The table contains only two rows with several cells each and looks similar to this:

<table class="datadisplaytable">
<tbody>
<tr>
<td class="dddefault">16759</td>
<td class="dddefault">MATH</td>
<td class="dddefault">123</td>
<td class="dddefault">001</td>
<td class="dddefault">Calculus</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
<tr>
<td class="dddefault">16449</td>
<td class="dddefault">PHY</td>
<td class="dddefault">456</td>
<td class="dddefault">002</td>
<td class="dddefault">Physics</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
</tbody>
</table>

What I'm trying to do is to iterate through the rows and return the text contained in each cell. I can't really seem to do it with Selenium. The elements contain no IDs and I'm not sure how else to get them. I'm not very familiar with using xpaths and such.

Here is a debugging attempt that returns a TypeError:

def check_grades(self):
    table = []
    for i in self.driver.find_element_by_class_name("dddefault"):
        table.append(i)
    print(table)

What is an easy way to get the text from the rows?

Upvotes: 19

Views: 67649

Answers (4)

NellieK
NellieK

Reputation: 139

Correction of the Selenium part of @Padraic Cunningham's answer:

table = driver.find_element_by_xpath("//table[@class='datadisplaytable']")

for row in table.find_elements_by_xpath(".//tr"):
    print([td.text for td in row.find_elements_by_xpath(".//td[@class='dddefault']")])

Note: there was one missing round bracket at the end; also removed the [1] index, to match the first XML example.

Another note: Though, the example with the index [1] should also be preserved, to show how to extract individual elements.

Upvotes: 3

Hrvoje
Hrvoje

Reputation: 15152

XPath is fragile. It's better to use CSS selectors or classes:

mytable = find_element_by_css_selector('table.datadisplaytable')
for row in mytable.find_elements_by_css_selector('tr'):
    for cell in row.find_elements_by_tag_name('td'):
        print(cell.text)

Upvotes: 27

Padraic Cunningham
Padraic Cunningham

Reputation: 180391

If you want to go row by row using an xpath, you can use the following:

h  = """<table class="datadisplaytable">
<tr>
<td class="dddefault">16759</td>
<td class="dddefault">MATH</td>
<td class="dddefault">123</td>
<td class="dddefault">001</td>
<td class="dddefault">Calculus</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
<tr>
<td class="dddefault">16449</td>
<td class="dddefault">PHY</td>
<td class="dddefault">456</td>
<td class="dddefault">002</td>
<td class="dddefault">Physics</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
</table>"""

from lxml import html
xml = html.fromstring(h)
# gets the table
table =  xml.xpath("//table[@class='datadisplaytable']")[0]


# iterate over all the rows   
for row in table.xpath(".//tr"):
     # get the text from all the td's from each row
    print([td.text for td in row.xpath(".//td[@class='dddefault'][text()])

Which outputs:

['16759', 'MATH', '123', '001', 'Calculus']
['16449', 'PHY', '456', '002', 'Physics']

Using td[text()] will avoid getting any Nones returned for the td's that hold no text.

So to do the same using selenium you would:

table =  driver.find_element_by_xpath("//table[@class='datadisplaytable']")

for row in table.find_elements_by_xpath(".//tr"):
    print([td.text for td in row.find_elements_by_xpath(".//td[@class='dddefault'][1]"])

For multiple tables:

def get_row_data(table):
   for row in table.find_elements_by_xpath(".//tr"):
        yield [td.text for td in row.find_elements_by_xpath(".//td[@class='dddefault'][text()]"])


for table in driver.find_elements_by_xpath("//table[@class='datadisplaytable']"):
    for data in get_row_data(table):
        # use the data

Upvotes: 19

user1457821
user1457821

Reputation: 33

Another Version (modified and corrected post by Padraic Cunningham): Tested with Python 3.x

#!/usr/bin/python

h  = """<table class="datadisplaytable">
<tr>
<td class="dddefault">16759</td>
<td class="dddefault">MATH</td>
<td class="dddefault">123</td>
<td class="dddefault">001</td>
<td class="dddefault">Calculus</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
<tr>
<td class="dddefault">16449</td>
<td class="dddefault">PHY</td>
<td class="dddefault">456</td>
<td class="dddefault">002</td>
<td class="dddefault">Physics</td>
<td class="dddefault"></td>
<td class="dddead"></td>
<td class="dddead"></td>
</tr>
</table>"""

from lxml import html
xml = html.fromstring(h)
# gets the table
table =  xml.xpath("//table[@class='datadisplaytable']")[0]


# iterate over all the rows   
for row in table.xpath(".//tr"):
     # get the text from all the td's from each row
    print([td.text for td in row.xpath(".//td[@class='dddefault']")])

Upvotes: 1

Related Questions