Fareed Mabrouk

Reputation: 67

How to find a value in a table with no identifiers? (Python, Selenium)

I have a webpage containing a table with many rows. A user gives me a number (15308), which appears in the first <td> of a row, and this is the only information I will have. I want to use this number to find the text inside the <th></th> tag (specifically the 0), but only for that table row. For example, given the two table rows below, I want the <th> data for the row containing 15308, but not the <th> data from the row that has 15309 in its first <td>. Any help is appreciated!
Desired Output: 0

<tr>
<td><a href="http://sdb.admin.uw.edu/timeschd/UWNetID/sln.asp?QTRYR=SPR+2019&amp;SLN=15308">15308</a></td>
<td nowrap="">INFO   101  </td>
<td>A </td>
<td align="CENTER">LC</td>
<td>SOCIAL NETWORKING   </td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 250</td>
<th align="CENTER">  0</th><td align="CENTER"> 229</td>
<td></td>
</tr>
<tr><td><a href="http://sdb.admin.uw.edu/timeschd/UWNetID/sln.asp?QTRYR=SPR+2019&amp;SLN=15309">15309</a></td>
<td nowrap="">INFO   101  </td>
<td>AA</td>
<td align="CENTER">LB</td>
<td>SOCIAL NETWORKING   </td>
<td align="CENTER">  25</td>
<td align="CENTER">  25</td>
<td align="CENTER">  26</td>
<th align="CENTER" style="">  2</th><td align="CENTER">  21</td>
<td></td>
</tr>

Upvotes: 0

Views: 71

Answers (2)

DirtyBit

Reputation: 16792

Something I have always found beautiful: using BeautifulSoup.

Matching the <th> on its xpath="1" attribute:

line = '''<tr><td><a href="http://sdb.admin.uw.edu/timeschd/UWNetID/sln.asp?QTRYR=SPR+2019&amp;SLN=15308" style="">15308</a></td>
<td nowrap="">INFO   101  </td>
<td>A </td>
<td align="CENTER">LC</td>
<td>SOCIAL NETWORKING   </td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 250</td>
<th align="CENTER" style="" xpath="1">  0</th><td align="CENTER"> 229</td>
<td></td>
</tr>'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(line, 'html.parser')
xpathTh = soup.find('th',  attrs={'xpath': '1'})
print(xpathTh.text.strip())

OUTPUT:

0

EDIT:

To get the values of all <th> elements that carry that attribute:

line = '''<tr><td><a href="http://sdb.admin.uw.edu/timeschd/UWNetID/sln.asp?QTRYR=SPR+2019&amp;SLN=15308" style="">15308</a></td>
<td nowrap="">INFO   101  </td>
<td>A </td>
<td align="CENTER">LC</td>
<td>SOCIAL NETWORKING   </td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 250</td>
<th align="CENTER" style="" xpath="1">  0</th><td align="CENTER"> 229</td>
<th align="CENTER" style="" xpath="1">  1</th><td align="CENTER"> 229</td>
<th align="CENTER" style="" xpath="1">  2</th><td align="CENTER"> 229</td>
<td></td>
</tr>'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(line, 'html.parser')
xpathTh = soup.find_all('th',  attrs={'xpath': '1'})

for elem in xpathTh:
    print(elem.text.strip())

OUTPUT:

0
1
2

EDIT 2:

If you only want the <th> value from the row whose anchor tag (inside the first td of the tr) has the text 15308:

line = '''<tr><td><a href="http://sdb.admin.uw.edu/timeschd/UWNetID/sln.asp?QTRYR=SPR+2019&amp;SLN=15308" style="">15308</a></td>
<td nowrap="">INFO   101  </td>
<td>A </td>
<td align="CENTER">LC</td>
<td>SOCIAL NETWORKING   </td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 250</td>
<th align="CENTER" style="" xpath="1">  0</th><td align="CENTER"> 229</td>
<td></td>
</tr>
<tr><td><a href="http://sdb.admin.uw.edu/timeschd/UWNetID/sln.asp?QTRYR=SPR+2019&amp;SLN=2222" style="">22222</a></td>
<td nowrap="">INFO   101  </td>
<td>A </td>
<td align="CENTER">LC</td>
<td>SOCIAL NETWORKING   </td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 250</td>
<th align="CENTER" style="" xpath="1">  1</th><td align="CENTER"> 229</td>
<td></td>
</tr>'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(line, 'html.parser')

trElems = soup.find_all('tr')
toFind = '15308'

for tr in trElems:
    val = tr.select('td a')[0].text
    if toFind == val:
        xpathTh = tr.find_all('th', attrs={'xpath': '1'})
        for elem in xpathTh:
            print(elem.text.strip())

OUTPUT:

0

EDIT 3:

Following up on the comments, for the HTML from the question, where the <th> has no xpath attribute:

line = '''<tr>
<td><a href="http://sdb.admin.uw.edu/timeschd/UWNetID/sln.asp?QTRYR=SPR+2019&amp;SLN=15308">15308</a></td>
<td nowrap="">INFO   101  </td>
<td>A </td>
<td align="CENTER">LC</td>
<td>SOCIAL NETWORKING   </td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 150</td>
<td align="CENTER"> 250</td>
<th align="CENTER">  0</th><td align="CENTER"> 229</td>
<td></td>
</tr>
<tr><td><a href="http://sdb.admin.uw.edu/timeschd/UWNetID/sln.asp?QTRYR=SPR+2019&amp;SLN=15309">15309</a></td>
<td nowrap="">INFO   101  </td>
<td>AA</td>
<td align="CENTER">LB</td>
<td>SOCIAL NETWORKING   </td>
<td align="CENTER">  25</td>
<td align="CENTER">  25</td>
<td align="CENTER">  26</td>
<th align="CENTER" style="">  2</th><td align="CENTER">  21</td>
<td></td>
</tr>'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(line, 'html.parser')

trElems = soup.find_all('tr')
toFind = '15308'

for tr in trElems:
    val = tr.select('td a')[0].text
    if toFind == val:
        # Look up the row's <th> directly instead of counting <td> positions,
        # which breaks as soon as a column is added or removed.
        th = tr.find('th')
        print("For the value: {}, The result is {}".format(toFind, th.text.strip()))

OUTPUT:

For the value: 15308, The result is 0
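
If adding BeautifulSoup as a dependency is not an option, roughly the same row lookup can be sketched with the standard library's html.parser. This is only a minimal sketch under stated assumptions: the names RowThFinder and find_th_values are hypothetical, the sample HTML is a shortened stand-in for the rows in the question, and the value to match is assumed to be the complete text of the first <td> in the row.

```python
from html.parser import HTMLParser


class RowThFinder(HTMLParser):
    """Collect <th> text from rows whose first <td> text equals `target`."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.cell_index = -1      # position of the current cell within its row
        self.open_cell = None     # 'td' or 'th' while inside a cell, else None
        self.buffer = []          # text fragments of the current cell
        self.row_matches = False  # True once the row's first <td> equals target
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            # A new row starts: forget everything about the previous one.
            self.cell_index = -1
            self.row_matches = False
        elif tag in ('td', 'th'):
            self.cell_index += 1
            self.open_cell = tag
            self.buffer = []

    def handle_data(self, data):
        # Text nested in <a> still belongs to the enclosing cell.
        if self.open_cell is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self.open_cell:
            return
        text = ''.join(self.buffer).strip()
        if tag == 'td' and self.cell_index == 0:
            self.row_matches = (text == self.target)
        elif tag == 'th' and self.row_matches:
            self.results.append(text)
        self.open_cell = None


def find_th_values(html, target):
    """Return the stripped text of every <th> in the matching row(s)."""
    finder = RowThFinder(target)
    finder.feed(html)
    finder.close()
    return finder.results


# Shortened, hypothetical stand-in for the two rows in the question.
sample = '''<tr><td><a href="#">15308</a></td><td>A </td><th align="CENTER">  0</th><td> 229</td></tr>
<tr><td><a href="#">15309</a></td><td>AA</td><th align="CENTER">  2</th><td>  21</td></tr>'''

print(find_th_values(sample, '15308'))  # ['0']
```

Returning a list rather than a single string keeps the "no matching row" case explicit (an empty list) instead of raising an exception.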

Upvotes: 0

Pritam Maske

Reputation: 2760

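Use the following code:

from selenium.webdriver.common.by import By

userValue = '15308'
# "|" in XPath unions two whole paths, so "...::td|th" never matched the row's <th>.
# Select every <td>/<th> sibling instead (use following-sibling::th for the <th> only).
all_td_th_of_row = driver.find_elements(
    By.XPATH,
    "//td[normalize-space()='" + userValue + "']/following-sibling::*[self::td or self::th]")
for cell in all_td_th_of_row:
    print(cell.text)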
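
Because the number comes from a user, it may be worth building the XPath with a small helper rather than plain string concatenation, so that a stray quote in the input cannot break the expression. The following is a sketch, not part of the original answer: xpath_literal and th_xpath_for are hypothetical helper names, and the expression assumes the number is the full text of the first <td> in its row.

```python
def xpath_literal(value):
    """Quote `value` as an XPath 1.0 string literal."""
    if "'" not in value:
        return "'{}'".format(value)
    if '"' not in value:
        return '"{}"'.format(value)
    # XPath 1.0 has no escape character, so join the pieces with concat().
    parts = value.split("'")
    return "concat(" + ", \"'\", ".join("'{}'".format(p) for p in parts) + ")"


def th_xpath_for(value):
    """XPath for the <th> cells of the row whose first <td> text is `value`."""
    return "//tr[td[1][normalize-space()={}]]/th".format(xpath_literal(value))


print(th_xpath_for('15308'))
# //tr[td[1][normalize-space()='15308']]/th
```

With Selenium, the resulting string could then be passed to driver.find_elements(By.XPATH, th_xpath_for(userValue)), where driver is an already-initialised WebDriver and By comes from selenium.webdriver.common.by.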

Upvotes: 1
