Reputation:
I want to get or select data from two different tables with same class.I tried getting it from 'soup.find_all' but formatting the data is getting tough. There are two tables with same class. I need to get only values(not label) from the tables.
TABLE 1:
<div class="bh_collapsible-body" style="display: none;">
<table border="0" cellpadding="2" cellspacing="2" class="prop-list">
<tbody>
<tr>
<td class="item">
<table>
<tbody>
<tr>
<td class="label">Rim Material</td>
<td class="value">Alloy</td>
</tr>
</tbody>
</table>
</td>
<td class="item">
<table>
<tbody>
<tr>
<td class="label">Front Tyre Description</td>
<td class="value">215/55 R16</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td class="item">
<table>
<tbody>
<tr>
<td class="label">Front Rim Description</td>
<td class="value">16x7.0</td>
</tr>
</tbody>
</table>
</td>
<td class="item">
<table>
<tbody>
<tr>
<td class="label">Rear Tyre Description</td>
<td class="value">215/55 R16</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td class="item">
<table>
<tbody>
<tr>
<td class="label">Rear Rim Description</td>
<td class="value">16x7.0</td>
</tr>
</tbody>
</table>
</td>
<td></td>
</tr>
</tbody>
</table>
</div>
</div>
TABLE 2:
<div class="bh_collapsible-body" style="display: none;">
<table border="0" cellpadding="2" cellspacing="2" class="prop-list">
<tbody>
<tr>
<td class="item">
<table>
<tbody>
<tr>
<td class="label">Steering</td>
<td class="value">Rack and Pinion</td>
</tr>
</tbody>
</table>
</td>
<td></td>
</tr>
</tbody>
</table>
</div>
</div>
What i have tried:
I tried getting the first table contents from Xpath but its giving with both values and labels.
table1 = driver.find_element_by_xpath("//*[@id='features']/div/div[5]/div[2]/div[1]/div[1]/div/div[2]/table/tbody/tr[1]/td[1]/table/tbody/tr/td[2]")
I tried to split the data but not succeeded
Upvotes: 0
Views: 102
Reputation: 195573
I think you are looking for CSS selector tr:not(:has(tr))
, this will select the inner-most <tr>
:
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'html.parser') # the variable data contains string for Table1 and Table2 in your question
rows = []
for tr in soup.select('tr:not(:has(tr))'):
rows.append([td.get_text(strip=True) for td in tr.select('td')])
for row in zip(*rows):
print(''.join('{: ^25}'.format(d) for d in row))
Prints:
Rim Material Front Tyre Description Front Rim Description Rear Tyre Description Rear Rim Description Steering
Alloy 215/55 R16 16x7.0 215/55 R16 16x7.0 Rack and Pinion
The variable rows
contains:
[['Rim Material', 'Alloy'],
['Front Tyre Description', '215/55 R16'],
['Front Rim Description', '16x7.0'],
['Rear Tyre Description', '215/55 R16'],
['Rear Rim Description', '16x7.0'],
['Steering', 'Rack and Pinion']]
Further reading:
EDIT: Changed to CSS Selector to tr:not(:has(tr))
Upvotes: 1