user11670046

Get only the values from two different tables with the same class?

I want to select data from two different tables that share the same class. I tried 'soup.find_all', but formatting the resulting data is getting difficult. I need only the values (not the labels) from both tables.

TABLE 1:

<div class="bh_collapsible-body" style="display: none;">
  <table border="0" cellpadding="2" cellspacing="2" class="prop-list">
    <tbody>
    <tr>
      <td class="item">
        <table>
          <tbody>
          <tr>
            <td class="label">Rim Material</td>
            <td class="value">Alloy</td>
          </tr>
          </tbody>
        </table>
      </td>
      <td class="item">
        <table>
          <tbody>
          <tr>
            <td class="label">Front Tyre Description</td>
            <td class="value">215/55 R16</td>
          </tr>
          </tbody>
        </table>
      </td>
    </tr>
    <tr>
      <td class="item">
        <table>
          <tbody>
          <tr>
            <td class="label">Front Rim Description</td>
            <td class="value">16x7.0</td>
          </tr>
          </tbody>
        </table>
      </td>
      <td class="item">
        <table>
          <tbody>
          <tr>
            <td class="label">Rear Tyre Description</td>
            <td class="value">215/55 R16</td>
          </tr>
          </tbody>
        </table>
      </td>
    </tr>
    <tr>
      <td class="item">
        <table>
          <tbody>
          <tr>
            <td class="label">Rear Rim Description</td>
            <td class="value">16x7.0</td>
          </tr>
          </tbody>
        </table>
      </td>
      <td></td>
    </tr>
    </tbody>
  </table>
</div>
</div>

TABLE 2:

<div class="bh_collapsible-body" style="display: none;">
  <table border="0" cellpadding="2" cellspacing="2" class="prop-list">
    <tbody>
    <tr>
      <td class="item">
        <table>
          <tbody>
          <tr>
            <td class="label">Steering</td>
            <td class="value">Rack and Pinion</td>
          </tr>
          </tbody>
        </table>
      </td>
      <td></td>
    </tr>
    </tbody>
  </table>
</div>
</div>

What I have tried:

I tried getting the first table's contents with XPath, but it returns both the values and the labels.

table1 = driver.find_element_by_xpath("//*[@id='features']/div/div[5]/div[2]/div[1]/div[1]/div/div[2]/table/tbody/tr[1]/td[1]/table/tbody/tr/td[2]")

I tried to split the data, but did not succeed.

Upvotes: 0

Views: 102

Answers (1)

Andrej Kesely

Reputation: 195573

I think you are looking for the CSS selector tr:not(:has(tr)), which selects only the innermost <tr> elements (rows that contain no nested rows):

from bs4 import BeautifulSoup

soup = BeautifulSoup(data, 'html.parser')  # data holds the HTML string for Table 1 and Table 2 from the question

rows = []
for tr in soup.select('tr:not(:has(tr))'):
    rows.append([td.get_text(strip=True) for td in tr.select('td')])

for row in zip(*rows):
    print(''.join('{: ^25}'.format(d) for d in row))

Prints:

  Rim Material        Front Tyre Description    Front Rim Description    Rear Tyre Description    Rear Rim Description           Steering         
      Alloy                 215/55 R16                 16x7.0                 215/55 R16                 16x7.0               Rack and Pinion     

The variable rows contains:

[['Rim Material', 'Alloy'],
 ['Front Tyre Description', '215/55 R16'],
 ['Front Rim Description', '16x7.0'],
 ['Rear Tyre Description', '215/55 R16'],
 ['Rear Rim Description', '16x7.0'],
 ['Steering', 'Rack and Pinion']]
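
Since the question asks for the values only, one possible variation is to select the td.value cells directly and group them by outer table. This is a minimal sketch, not part of the original answer: the data string below is a trimmed-down stand-in for the markup in the question, and the name values_per_table is made up for illustration.

```python
from bs4 import BeautifulSoup

# Assumption: a shortened copy of the question's markup, used only for illustration.
data = """
<table class="prop-list"><tbody>
  <tr>
    <td class="item"><table><tbody><tr>
      <td class="label">Rim Material</td><td class="value">Alloy</td>
    </tr></tbody></table></td>
    <td class="item"><table><tbody><tr>
      <td class="label">Front Tyre Description</td><td class="value">215/55 R16</td>
    </tr></tbody></table></td>
  </tr>
</tbody></table>
<table class="prop-list"><tbody>
  <tr>
    <td class="item"><table><tbody><tr>
      <td class="label">Steering</td><td class="value">Rack and Pinion</td>
    </tr></tbody></table></td>
    <td></td>
  </tr>
</tbody></table>
"""

soup = BeautifulSoup(data, "html.parser")

# Only the outer tables carry class="prop-list", so this yields one entry per table.
# Inside each one, td.value matches the value cells and skips the td.label cells.
values_per_table = [
    [td.get_text(strip=True) for td in table.select("td.value")]
    for table in soup.select("table.prop-list")
]

print(values_per_table)
# [['Alloy', '215/55 R16'], ['Rack and Pinion']]
```

If the grouping by table is not needed, soup.select("td.value") on its own should return all value cells in document order.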

Further reading:

CSS Selectors Reference

EDIT: Changed the CSS selector to tr:not(:has(tr))
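
To illustrate why the :not(:has(tr)) filter matters for nested tables, the sketch below compares a plain tr selector with the filtered one. It assumes a small made-up fragment shaped like the question's markup; the names all_rows and innermost are illustrative only.

```python
from bs4 import BeautifulSoup

# Assumption: a tiny nested-table fragment modelled on the question's HTML.
data = """
<table class="prop-list">
  <tr>
    <td class="item"><table><tr>
      <td class="label">Rim Material</td><td class="value">Alloy</td>
    </tr></table></td>
    <td class="item"><table><tr>
      <td class="label">Steering</td><td class="value">Rack and Pinion</td>
    </tr></table></td>
  </tr>
</table>
"""

soup = BeautifulSoup(data, "html.parser")

# A plain "tr" also matches the outer row that wraps both inner tables.
all_rows = soup.select("tr")

# Excluding rows that contain another <tr> leaves only the innermost rows.
innermost = soup.select("tr:not(:has(tr))")

print(len(all_rows), len(innermost))
# 3 2
```

With the plain selector, the outer row's text would include every label and value nested inside it, which is the kind of mixed output that is hard to split afterwards.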

Upvotes: 1
