Oliver

Reputation: 37

How to get only specified item from list of "tr" tags

I'm trying to get only one item from a list of tags:

Here is a simplified sample of the HTML:

<table>
 <tbody>
  <tr>Intro</tr>
  <tr>
   <td>First data</td>
   <td>Second data</td>
   <td>Third data</td>
   <td>Fourth data</td>
  </tr>
  <tr>
   <td>First data</td>
   <td>Second data</td>
   <td>Third data</td>
   <td>Fourth data</td>
  </tr>
 </tbody>
</table>

I'm interested only in "Second data" and "Third data"

Here's what I have:

table = soup.find_all("table")

children_tr = table[3].findChildren("tr")

for child_td in children_tr[1:]:
  for child in child_td:
    try:
      print(child)
    except AttributeError:
      print("")

With this I get, in order:

First data
Second data
Third data
Fourth data

I can't figure out how to get only "First data" and "Second data"

Edit: there are multiple rows, and I need the first and second item from each

Upvotes: 2

Views: 229

Answers (3)

MendelG

Reputation: 20038

You can use the nth-of-type(n) CSS selector.

from bs4 import BeautifulSoup


html = """
<table>
 <tbody>
  <tr>Intro</tr>
  <tr>
   <td>First data</td>
   <td>Second data</td>
   <td>Third data</td>
   <td>Fourth data</td>
  </tr>
  <tr>
   <td>First data</td>
   <td>Second data</td>
   <td>Third data</td>
   <td>Fourth data</td>
  </tr>
 </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

for tag in soup.select("tr td:nth-of-type(2), tr td:nth-of-type(3)"):
    print(tag.text)

Output:

Second data
Third data
Second data
Third data
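If you need the values grouped per row rather than as one flat stream, the same selector can be applied to each row separately. This is only a sketch against the sample HTML from the question; the name `rows` is made up for illustration, and it assumes every data row has at least three <td> cells.

```python
from bs4 import BeautifulSoup

html = """
<table>
 <tbody>
  <tr>Intro</tr>
  <tr>
   <td>First data</td>
   <td>Second data</td>
   <td>Third data</td>
   <td>Fourth data</td>
  </tr>
  <tr>
   <td>First data</td>
   <td>Second data</td>
   <td>Third data</td>
   <td>Fourth data</td>
  </tr>
 </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []  # hypothetical name: one [second, third] pair per data row
for tr in soup.find_all("tr"):
    # Run the selector on this row only, so matches stay grouped by row
    cells = tr.select("td:nth-of-type(2), td:nth-of-type(3)")
    if cells:  # the "Intro" row has no <td>, so it is skipped
        rows.append([td.get_text(strip=True) for td in cells])

for second, third in rows:
    print(second, "|", third)
```

Keeping one list per row makes it easier to tell which values belong together when the table has many rows.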

Upvotes: 1

Oliver

Reputation: 37

I worked out this solution:

temp_list = []        # cell texts, most recent first
max_temperature = []  # one value appended per row

for child_td in children_tr[1:]:
  for child in child_td:
    try:
      temp_list.insert(0,child.text)
    except AttributeError:
      continue
  max_temperature.append(temp_list[3])
print(max_temperature)

This way I end up with a list of the "Second data" values.

It's not the most elegant solution. If any of you know a better way, please let me know.

Upvotes: 0

Ashok Arora

Reputation: 541

I can't figure out how to get only "First data" and "Second data"

table = soup.find_all('td')[:2]
for t in table:
    print(t.text)
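Note that slicing the flat list of all <td> tags returns only the first two cells of the whole document. If there are several rows and you want the first two cells of each, slicing per row is one option. A rough sketch, using the sample HTML from the question (the name `first_two` is just for illustration):

```python
from bs4 import BeautifulSoup

html = """
<table>
 <tbody>
  <tr>Intro</tr>
  <tr>
   <td>First data</td>
   <td>Second data</td>
   <td>Third data</td>
   <td>Fourth data</td>
  </tr>
  <tr>
   <td>First data</td>
   <td>Second data</td>
   <td>Third data</td>
   <td>Fourth data</td>
  </tr>
 </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# For every row that contains at least one <td>, keep its first two cells
first_two = [
    [td.get_text(strip=True) for td in tr.find_all("td")[:2]]
    for tr in soup.find_all("tr")
    if tr.find("td") is not None
]

for row in first_two:
    print(row)
```

Changing the slice (for example `[1:3]`) would pick a different pair of columns.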

Upvotes: 0
