Reputation: 4372
<table id="GridView1">
<tbody>
<tr>
<td>Some Content </td>
<td>Some Content </td>
</tr>
<tr>
<td>Some Content </td>
<td>Some Content </td>
</tr>
<tr>
<table>
<tbody>
<tr>
<td>Some Content </td>
<td>Some Content </td>
</tr>
<tr>
<td>Some Content </td>
<td>Some Content </td>
</tr>
<tr>
<td>Some Content </td>
<td>Some Content </td>
</tr>
</tbody>
</table>
</tr>
</tbody>
</table>
I have some HTML with table content like the above: inside a table there are more nested tables.
When I read the tr elements using BeautifulSoup like this:
table_grid_1 = soup.find("table", {"id": "GridView1"})
rows = table_grid_1.find("tbody").find_all("tr")
the rows of the inner table are read as well,
i.e. `print "length " + str(len(rows))` prints 5, but I want to read only the tr elements of the outer table, so the length should be 3.
How can I read the rows of only the outer table?
Upvotes: 2
Views: 336
Reputation: 20709
You can achieve this by passing recursive=False to find_all, which limits the search to direct children of the tbody:
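from bs4 import BeautifulSoup

# html holds the page source; naming the parser explicitly is an assumption here
soup = BeautifulSoup(html, "html.parser")
table_grid_1 = soup.find("table", {"id": "GridView1"})
# recursive=False: only tr elements that are direct children of the outer tbody
rows = table_grid_1.find("tbody").find_all("tr", recursive=False)
print(len(rows))
which prints 3.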
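A self-contained sketch of the difference, assuming the nested table sits inside a td of the third outer row and has two rows. The markup and the cell labels are illustrative, not the asker's real page:

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption): 3 outer rows, the third wraps a 2-row table.
html = """
<table id="GridView1">
<tbody>
<tr><td>A1</td><td>A2</td></tr>
<tr><td>B1</td><td>B2</td></tr>
<tr><td>
<table>
<tbody>
<tr><td>X1</td><td>X2</td></tr>
<tr><td>X3</td><td>X4</td></tr>
</tbody>
</table>
</td></tr>
</tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
outer_tbody = soup.find("table", {"id": "GridView1"}).find("tbody")

# Default search walks all descendants, so nested rows are included.
all_rows = outer_tbody.find_all("tr")
# recursive=False only looks at direct children of the outer tbody.
outer_rows = outer_tbody.find_all("tr", recursive=False)

print(len(all_rows))    # 5
print(len(outer_rows))  # 3
```

Note that recursive=False has to be applied at the level whose direct children you want: here the tr elements are children of the tbody, not of the table itself.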
Upvotes: 2
Reputation: 926
You may try:
[x.string for x in soup.select('table > tbody > tr > td') if x not in soup.select('table > tbody > tr > table > tbody > tr > td')]
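Result: the td contents of the outer table only. Note: this returns an empty list when the outer and inner td elements are identical (as in the sample above), because BeautifulSoup compares tags by name, attributes, and contents rather than by identity, so the `not in` test also drops every outer cell that equals an inner one.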
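One possible way around the equality problem is to filter by position in the tree instead of by content. The sketch below swaps the selector comparison for find_parent with an identity check; the markup is an assumption (nested table wrapped in a td, every plain cell holding the same text):

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption): all plain cells share the same text.
html = """
<table id="GridView1">
<tbody>
<tr><td>Some Content</td><td>Some Content</td></tr>
<tr><td>Some Content</td><td>Some Content</td></tr>
<tr><td>
<table>
<tbody>
<tr><td>Some Content</td><td>Some Content</td></tr>
<tr><td>Some Content</td><td>Some Content</td></tr>
</tbody>
</table>
</td></tr>
</tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
outer = soup.find("table", {"id": "GridView1"})

all_cells = outer.find_all("td")
# Keep a cell only if its nearest enclosing table is the outer table itself.
# "is" compares object identity, so cells with equal content are not confused.
outer_cells = [td for td in all_cells if td.find_parent("table") is outer]

print(len(all_cells))    # 9
print(len(outer_cells))  # 5 (four plain cells plus the cell wrapping the nested table)
```

The wrapping cell counts as an outer cell here; if it should be skipped, an extra check such as `td.find("table") is None` could be added to the filter.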
Upvotes: 0