Reputation: 23
Currently, in my code I break down a larger soup to get all the 'td' tags with this code:
floorplans_all = sub_soup.findAll('td', {"data-label":"Rent"})
floorplan_soup = soup(floorplans_all[0].prettify(), "html.parser")
rent_span = floorplan_soup.findAll('span', {"class":"sr-only"})
print(floorplans_all)
and end up with the following:
<td data-label="Rent" data-selenium-id="Rent_6">
<span class="sr-only">
Monthly Rent
</span>
$2,335 -
<span class="sr-only">
to
</span>
$5,269
</td>
Printing rent_span looks like this:
[<span class="sr-only">
Monthly Rent
</span>, <span class="sr-only">
to
</span>]
I can't seem to get "$2,335 -" and "$5,269" from above. I have been trying to walk down the HTML tree, but I'm not able to get the text between the tags.
Upvotes: 1
Views: 285
Reputation: 4101
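from bs4 import BeautifulSoup

# res is assumed to hold the page's HTML
soup = BeautifulSoup(res, 'html.parser')
row = soup.find('td', {'data-label': "Rent"})
# stripped_strings yields every non-empty text fragment inside the td,
# including the bare text that sits between the spans
for text in row.stripped_strings:
    print(text)
The output will look like this:
Monthly Rent
$2,335 -
to
$5,269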
Upvotes: 1
Reputation: 387677
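The td element has five children:
1. a text node containing only whitespace
2. a span node containing “Monthly Rent”
3. a text node containing “$2,335 -”
4. a span node containing “to”
5. a text node containing “$5,269”
You can iterate those children by using the children attribute: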
from bs4 import BeautifulSoup

soup = BeautifulSoup(text, 'html.parser')  # text holds the td markup from the question
for child in soup.td.children:
    print(repr(child))
'\n'
<span class="sr-only">
Monthly Rent
</span>
'\n $2,335 -\n '
<span class="sr-only">
to
</span>
'\n $5,269\n '
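Since the prices are the text nodes among those children, one way to pull them out is to keep only the children that are strings and drop the whitespace-only ones. This is a minimal sketch, assuming the markup from the question; the `html` variable and the `direct_text` helper are names made up for illustration, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Sample markup copied from the question (assumed to be representative)
html = """<td data-label="Rent" data-selenium-id="Rent_6">
<span class="sr-only">
Monthly Rent
</span>
$2,335 -
<span class="sr-only">
to
</span>
$5,269
</td>"""

soup = BeautifulSoup(html, "html.parser")


def direct_text(tag):
    """Return the non-empty text nodes that are direct children of tag."""
    parts = []
    for child in tag.children:
        # Text nodes are NavigableString instances; nested tags are skipped
        if isinstance(child, NavigableString):
            stripped = child.strip()
            if stripped:
                parts.append(stripped)
    return parts


prices = direct_text(soup.td)
print(prices)  # ['$2,335 -', '$5,269']
```

Text nested inside the span elements is ignored here because only the td's direct children are inspected.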
If you want to explicitly look for the text nodes, you could search for the span nodes and get the next sibling each time:
>>> [span.next_sibling.string.strip() for span in soup.td.find_all(class_='sr-only')]
['$2,335 -', '$5,269']
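If the goal is to work with the rent range as numbers, the sibling strings can be run through a regular expression. The following is a rough sketch, assuming the markup from the question and that every price is written as a dollar sign followed by digits and commas; `html` and `parse_amounts` are hypothetical names used only for this example.

```python
import re

from bs4 import BeautifulSoup

# Sample markup copied from the question (assumed to be representative)
html = """<td data-label="Rent" data-selenium-id="Rent_6">
<span class="sr-only">
Monthly Rent
</span>
$2,335 -
<span class="sr-only">
to
</span>
$5,269
</td>"""

soup = BeautifulSoup(html, "html.parser")
td = soup.find("td", {"data-label": "Rent"})

# Text node that directly follows each screen-reader span
pieces = [str(span.next_sibling) for span in td.find_all("span", class_="sr-only")]


def parse_amounts(text):
    """Return every dollar amount found in text as an int, e.g. '$2,335' -> 2335."""
    return [int(m.replace(",", "")) for m in re.findall(r"\$([\d,]+)", text)]


rent_low, rent_high = parse_amounts(" ".join(pieces))
print(rent_low, rent_high)  # 2335 5269
```

A listing with a single price instead of a range would yield only one amount, so the two-name unpacking at the end would need adjusting for that case.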
Upvotes: 3