Can't get text immediately after </span> tag using BeautifulSoup

Question

Currently, in my code I break down a larger soup to get all the 'td' tags with this code:

floorplans_all = sub_soup.findAll('td', {"data-label":"Rent"})
floorplan_soup = soup(floorplans_all[0].prettify(), "html.parser")
rent_span = floorplan_soup.findAll('span', {"class":"sr-only"})

print(floorplans_all)

and end up with the following:

[<td data-label="Rent">
    <span class="sr-only">
      Monthly Rent
     </span>
     $2,335 -
     <span class="sr-only">
      to
     </span>
     $5,269
    </td>]

Printing rent_span looks like this:

  [<span class="sr-only">
  Monthly Rent
 </span>, <span class="sr-only">
  to
 </span>]

I can't seem to get "$2,335 -" and "$5,269" from above. I have been trying to walk down the HTML tree, but I'm not able to get the text between the tags.

poke · Accepted Answer

The td element has five children:

A text node containing only whitespace
A span node containing “Monthly Rent”
A text node containing “$2,335 -”
A span node containing “to”
A text node containing “$5,269”

You can iterate over those children using the children attribute:

from bs4 import BeautifulSoup

# text holds the HTML of the td element shown in the question
soup = BeautifulSoup(text, 'html.parser')

for child in soup.td.children:
    print(repr(child))

This prints:

'\n'
<span class="sr-only">
      Monthly Rent
     </span>
'\n     $2,335 -\n     '
<span class="sr-only">
      to
     </span>
'\n     $5,269\n    '
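If only the bare text nodes are of interest, the same loop can filter on node type. The following is a minimal sketch rather than the one true way: the html string is an assumption, reconstructed from the output in the question, and the variable name texts is made up for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Assumed markup, reconstructed from the output shown in the question.
html = """<td data-label="Rent">
    <span class="sr-only">
      Monthly Rent
     </span>
     $2,335 -
     <span class="sr-only">
      to
     </span>
     $5,269
    </td>"""

soup = BeautifulSoup(html, "html.parser")

# Keep only the direct children that are text nodes, and drop the
# ones that contain nothing but whitespace.
texts = [
    child.strip()
    for child in soup.td.children
    if isinstance(child, NavigableString) and child.strip()
]

print(texts)  # ['$2,335 -', '$5,269']
```

Because .children only yields direct children, the text inside the span elements ("Monthly Rent" and "to") is not picked up.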

If you want to explicitly look for the text nodes, you could search for the span nodes and get the next sibling each time:

>>> [span.next_sibling.string.strip() for span in soup.td.find_all(class_='sr-only')]
['$2,335 -', '$5,269']
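Applied to the code from the question, this means the prettify() round trip is probably not needed: the td elements returned by the search can be queried directly. Below is a hedged sketch under a few assumptions: the html string is reconstructed from the question, and rent_texts and to_dollars are hypothetical helper names, not part of BeautifulSoup.

```python
import re
from bs4 import BeautifulSoup, NavigableString

# Assumed markup, reconstructed from the output shown in the question.
html = """<td data-label="Rent">
<span class="sr-only">Monthly Rent</span>
$2,335 -
<span class="sr-only">to</span>
$5,269
</td>"""


def rent_texts(td):
    """Return the stripped text node that follows each sr-only span in td."""
    texts = []
    for span in td.find_all("span", class_="sr-only"):
        sibling = span.next_sibling
        # next_sibling can be None or another tag, so check before stripping.
        if isinstance(sibling, NavigableString) and sibling.strip():
            texts.append(sibling.strip())
    return texts


def to_dollars(text):
    """Extract the integer amount from a string such as '$2,335 -'."""
    match = re.search(r"\$([\d,]+)", text)
    return int(match.group(1).replace(",", "")) if match else None


soup = BeautifulSoup(html, "html.parser")

for td in soup.find_all("td", {"data-label": "Rent"}):
    low, high = (to_dollars(t) for t in rent_texts(td))
    print(low, high)  # 2335 5269
```

The isinstance check is a defensive choice: if a span happens to be the last child, or is followed directly by another tag, calling strip() on its next_sibling would fail.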
