chitown88

Reputation: 28640

Beautifulsoup - get text not between specific tags (after </span> but before <br>)?

I've looked around and found solutions that work, or are supposed to work, for this exact question, but they don't work in my situation. Does anyone know why they would work there and not here? Or simply show me what I'm doing wrong, and I can work out the difference.

Keep in mind that this is just a snippet of the HTML; the full page contains many more spans with the same class='boldText'. I specifically want the tag whose text is Status:, and then the text that follows it.

import bs4 

html1 = '''<span class="boldText"><b>Date:</b>  </span>12/04/2018<br/>
<span class="boldText"><b>Name:</b>  </span>Aaron Rodgers<br/>
<span class="boldText"><b>Status:</b>  </span>Questionable<br/><br/>
<br/>
<br/><br/><br/>'''

soup = bs4.BeautifulSoup(html1,'html.parser') 
status = soup.find(text='Status:').next_sibling

I'm just trying to get the text: 'Questionable'

so looking for output:

>>> print(status)
Questionable

Upvotes: 0

Views: 42

Answers (1)

cody

Reputation: 11157

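The problem is that the string 'Status:' matched by soup.find(text='Status:') sits inside the b tag and has no siblings, so next_sibling returns None. It's easier to see when formatted like this: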

<span class="boldText">
    <b>Status:</b>
</span>
Questionable
<br/>

See how the b is the only element inside the span (apart from some trailing whitespace)? The string "Questionable" is actually a sibling of the parent span, so you need to navigate to it as follows:

print(soup.find('b', string='Status:').parent.next_sibling)
# => Questionable
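
Since the full page reportedly has many such spans, it may help to wrap the same navigation in a small function. The following is only a sketch under assumptions: the helper name value_after_label is made up for illustration, and it assumes every label is a b tag inside a span with class boldText, followed directly by the value text. It also shows why the original attempt came back empty.

```python
import bs4

html1 = '''<span class="boldText"><b>Date:</b>  </span>12/04/2018<br/>
<span class="boldText"><b>Name:</b>  </span>Aaron Rodgers<br/>
<span class="boldText"><b>Status:</b>  </span>Questionable<br/><br/>'''

soup = bs4.BeautifulSoup(html1, 'html.parser')

# The matched string lives inside <b>, so it has no siblings of its own.
label = soup.find(string='Status:')
print(label.next_sibling)  # None


def value_after_label(soup, label_text):
    """Hypothetical helper: return the text that follows the span holding label_text."""
    b_tag = soup.find('b', string=label_text)
    if b_tag is None:
        return None
    # Assumption: the label always sits inside <span class="boldText">.
    span = b_tag.find_parent('span', class_='boldText')
    if span is None:
        return None
    for sibling in span.next_siblings:
        # Return the first non-blank text node after the span.
        if isinstance(sibling, bs4.NavigableString) and sibling.strip():
            return sibling.strip()
        # Stop at the first tag (e.g. <br/>): no value text was present.
        if isinstance(sibling, bs4.Tag):
            break
    return None


status = value_after_label(soup, 'Status:')
print(status)  # Questionable
```

Walking next_siblings instead of taking a single next_sibling is a design choice, not a requirement: it lets the function return None cleanly when a label has no value, rather than handing back a br tag.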

Upvotes: 2
