Cesar Augusto

Reputation: 268

Convert to string untagged child Beautiful soup

I'm trying to scrape an HTML document using BeautifulSoup4, but I'm stuck on this div:

<div class="small-info" style="margin-top: 4px;">
                5
                 <sup>th</sup>  
                August 2018
</div>

I want to get "5 th August 2018" as a single string. How can I do that?

Upvotes: 1

Views: 149

Answers (1)

Sruthi

Reputation: 3018

You have to use get_text() and remove the extra whitespace:

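from bs4 import BeautifulSoup

html = "<div class='small-info' style='margin-top: 4px;'>5<sup>th</sup>August 2018</div>"
soup = BeautifulSoup(html, "lxml")
# attrs must be a dict ({"class": "small-info"}), not a set
div = soup.find("div", {"class": "small-info"})
# Join the text nodes with one space and strip the whitespace around each node
text = div.get_text(" ", strip=True)

# text: 5 th August 2018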
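
The snippet above uses a compacted version of the markup. As a minimal sketch for the div exactly as posted in the question, with its line breaks and indentation, one option is to call get_text() and then collapse every run of whitespace with split() and join(). The choice of "html.parser" here is an assumption made to avoid the lxml dependency; either parser should behave the same for this input.

```python
from bs4 import BeautifulSoup

# The markup from the question, including its original line breaks and indentation
html = """<div class="small-info" style="margin-top: 4px;">
                5
                 <sup>th</sup>  
                August 2018
</div>"""

# "html.parser" ships with Python; "lxml" can be used instead if it is installed
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="small-info")

# split() with no arguments splits on any run of spaces, tabs, or newlines,
# so joining the pieces with a single space normalizes the whitespace
text = " ".join(div.get_text().split())
print(text)  # 5 th August 2018
```

Unlike replace("  ", ""), this also handles newlines and odd numbers of consecutive spaces.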

Upvotes: 2
