Reputation: 268
I'm trying to scrape an HTML document using BeautifulSoup4, but I'm stuck on this div:
<div class="small-info" style="margin-top: 4px;">
5
<sup>th</sup>
August 2018
</div>
I'm trying to get "5 th August 2018". How can I do that?
Upvotes: 1
Views: 149
Reputation: 3018
You have to use get_text()
and collapse the extra whitespace between the text nodes:
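from bs4 import BeautifulSoup

html = "<div class='small-info' style='margin-top: 4px;'>5<sup>th</sup>August 2018</div>"
soup = BeautifulSoup(html, "lxml")  # needs the lxml package; "html.parser" also works
div = soup.find("div", {"class": "small-info"})
# Strip each text node and join them with a single space
text = div.get_text(" ", strip=True)
# text: 5 th August 2018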
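For the div exactly as posted in the question (with line breaks around each piece), the same idea can be sketched as below. The helper name extract_small_info is hypothetical, and the built-in "html.parser" is used here only as an assumption to avoid the lxml dependency; it is one possible approach, not the only one.

```python
from bs4 import BeautifulSoup

# The div as it appears in the question, including the line breaks
HTML = """<div class="small-info" style="margin-top: 4px;">
5
<sup>th</sup>
August 2018
</div>"""


def extract_small_info(html: str) -> str:
    """Return the text of the first div.small-info with whitespace collapsed.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="small-info")
    if div is None:
        return ""
    # strip=True trims every text node and drops the empty ones;
    # the " " separator then joins the remaining pieces.
    return div.get_text(" ", strip=True)


result = extract_small_info(HTML)
print(result)
```

An equivalent alternative is " ".join(div.get_text().split()), which collapses every run of whitespace, including the newlines, into a single space.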
Upvotes: 2