ToddS

Reputation: 21

Strip values from HTML with BeautifulSoup

I'm trying to extract the text values from this HTML:

<h3 class="s-item__title s-item__title--has-tags" role="text"><div><div class="s-item__title-tag">Nov 14, 2018</div></div>Text I Want</h3>

I want the values: Nov 14, 2018, Text I Want

I've tried, but I cannot get the second value ("Text I Want").

Upvotes: 1

Views: 50

Answers (1)

James Dellinger

Reputation: 1261

I used the .strings generator to grab every string inside the h3 tag, including strings in nested tags, and stored them in a list:

from bs4 import BeautifulSoup

html = """<h3 class="s-item__title s-item__title--has-tags" role="text"><div><div class="s-item__title-tag">Nov 14, 2018</div></div>Text I Want</h3>"""

bs = BeautifulSoup(html, 'html.parser')
text = [s for s in bs.h3.strings]

text

['Nov 14, 2018', 'Text I Want']
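
If the real page has line breaks or indentation between the tags, .strings will also yield those whitespace-only pieces. Here is a minimal sketch of one way to handle that, assuming the markup looks like the indented version below (the indentation is my assumption, not something from the question). It uses .stripped_strings and get_text, both of which are part of BeautifulSoup:

```python
from bs4 import BeautifulSoup

# Assumed variant of the question's HTML, with whitespace between the tags
html = """
<h3 class="s-item__title s-item__title--has-tags" role="text">
  <div><div class="s-item__title-tag">Nov 14, 2018</div></div>
  Text I Want
</h3>
"""

soup = BeautifulSoup(html, "html.parser")

# .stripped_strings skips whitespace-only strings and strips the others
values = list(soup.h3.stripped_strings)

# get_text with strip=True joins the same stripped pieces into a single string
joined = soup.h3.get_text(separator=", ", strip=True)

print(values)
print(joined)
```

Use the list when you need the two values separately, and the get_text form when a single string is enough.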

Upvotes: 3
