Reputation: 173
I am trying to extract the text of an element with a given class from an HTML page using BeautifulSoup. Here is what the data looks like:
<div class="quoteText">
“I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.”
<br> ―
<span class="authorOrTitle">
Marilyn Monroe
</span>
</div>
I just want the text inside the div with class "quoteText", without the text inside the span with class "authorOrTitle".
The following script returns the name of the author as well:
for div in soup.find('div', {'class': 'quoteText'}):
    print(div)
How can I get the "quoteText" class data without the "authorOrTitle" class data?
Thanks!
Upvotes: 0
Views: 28
Reputation: 8302
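Try this. contents[0] is the first child node of the div, which is the quote text that comes before the <br>, so the author span is not included: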
from bs4 import BeautifulSoup
sample = """<div class="quoteText">
“I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.”
<br> ―
<span class="authorOrTitle">
Marilyn Monroe
</span>
</div>
"""
soup = BeautifulSoup(sample, "html.parser")
print(soup.find('div', {'class': 'quoteText'}).contents[0].strip())
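For context on why the loop in the question also prints the author: soup.find() returns a single Tag, and iterating over a Tag yields each of its children in turn, including the span. If you would rather not rely on the quote being at index 0, here is a rough sketch of two alternatives. It assumes the markup matches the sample above; the variable names (html, direct_text, remaining) are only illustrative.

```python
from bs4 import BeautifulSoup, NavigableString

html = """<div class="quoteText">
“I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.”
<br> ―
<span class="authorOrTitle">
Marilyn Monroe
</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="quoteText")

# Option 1: keep only the text nodes that are direct children of the div.
# Text nested inside the span is not a direct child, so it is skipped.
direct_text = [
    child.strip() for child in div.children if isinstance(child, NavigableString)
]
direct_text = [text for text in direct_text if text]  # drop whitespace-only nodes
quote = direct_text[0]
print(quote)

# Option 2: remove the author span from the parsed tree, then read what is left.
# Note that decompose() modifies the tree in place.
author = div.find("span", class_="authorOrTitle")
if author is not None:
    author.decompose()
remaining = div.get_text(" ", strip=True)
print(remaining)
```

With this sample, Option 2 still keeps the dash that follows the <br>, so Option 1 is probably closer to what you want if you only need the quote itself.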
Upvotes: 1