Ritesh

Reputation: 173

Getting class data from BeautifulSoup

I am trying to get class data from an HTML page using BeautifulSoup. Here is what the data looks like:

    <div class="quoteText">
      &ldquo;I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.&rdquo;
      <br>  &#8213;
      <span class="authorOrTitle">
        Marilyn Monroe
      </span>
    </div>

I just want the text directly under the class "quoteText", without the text in the class "authorOrTitle".

The following script returns the name of the author as well.

for div in soup.find('div', {'class': 'quoteText'}):
    print(div)

How can I get the "quoteText" class data without the "authorOrTitle" class data?

Thanks!

Upvotes: 0

Views: 28

Answers (1)

sushanth

Reputation: 8302

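Try this. `.contents[0]` is the first child of the div, which here is the text node before the `<br>` tag, so the author span is never included: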

from bs4 import BeautifulSoup

sample = """<div class="quoteText">
      &ldquo;I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.&rdquo;
  <br>  &#8213;
  <span class="authorOrTitle">
    Marilyn Monroe
  </span>
</div>
"""

soup = BeautifulSoup(sample, "html.parser")

print(soup.find('div', {'class': 'quoteText'}).contents[0].strip())
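
If the page has several quotes, or you would rather not rely on the quote always being at index 0, something along these lines might be a bit more robust. It is only a sketch: the two-quote HTML and the helper name `quote_only` are made up for illustration, and it assumes the quote is the first non-empty text node sitting directly inside each div.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Hypothetical sample with two quotes, shaped like the markup in the question.
html = """
<div class="quoteText">
  &ldquo;First quote.&rdquo;
  <br> &#8213;
  <span class="authorOrTitle">Author One</span>
</div>
<div class="quoteText">
  &ldquo;Second quote.&rdquo;
  <br> &#8213;
  <span class="authorOrTitle">Author Two</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def quote_only(div):
    """Return the first non-empty text node that is a direct child of div."""
    for child in div.children:
        # Direct text children are NavigableString; nested tags such as
        # <br> and <span class="authorOrTitle"> are skipped.
        if isinstance(child, NavigableString) and child.strip():
            return child.strip()
    return None


# find_all returns every matching div, unlike find, which returns only the first.
quotes = [quote_only(div) for div in soup.find_all("div", class_="quoteText")]
print(quotes)
```

This also shows why the loop in the question prints the author: iterating over the result of `soup.find(...)` walks the children of that single div, and the `<span class="authorOrTitle">` is one of those children.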

Upvotes: 1
