BeautifulSoup, select text to extract

Question

I would like to scrape some quotes and authors but haven't found a way to separate the quote from the author during scraping.

import requests
from bs4 import BeautifulSoup

#url = 'https://www.goodreads.com/quotes'
#r = requests.get(url)
#soup = BeautifulSoup(r.content, 'html.parser')

html = """
       “Insanity is doing the same thing, over and over again, but expecting different results.” 
  ―
       Narcotics Anonymous
       
"""

soup = BeautifulSoup(html, 'html.parser')

quotes = soup.find_all('div', {'class': 'quoteText'})

for quote in quotes:
    if quote.text is not None:
        print(quote.text)

Andersson · Accepted Answer

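You can use the stripped_strings generator, which yields each text node inside the tag with leading and trailing whitespace removed:

for quote in quotes:
    # Collect the non-empty text nodes of this quote block
    strings = list(quote.stripped_strings)
    quote_body = strings[0]    # the quotation itself
    quote_author = strings[2]  # index 1 holds the dash between quote and author
    print(quote_body)
    print(quote_author)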
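
To make the indexing concrete, here is a minimal sketch rather than a definitive solution. The markup in SAMPLE_HTML is an assumption about the page structure (a line break before the dash and the author wrapped in its own span); the real page may differ, so inspect list(quote.stripped_strings) before relying on fixed indices. split_quote is a hypothetical helper name, and the fallback that splits on the dash character is a swapped-in alternative for the case where the whole block arrives as a single string.

```python
from bs4 import BeautifulSoup

# Assumed markup: the <br> and <span> are guesses at the page structure,
# chosen so that the quote, the dash and the author are separate text nodes.
SAMPLE_HTML = """
<div class="quoteText">
      &ldquo;Insanity is doing the same thing, over and over again, but expecting different results.&rdquo;
  <br>  &#8213;
  <span>
    Narcotics Anonymous
  </span>
</div>
"""

DASH = "\u2015"  # horizontal bar that separates quote and author


def split_quote(tag):
    """Return (quote_body, quote_author) for one quoteText tag (hypothetical helper)."""
    strings = list(tag.stripped_strings)
    if len(strings) >= 3:
        # Separate text nodes: quote, dash, author.
        return strings[0], strings[2]
    # Fallback: the text is not split into three nodes, so cut at the dash instead.
    body, _, author = tag.get_text().partition(DASH)
    return body.strip(), author.strip()


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
results = [split_quote(tag) for tag in soup.find_all("div", class_="quoteText")]

for body, author in results:
    print(body)
    print(author)
```

Checking the number of strings first avoids an IndexError when a quote block has a different shape, for example when the author is not wrapped in its own tag.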
