Emanuele Bellucci

Reputation: 58

Extract an object's description with BeautifulSoup in Python

I want to extract the description next to the figure (the text that runs from "Figurine model" to "Stay Tuned :)") and store it in the variable information using BeautifulSoup. How can I do it? Here's my code so far, but I don't know how to continue:

import requests
from bs4 import BeautifulSoup
response = requests.get('https://www.myminifactory.com/object/3d-print-the-little-prince-4707')
soup = BeautifulSoup(response.text, "lxml")
information = 

Below is the page from which I want to extract the object's description. Thank you in advance!

[Image: the page from which I want to extract the text]

Upvotes: 0

Views: 478

Answers (2)

KC.

Reputation: 3107

Find the parent tag, then look for the <p> tags inside it. Filter out the empty paragraphs and the ones that contain the underscore separator (____).

parent = soup.find("div", class_="row container-info-obj margin-t-10")
result = [" ".join(p.text.split()) for p in parent.find_all("p") if p.text.strip() and "_" * 8 not in p.text]
#youtube_v = parent.find("iframe")["src"]
print(result)
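
To try the filter without hitting the network, here is a minimal sketch that runs the same list comprehension on a small inline document. SAMPLE_HTML and extract_paragraphs are hypothetical names made up for illustration, the markup only imitates the real page, and the built-in "html.parser" is used instead of "lxml" so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; only the class names
# are copied from the snippet above.
SAMPLE_HTML = """
<div class="row container-info-obj margin-t-10">
  <p>Figurine   model of
     the Little Prince.</p>
  <p>   </p>
  <p>Stay Tuned :)</p>
  <p>________________</p>
</div>
"""


def extract_paragraphs(html):
    """Return the cleaned text of each non-empty, non-separator <p> in the parent div."""
    soup = BeautifulSoup(html, "html.parser")
    parent = soup.find("div", class_="row container-info-obj margin-t-10")
    if parent is None:
        # The container was not found (for example, the page layout changed).
        return []
    return [
        " ".join(p.text.split())  # collapse runs of whitespace and newlines
        for p in parent.find_all("p")
        if p.text.strip() and "_" * 8 not in p.text
    ]


paragraphs = extract_paragraphs(SAMPLE_HTML)
# Join the paragraphs into a single string, as the question asks.
information = "\n".join(paragraphs)
print(information)
```

Joining the list with "\n" gives the single string the question wants in information; the None check avoids an AttributeError if the div is missing.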

Upvotes: 1

Bhanu Tez

Reputation: 306

This works for me. I'm not proud of the script because of the way I used the break statement, but it does the job.

from urllib.request import urlopen
from bs4 import BeautifulSoup as BS

url = r'https://www.myminifactory.com/object/3d-print-the-little-prince-4707'

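# Download the page and parse it.
html = urlopen(url).read()
soup = BS(html, "lxml")
# Text of the description block.
desc = soup.find('div', {'class': 'short-text text-auto-link'}).text
lines = []
for line in desc.split('\n'):
    # Stop at the underscore separator that ends the description.
    if line.strip() == '_________________________________________________________________________':
        break
    if line.strip():
        lines.append(line.strip())
# Join with newlines so consecutive lines do not run together.
description = '\n'.join(lines)
print(description)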
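
If the explicit break feels awkward, one possible alternative is itertools.takewhile, which stops consuming lines at the first separator. This is only a sketch: description_before_separator and sample_text are hypothetical names, and it assumes that any line made up solely of underscores counts as the separator, rather than matching one exact length as the script above does.

```python
from itertools import takewhile


def description_before_separator(text):
    """Return the non-empty lines of `text` that come before the first all-underscore line."""

    def is_not_separator(line):
        stripped = line.strip()
        # Assumption: a separator is a non-empty line made only of underscores.
        return not (stripped and set(stripped) == {"_"})

    # takewhile yields lines until the predicate first returns False.
    kept = takewhile(is_not_separator, text.split("\n"))
    return "\n".join(line.strip() for line in kept if line.strip())


# Hypothetical text standing in for the .text of the description div.
sample_text = "Figurine model\n\n  Stay Tuned :)  \n__________\nUnrelated footer"
description = description_before_separator(sample_text)
print(description)
```

On the real page the function would be called with the .text of the description div in place of sample_text.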

Upvotes: 2
