How to extract the description from an HTML paragraph using Python

I want to extract only the description paragraph from the HTML source, but my code also returns the color and ID data along with it.

import requests
from bs4 import BeautifulSoup

url = "https://www.nike.com/gb/t/air-max-viva-shoe-ZQTSV8/DB5268-003"

response = requests.get(url)

soup = BeautifulSoup(response.text, 'lxml')

description = soup.find(
    'div', {'class': 'description-preview body-2 css-1pbvugb'}).text
print(description)

Upvotes: 0

Answers (3)

αԋɱҽԃ αмєяιcαη

Reputation: 11525

If that's your only target on the page, you don't need a full HTML parser, since a parser loads and parses the entire document in memory.

You can compare the running time of the regex approach against the bs4 parser.

Below is a quick way to grab it:

import re
import requests

r = requests.get(
    'https://www.nike.com/gb/t/air-max-viva-shoe-ZQTSV8/DB5268-003')

match = re.search(r'descriptionPreview\":\"(.+?)\"', r.text).group(1)
print(match)

Output:

Designed with every woman in mind, the mixed material upper of the Nike Air Max Viva 
features a plush collar, detailed patterning and intricate stitching. The new lacing 
system uses 2 separate laces constructed from heavy-duty tech chord, letting you find the perfect fit. Mixing comfort with style, it combines Nike Air with a lifted foam 
heel for and unbelievable ride that looks as good as it feels.
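
If the page source changes, `re.search` returns `None` and calling `.group(1)` on it raises `AttributeError`. Also, the lazy `(.+?)` stops at the first quote, even an escaped one. Below is a minimal sketch of a more defensive variant, run against a made-up sample string rather than the live page. The helper name `extract_description` and the sample data are assumptions for illustration only. Since the captured value is a JSON string literal, it is decoded with `json.loads` so escapes such as `\u0026` come out as real characters.

```python
import json
import re

# Hypothetical sample imitating the JSON embedded in the page source.
SAMPLE = '{"title":"Shoe","descriptionPreview":"Comfort \\u0026 style, with a \\"lifted\\" heel.","color":"Black"}'

# Capture the JSON string body, allowing escaped characters such as \" inside it.
PATTERN = re.compile(r'"descriptionPreview":"((?:[^"\\]|\\.)*)"')


def extract_description(page_text):
    """Return the decoded descriptionPreview value, or None if it is absent."""
    match = PATTERN.search(page_text)
    if match is None:
        return None
    # Re-wrap the capture in quotes so json.loads decodes \u0026, \" and so on.
    return json.loads('"' + match.group(1) + '"')


print(extract_description(SAMPLE))
print(extract_description("no description here"))
```

With the real page, `r.text` from the snippet above would be passed in place of `SAMPLE`.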

If you would rather use bs4, here's a short example:

from bs4 import BeautifulSoup

# Reuses the response `r` from the snippet above.
soup = BeautifulSoup(r.text, 'lxml')
print(soup.select_one('.description-preview').p.string)

Note: I used the lxml parser, as the bs4 documentation recommends it for speed.
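
The timing comparison mentioned earlier could be sketched roughly as follows. This uses a small hypothetical HTML string instead of the live page, so the numbers only illustrate the method; real timings depend on the actual document size. It also uses the built-in `html.parser` so the sketch runs without lxml installed.

```python
import re
import timeit

from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page source.
HTML = (
    '<html><body><script>{"descriptionPreview":"A short description."}</script>'
    '<div class="description-preview body-2"><p>A short description.</p>'
    '<ul><li>Colour Shown: Black</li><li>Style: DB5268-003</li></ul></div>'
    '</body></html>'
)


def with_regex():
    return re.search(r'descriptionPreview":"(.+?)"', HTML).group(1)


def with_bs4():
    # html.parser ships with Python; swap in 'lxml' if it is installed.
    soup = BeautifulSoup(HTML, 'html.parser')
    return soup.select_one('.description-preview').p.string


# Both approaches should agree before their timings are worth comparing.
assert with_regex() == with_bs4()

for func in (with_regex, with_bs4):
    seconds = timeit.timeit(func, number=1000)
    print(f'{func.__name__}: {seconds:.4f}s for 1000 runs')
```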

Upvotes: 1

user9706

Reputation:

It seems you want the text of the next <p>:

description = soup.find('div', {'class':'description-preview body-2 css-1pbvugb'}).find_next('p').text
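
To see why the original code picks up extra data, here is a small sketch on hypothetical markup modelled on what the question describes (the real page structure is an assumption). Calling `.text` on the `div` concatenates the text of every descendant, while narrowing to the `<p>` returns only the description.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a description paragraph followed by colour and style items.
HTML = (
    '<div class="description-preview body-2 css-1pbvugb">'
    '<p>Designed with every woman in mind.</p>'
    '<ul><li>Colour Shown: Black</li><li>Style: DB5268-003</li></ul>'
    '</div>'
)

soup = BeautifulSoup(HTML, 'html.parser')
div = soup.find('div', {'class': 'description-preview body-2 css-1pbvugb'})

whole_text = div.text                     # text of the <p> and of every <li>
paragraph_only = div.find_next('p').text  # text of the first <p> only

print(whole_text)
print(paragraph_only)
```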

Upvotes: 1

Arundeep Chohan

Reputation: 9969

Just chain .find("p") after it.

description = soup.find('div', {'class':'description-preview body-2 css-1pbvugb'}).find("p").text

Upvotes: 1
