need_halp

Reputation: 115

Python web scraping: using requests-html to find a specific element and extract its text

I am new to web scraping with Python and am trying to grab the brand name from a website. It is not visible on the page, but I have found the element that holds it:

<span itemprop="Brand" style="display:none;">Revlon</span>

I want to extract the "Revlon" text from the HTML. I am currently using requests-html and have tried selecting the element with a CSS selector and reading its text:

brandname = r.html.find('body > div:nth-child(96) > span:nth-child(2)', first=True).text.strip()

but the lookup returns None, and the call then fails with an error. I am not sure how to extract this element specifically. Any help would be appreciated.

Upvotes: 0

Views: 4679

Answers (2)

MK Developer

Reputation: 36

Try soup.find("span", itemprop="Brand"), which selects the span by its itemprop attribute instead of its position in the page. I think this should work:

from bs4 import BeautifulSoup
import requests


urlpage = 'https://www.boots.com/revlon-colorstay-makeup-for-normal-dry-skin-10212694'

page = requests.get(urlpage)
# parse the html using beautiful soup and store in variable 'soup'
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.find("span", itemprop="Brand").text)
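
The error in the question comes from calling .text on a lookup that returned None. Below is a minimal sketch of the same attribute lookup with a guard for that case. It runs against a static HTML string instead of the live page; SAMPLE_HTML and extract_brand are hypothetical names made up for illustration, and only the span itself is taken from the question.

```python
from bs4 import BeautifulSoup

# Static stand-in for the downloaded page (an assumption for this sketch);
# only the <span> is copied from the question.
SAMPLE_HTML = """
<html><body>
  <div><span itemprop="Brand" style="display:none;">Revlon</span></div>
</body></html>
"""


def extract_brand(html):
    """Return the text of the first span with itemprop="Brand", or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Match on the attribute, not on the element's position in the page.
    tag = soup.select_one('span[itemprop="Brand"]')
    if tag is None:
        # No such element: return None instead of raising AttributeError.
        return None
    return tag.get_text(strip=True)


brand = extract_brand(SAMPLE_HTML)
print(brand)
```

The CSS selector span[itemprop="Brand"] should also be usable with r.html.find(..., first=True) in requests-html, since that method takes CSS selectors. The display:none style does not matter here because the parser reads the markup, not the rendered page.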

Upvotes: 1

Adil kasbaoui

Reputation: 663

Here is a working solution with Selenium:

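from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 style: the driver path is passed through a Service object,
# and the find_element_by_* helpers were deprecated and later removed.
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

website = 'https://www.boots.com/revlon-colorstay-makeup-for-normal-dry-skin-10212694'

driver.get(website)

# The product title begins with the brand, so take its first word
brand_name = driver.find_element(By.XPATH, '//*[@id="estore_product_title"]/h1')

print('brand name: ' + brand_name.text.split(' ')[0])

# Close the browser when done
driver.quit()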

You can also use BeautifulSoup for that:

from bs4 import BeautifulSoup
import requests


urlpage = 'https://www.boots.com/revlon-colorstay-makeup-for-normal-dry-skin-10212694'

# query the website and return the html to the variable 'page'
page = requests.get(urlpage)
# parse the html using beautiful soup and store in variable 'soup'
soup = BeautifulSoup(page.content, 'html.parser')
name = soup.find(id='estore_product_title')
print(name.text.split(' ')[0])

Upvotes: 3
