Reputation: 1178
I'm using Python and Selenium to scrape the product titles on a Sephora page.
url = 'https://www.sephora.com/ca/en/shop/face-makeup'
driver.get(url)
time.sleep(2)
browser = scrollDown(driver, 20)
# this selects the div for every product on the page
products = driver.find_elements_by_class_name('css-79elbk')
for product in products:
    title = product.find_elements_by_xpath('/html/body/div[1]/div[2]/div/div/div/div[2]/div[1]/main/div[3]/div/div[1]/div[1]/div[1]/a/div/div[4]/span[2]').text
    print(title)
The problem is that when I run it I get Line 48: AttributeError: 'list' object has no attribute 'text'. The title is in a span that is nested in a div. I've tried the same approach on a plain div with text inside, and it scrapes that without any problem.
Upvotes: 0
Views: 1630
Reputation: 7563
The error appears because of this line:
.find_elements_by_xpath('/html/body/div[1]/div[2]/div/div/div/div[2]/div[1]/main/div[3]/div/div[1]/div[1]/div[1]/a/div/div[4]/span[2]').text
This call returns a list, and a list has no .text attribute. The .text property belongs to a single element, which is what the find_element_* methods (without the s) return.
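The difference can be reproduced without a browser. The sketch below uses a hypothetical FakeElement class as a stand-in for a Selenium element (it is not part of Selenium, and the product names are made up); the point is only that a list has no .text, while each item in it does.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""
    text: str


# find_elements_* returns a list of elements, even when only one matches.
found = [FakeElement("Foundation"), FakeElement("Concealer")]

try:
    found.text  # a list has no .text attribute
except AttributeError as exc:
    error_message = str(exc)

# Read .text from each element in the list instead.
titles = [element.text for element in found]
print(error_message)
print(titles)
```

With a real driver the same pattern applies: either loop over the list returned by find_elements_*, or call a find_element_* method to get a single element.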
A simpler way is to select the title with this CSS selector: div[data-comp="ProductDisplayName "] span[data-at="sku_item_name"]
Try the following code:
url = 'https://www.sephora.com/ca/en/shop/face-makeup'
driver.get(url)
time.sleep(2)
browser = scrollDown(driver, 20)
titles = driver.find_elements_by_css_selector('div[data-comp="ProductDisplayName "] span[data-at="sku_item_name"]')
for title in titles:
    print(title.text)
To scrape the brand, change only the selector: div[data-comp="ProductDisplayName "] span[data-at="sku_item_brand"]
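If the brand and the title are needed together, one option is to loop over each product container and read both spans from it. This is a sketch under assumptions, not code verified against the live page: collect_products is a hypothetical helper, it assumes every ProductDisplayName div contains both spans, and it uses the find_elements(by, value) form because the find_elements_by_* helpers are no longer available in recent Selenium 4 releases. The string "css selector" is the value of By.CSS_SELECTOR.

```python
CSS = "css selector"  # same value as selenium.webdriver.common.by.By.CSS_SELECTOR

CARD = 'div[data-comp="ProductDisplayName "]'
BRAND = 'span[data-at="sku_item_brand"]'
NAME = 'span[data-at="sku_item_name"]'


def collect_products(driver):
    """Return (brand, title) pairs from an already-loaded, scrolled page."""
    pairs = []
    # find_elements (plural) returns a list, so iterate over it.
    for card in driver.find_elements(CSS, CARD):
        # find_element (singular) returns one element, so .text is available.
        brand = card.find_element(CSS, BRAND).text
        name = card.find_element(CSS, NAME).text
        pairs.append((brand, name))
    return pairs
```

Usage would be something like `for brand, name in collect_products(driver): print(brand, name)` after the driver.get and scrollDown calls above.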
Upvotes: 1