Reputation: 23
I'm trying to get the names of the bags listed on this page: http://www.barneys.com/barneys-new-york/women/bags. So far, I have this code:
from urllib.request import urlopen
from bs4 import BeautifulSoup
url="http://www.barneys.com/barneys-new-york/women/bags"
html = urlopen(url)
bsObj = BeautifulSoup(html.read(),"html.parser")
product_name = bsObj.findAll("a",{"class":"name-link"})
print(product_name)
I tried calling renderContents() and get_text() on the result, but both raise an AttributeError.
Upvotes: 2
Views: 2518
Reputation: 180481
The names are in the product-name divs:
from bs4 import BeautifulSoup
import requests
soup = BeautifulSoup(requests.get("http://www.barneys.com/barneys-new-york/women/bags").content, "html.parser")
print([prod.text.strip() for prod in soup.select("div.product-name")])
Which gives you:
['Lizard iPhone® 6 Plus Case', 'Lizard iPhone® 6 Case', 'Peekaboo Large Satchel', 'Embellished Shoulder Bag', 'Rockstud Reversible Tote', 'Hava Shoulder Bag', 'Rockstud Crossbody', 'PS1 Tiny Shoulder Bag', 'Faye Medium Shoulder Bag', 'Rockstud Crossbody', 'Large Shopper Tote', 'Flat Clutch', 'Wicker Small Crossbody', 'Jotty Duffel', 'P.Y.T. Shoulder Bag', 'Hadley Baby Satchel', 'Beckett Small Crossbody', 'Squarit PM Satchel', 'Double Baguette Micro', 'City Victoria Small Satchel', 'Large Zip Pouch', 'Jotty Duffel', 'Jen Small Crossbody', 'Mini Trouble Shoulder Bag', 'Midi Clutch', 'Midi Clutch', 'Two For One Pouch 10', 'Guitar Rockstud Medium Backpack', 'Embellished Large Messenger', 'Papier A4 Side-Zip Tote', 'Nightingale Micro-Satchel', 'Hand-Carved Atlas Clutch', 'Emerald-Cut Minaudière', 'Trouble II Shoulder Bag', 'Intrecciato Olimpia Small Shoulder Bag', 'Rockstud Large Tote', 'Baguette Micro', 'Bindu Small Clutch', 'Emerald-Cut Minaudière', 'Gotham City Hobo', 'Brillant Sellier PM Satchel', 'Flight Weekender Duffel', 'Sac Mesh Bucket Bag', 'Seema Small Satchel', 'Madison Shoulder Bag', 'Sporty Smiley Crossbody', 'Monogram Large Wallet', 'Monogram Card Case']
If you want all the info, you can get it from the anchor tags with the thumb-link class inside the div with the id primary:
print(soup.select("#primary a.thumb-link"))
Which gives you output like:
<a class="thumb-link" href="http://www.barneys.com/vianel-lizard-iphone%C2%AE-6-plus-case-504475332.html" title="Lizard iPhone® 6 Plus Case">
<img alt="Vianel Lizard iPhone® 6 Plus Case" class="gridImg" data-image-alter="http://product-images.barneys.com/is/image/Barneys/504475332_2_detail?$grid_new_fixed$" data-original="http://product-images.barneys.com/is/image/Barneys/504475332_1_tabletop?$grid_new_fixed$" height="370" onerror="this.src='http://demandware.edgesuite.net/aasv_prd/on/demandware.static/Sites-BNY-Site/-/default/dwd89468c5/images/browse_placeholder_image.jpg'" title="Lizard iPhone® 6 Plus Case" width="231"/>
<noscript>
<img alt="Vianel Lizard iPhone® 6 Plus Case" src="http://product-images.barneys.com/is/image/Barneys/504475332_1_tabletop?$grid_new_fixed$" title="Lizard iPhone® 6 Plus Case?$grid_new_fixed$"/>
</noscript>
You can parse the image, title, etc. from each anchor returned.
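As a rough sketch of that parsing step, something like the following should work. It runs against a trimmed inline copy of the output above rather than the live page, and the extract_products helper and the dictionary keys are names I made up for illustration:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup shown above, wrapped in an assumed #primary div
# so the example runs without a network request.
SAMPLE_HTML = """
<div id="primary">
  <a class="thumb-link" href="http://www.barneys.com/vianel-lizard-iphone%C2%AE-6-plus-case-504475332.html" title="Lizard iPhone® 6 Plus Case">
    <img alt="Vianel Lizard iPhone® 6 Plus Case" class="gridImg" data-original="http://product-images.barneys.com/is/image/Barneys/504475332_1_tabletop?$grid_new_fixed$"/>
  </a>
</div>
"""


def extract_products(html):
    """Return one dict per thumb-link anchor (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for a in soup.select("#primary a.thumb-link"):
        img = a.find("img")  # None if the anchor has no image
        products.append({
            "title": a.get("title"),
            "url": a.get("href"),
            "image": img.get("data-original") if img else None,
        })
    return products


products = extract_products(SAMPLE_HTML)
for product in products:
    print(product["title"], product["url"], product["image"])
```

Using .get() instead of indexing means a missing attribute yields None rather than a KeyError, which is usually more forgiving when the page markup is inconsistent.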
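Using your own code, you need to access the .text attribute of each element, as above. findAll returns a list of tags, so calling get_text() on the list itself is what raises the AttributeError: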
product_name = [a.text.strip() for a in bsObj.findAll("a",{"class":"name-link"})]
print(product_name)
Which will give you the same as the first select:
['Lizard iPhone® 6 Plus Case', 'Lizard iPhone® 6 Case', 'Peekaboo Large Satchel', 'Embellished Shoulder Bag', 'Rockstud Reversible Tote', 'Hava Shoulder Bag', 'Rockstud Crossbody', 'PS1 Tiny Shoulder Bag', 'Faye Medium Shoulder Bag', 'Rockstud Crossbody', 'Large Shopper Tote', 'Flat Clutch', 'Wicker Small Crossbody', 'Jotty Duffel', 'P.Y.T. Shoulder Bag', 'Hadley Baby Satchel', 'Beckett Small Crossbody', 'Squarit PM Satchel', 'Double Baguette Micro', 'City Victoria Small Satchel', 'Large Zip Pouch', 'Jotty Duffel', 'Jen Small Crossbody', 'Mini Trouble Shoulder Bag', 'Midi Clutch', 'Midi Clutch', 'Two For One Pouch 10', 'Guitar Rockstud Medium Backpack', 'Embellished Large Messenger', 'Papier A4 Side-Zip Tote', 'Nightingale Micro-Satchel', 'Hand-Carved Atlas Clutch', 'Emerald-Cut Minaudière', 'Trouble II Shoulder Bag', 'Intrecciato Olimpia Small Shoulder Bag', 'Rockstud Large Tote', 'Baguette Micro', 'Bindu Small Clutch', 'Emerald-Cut Minaudière', 'Gotham City Hobo', 'Brillant Sellier PM Satchel', 'Flight Weekender Duffel', 'Sac Mesh Bucket Bag', 'Seema Small Satchel', 'Madison Shoulder Bag', 'Sporty Smiley Crossbody', 'Monogram Large Wallet', 'Monogram Card Case']
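To make the cause of the original AttributeError concrete, here is a small sketch that reproduces it on an inline snippet. The markup and the href values are invented stand-ins for the real page; only the name-link class and the two product names come from the question and output above:

```python
from bs4 import BeautifulSoup

# Invented stand-in markup; the real page structure may differ.
SAMPLE_HTML = """
<div class="product-name"><a class="name-link" href="/a.html"> Peekaboo Large Satchel </a></div>
<div class="product-name"><a class="name-link" href="/b.html"> Flat Clutch </a></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
links = soup.find_all("a", {"class": "name-link"})  # a list-like ResultSet

# Calling a Tag method on the whole result set fails.
try:
    links.get_text()
    raised = False
except AttributeError:
    raised = True

# Calling it on each Tag inside the result set works.
names = [a.get_text(strip=True) for a in links]
print(raised, names)
```

find_all is the current spelling of findAll in Beautiful Soup 4; both return the same kind of result, so the fix is the same either way.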
Upvotes: 1