Reputation: 13
I'm learning a bit of web scraping and I'm having trouble navigating to the element I want.
I tried:
print(container.div.div)
None
Process finished with exit code 0
print(container.div)
<div class="item-badges">
</div>
Process finished with exit code 0
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://www.newegg.com/Video-Cards-Video-Devices/Category/ID-38?Tpk=Graphics%20card'
# Opening connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
# HTML parsing
page_soup = soup(page_html, "html.parser")
# Grabs each product
containers = page_soup.find_all("div", {"class": "item-container"})
container = containers[0]
print(container.a)
Printing container.a takes me to the "item-img" link:
<a class="item-img" href="https://www.newegg.com/gigabyte-geforce-rtx-2070-super-gv-n207sgaming-oc-8gc/p/N82E16814932171?Item=N82E16814932171">
<div class="item-badges">
</div>
<img alt="GIGABYTE GeForce RTX 2070 Super GAMING OC 8G Graphics Card, 3 x WINDFORCE Fans, 8GB 256-Bit GDDR6, GV-N207SGAMING OC-8GC Video Card" class=" lazy-img" data-effect="fadeIn" data-src="//c1.neweggimages.com/NeweggImage/ProductImageCompressAll300/14-932-171-V09.jpg" src="//c1.neweggimages.com/WebResource/Themes/2005/Nest/blank.gif" title="GIGABYTE GeForce RTX 2070 Super GAMING OC 8G Graphics Card, 3 x WINDFORCE Fans, 8GB 256-Bit GDDR6, GV-N207SGAMING OC-8GC Video Card">
</img></a>
Process finished with exit code 0
So if I use print(container.div), I get the div with class="item-badges", but I want the div with class="item-branding".
What would be a good way to get to "item-branding"?
Here is the HTML of one container:
<!--product image-->
<a class="item-img" href="https://www.newegg.com/gigabyte-geforce-rtx-2070-super-gv-n207sgaming-oc-8gc/p/N82E16814932171?Item=N82E16814932171">
<div class="item-badges">
</div>
<img alt="GIGABYTE GeForce RTX 2070 Super GAMING OC 8G Graphics Card, 3 x WINDFORCE Fans, 8GB 256-Bit GDDR6, GV-N207SGAMING OC-8GC Video Card" class=" lazy-img" data-effect="fadeIn" data-src="//c1.neweggimages.com/NeweggImage/ProductImageCompressAll300/14-932-171-V09.jpg" src="//c1.neweggimages.com/WebResource/Themes/2005/Nest/blank.gif" title="GIGABYTE GeForce RTX 2070 Super GAMING OC 8G Graphics Card, 3 x WINDFORCE Fans, 8GB 256-Bit GDDR6, GV-N207SGAMING OC-8GC Video Card">
</img>
</a>
<div class="item-info">
<!--brand info-->
<div class="item-branding">
<a class="item-brand" href="https://www.newegg.com/GIGABYTE/BrandStore/ID-1314">
<img alt="GIGABYTE" class=" lazy-img" data-effect="fadeIn" data-src="//c1.neweggimages.com/Brandimage_70x28//Brand1314.gif" src="//c1.neweggimages.com/WebResource/Themes/2005/Nest/blank.gif" title="GIGABYTE">
</img></a>
<!--rating info-->
<a class="item-rating" href="https://www.newegg.com/gigabyte-geforce-rtx-2070-super-gv-n207sgaming-oc-8gc/p/N82E16814932171?Item=N82E16814932171&SortField=0&SummaryType=0&PageSize=10&SelectedRating=-1&VideoOnlyMark=False&IsFeedbackTab=true#scrollFullInfo" title="Rating + 4"><i class="rating rating-4"></i><span class="item-rating-num">(6)</span></a>
</div>
<!--description info-->
<a class="item-title" href="https://www.newegg.com/gigabyte-geforce-rtx-2070-super-gv-n207sgaming-oc-8gc/p/N82E16814932171?Item=N82E16814932171" title="View Details"><i class="icon-premier icon-premier-xsm"></i>GIGABYTE GeForce RTX 2070 Super GAMING OC 8G Graphics Card, GV-N207SGAMING OC-8GC</a>
<!--promption info-->
<p class="item-promo"><i class="item-promo-icon"></i>Get Control + Wolfenstein: Youngblood w/ purchase, limited offer</p>
<!--feature-->
<ul class="item-features">
<li><strong>Core Clock:</strong> 1815 MHz</li>
<li><strong>Max Resolution:</strong> 7680 x 4320 @ 60 Hz</li>
<li><strong>DisplayPort:</strong> 3 x DisplayPort 1.4</li>
<li><strong>HDMI:</strong> 1 x HDMI 2.0b</li>
<li><strong>Model #: </strong>GV-N207SGAMINGOC-8GC</li>
<li><strong>Item #: </strong>N82E16814932171</li>
</ul>
<div class="item-action">
<!--price-->
<ul class="price has-label-membership ">
<li class="price-was">
</li>
<li class="price-map">
</li>
<li class="price-current">
<span class="price-current-label">
<a aria-label="Premier Price Explaination" class="membership-info membership-popup" data-neg-popid="MembershipPopup" href="javascript:void(0);" name="membership" style="display: inline"><span class="membership-icon"></span><span style="display: none">|</span></a>
</span>$<strong>549</strong><sup>.99</sup> <a class="price-current-num" href="https://www.newegg.com/gigabyte-geforce-rtx-2070-super-gv-n207sgaming-oc-8gc/p/N82E16814932171?Item=N82E16814932171&buyingoptions=New">(2 Offers)</a>
<span class="price-current-range">
<abbr title="to">–</abbr>
</span>
</li>
<li class="price-save ">
<span class="price-save-endtime price-save-endtime-current"></span>
<span class="price-save-endtime price-save-endtime-another" style="display:none;"></span>
</li>
<li class="price-note">
</li>
<li class="price-ship">
Free Shipping
</li>
</ul>
<!--egg point-->
<!--financing-->
<!--button-->
<div class="item-operate hidden-action-button ">
<div class="item-button-area">
<button class="btn btn-mini " onclick="Javascript:Biz.ProductList.Item.add('https://www.newegg.com/gigabyte-geforce-rtx-2070-super-gv-n207sgaming-oc-8gc/p/N82E16814932171?Item=N82E16814932171');" title="View Details" type="button">View Details <i class="fa fa-caret-right"></i></button>
</div>
<!--compare-->
<div class="item-compare-box">
<label class="form-checkbox">
<input autocomplete="off" name="CompareItem" neg-itemnumber="14-932-171" type="checkbox" value="CompareItem_14-932-171" />
<span class="form-checkbox-title">Compare</span>
</label>
</div>
<script type="text/javascript">
Biz.Product.CompareConfig.compareItems.push("14-932-171");
var itemThumbs = new Object();
itemThumbs.itemNumber = "14-932-171";
itemThumbs.imageUrl = "//c1.neweggimages.com/ProductImageCompressAll35/14-932-171-V09.jpg";
Biz.Product.CompareConfig.Thumbs.push(itemThumbs);
</script>
</div>
</div>
</div>
</div>
Here is a screenshot of the website: https://i.sstatic.net/ba6M0.jpg
This is the tutorial I'm following: https://www.youtube.com/watch?v=XQgXKtPSzUI&list=PLLL2cLj8OKvr1bZ6pN3okdka9OPb42cth&index=7&t=0s
Upvotes: 0
Views: 46
Reputation: 6950
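container.div only returns the first div found anywhere inside the container, which here is "item-badges". To reach a specific div, call find on the container with the class you want: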
item_branding_div = container.find('div', {'class': 'item-branding'})
print(item_branding_div)
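To make the difference concrete, here is a minimal sketch that runs offline. The HTML string is a trimmed-down stand-in for one "item-container" from the question (an assumption for illustration, not the live Newegg page), and the variable names are hypothetical. It shows that attribute access such as container.div returns the first matching descendant, while find and select_one can target the class directly.

```python
from bs4 import BeautifulSoup

# Trimmed-down stand-in for one "item-container" (not the live page).
html = """
<div class="item-container">
  <a class="item-img" href="#">
    <div class="item-badges"></div>
  </a>
  <div class="item-info">
    <div class="item-branding">
      <a class="item-brand" href="#"><img title="GIGABYTE"></a>
    </div>
  </div>
</div>
"""

container = BeautifulSoup(html, "html.parser").find("div", class_="item-container")

# Attribute access is shorthand for find(): it returns the first
# matching descendant in document order, here the "item-badges" div.
first_div = container.div
print(first_div["class"])

# find() with a class filter goes straight to the div we want.
branding = container.find("div", class_="item-branding")
print(branding["class"])

# select_one() does the same lookup with a CSS selector.
same_div = container.select_one("div.item-branding")

# find() returns None on a miss, so guard before drilling further.
brand = None
if branding is not None and branding.a is not None and branding.a.img is not None:
    brand = branding.a.img["title"]
print(brand)
```

The None checks matter on real pages: if one product lacks a brand block, chaining attributes on the result of find would raise an AttributeError.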
Upvotes: 2