Damdam31

Reputation: 35

Nothing returned when scraping product data using BS4 and Requests in Python 3

I hope you're all doing well. I am trying to scrape a specific product from https://footlocker.fr to get product data such as the available sizes. The problem is that each time I run my script, nothing is printed. Thank you in advance!

import requests
from bs4 import BeautifulSoup
url = 'https://www.footlocker.fr/fr/p/jordan-1-mid-bebe-chaussures-69677?v=316161155904'

page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')

name_box = soup.find_all('div', attrs={'class':'fl-product-details--headline'})
size = soup.find_all('div', attrs={'class':'fl-size-316161155904-UE-21'})

# Pair each headline with a size element; prints nothing if either list is empty
for name, size_tag in zip(name_box, size):
    name_proper = name.text.strip()
    size_proper = size_tag.text.strip()
    print(name_proper, '-', size_proper)

Upvotes: 0

Views: 458

Answers (2)

bogster

Reputation: 36

  • name_box is empty because you search for a <div>, but the element that carries the class fl-product-details--headline is an <h1>
  • size is empty because, as @Sri pointed out, AJAX requests insert that information into the page after the initial request
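A minimal sketch of the first point, run against a hand-written HTML fragment rather than the live page. The fragment and the product name in it are assumptions for illustration; only the tag and class pairing comes from the page described above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the product page: the headline class sits on an <h1>.
html = '<h1 class="fl-product-details--headline"> Jordan 1 Mid </h1>'
soup = BeautifulSoup(html, 'html.parser')

# Searching for a <div> with that class finds nothing...
as_div = soup.find_all('div', attrs={'class': 'fl-product-details--headline'})
# ...while searching for an <h1> finds the headline.
as_h1 = soup.find_all('h1', attrs={'class': 'fl-product-details--headline'})

print(len(as_div))  # 0
print([tag.text.strip() for tag in as_h1])
```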

Upvotes: 0

Sri

Reputation: 2328

Okay. So I found a solution, but it is far from ideal. It is for the following link: https://www.footlocker.fr/fr/p/jordan-1-mid-bebe-chaussures-69677?v=316160178204. If you look at the resulting HTML in page.content, you will notice that the size details are not there. Reading through it, you will see several references to AJAX, which leads me to believe the page makes an AJAX call to pull the information in and then parses it. (This is expected behaviour, as the stock of items can change over time.)

There are two ways to get your data.

  1. You know the URL you are trying to fetch data from. The value after v= is the SKU of the product. For example, if the SKU is 316160178204 you can directly make a request to https://www.footlocker.fr/INTERSHOP/web/FLE/Footlocker-Footlocker_FR-Site/fr_FR/-/EUR/ViewProduct-ProductVariationSelect?BaseSKU=316160178204&InventoryServerity=ProductDetail

  2. For each URL you request, locate the DIV with class fl-load-animation and read its data-ajaxcontent-url attribute. For this product, that attribute is https://www.footlocker.fr/INTERSHOP/web/FLE/Footlocker-Footlocker_FR-Site/fr_FR/-/EUR/ViewProduct-ProductVariationSelect?BaseSKU=316160178204&InventoryServerity=ProductDetail
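Both options could be sketched roughly as follows. The function names are made up for illustration. For option 2 the sketch looks the element up by its data-ajaxcontent-url attribute rather than by class, and assumes the first matching URL that mentions ProductVariationSelect is the size block, which may not hold on the live page.

```python
from urllib.parse import urlparse, parse_qs
from bs4 import BeautifulSoup

# Endpoint copied from the answer; "InventoryServerity" is the site's own spelling.
AJAX_BASE = ('https://www.footlocker.fr/INTERSHOP/web/FLE/'
             'Footlocker-Footlocker_FR-Site/fr_FR/-/EUR/'
             'ViewProduct-ProductVariationSelect')


def sku_from_product_url(product_url):
    """Return the value of the v= query parameter (the SKU)."""
    query = parse_qs(urlparse(product_url).query)
    return query['v'][0]


def ajax_url_from_sku(sku):
    """Option 1: build the AJAX URL directly from the SKU."""
    return f'{AJAX_BASE}?BaseSKU={sku}&InventoryServerity=ProductDetail'


def ajax_url_from_page(page_html):
    """Option 2: read data-ajaxcontent-url from the product page HTML."""
    soup = BeautifulSoup(page_html, 'html.parser')
    for tag in soup.find_all(attrs={'data-ajaxcontent-url': True}):
        url = tag['data-ajaxcontent-url']
        if 'ProductVariationSelect' in url:
            return url
    return None  # no matching element in this HTML
```

With either URL in hand, a call such as requests.get(ajax_url) should return the response that contains the size markup.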

Now make a request to this new URL. Somewhere in the JSON response, you will see values such as

<button class=\"fl-product-size--item fl-product-size--item__not-available\" type=\"button\"\n\n>\n<span>20</span>\n</button>
<button class=\"fl-product-size--item\" type=\"button\"\n\ndata-form-field-target=\"SKU\"\ndata-form-field-base-css-name=\"fl-product-size--item\"\ndata-form-field-value=\"316160178204050\"\ndata-form-field-unselect-group\n\ndata-testid=\"fl-size-316160178204-UE-21\"\ndata-product-size-select-item=\"316160178204050\"\n\n>\n<span>21</span>\n</button>

You will have to parse this snippet of data (I think you can use BeautifulSoup for it). A button has the class fl-product-size--item__not-available if the size is not available, and the size value is in the span.
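A rough sketch of that parsing step. I don't know which key of the JSON holds the markup, so the hypothetical helper find_size_html simply walks the decoded JSON and returns the first string that mentions the size-button class; parse_sizes (also a made-up name) then reads each button. The sample payload and its "content" key are invented, and only the button markup is taken from the snippet above.

```python
import json
from bs4 import BeautifulSoup


def find_size_html(node):
    """Walk decoded JSON and return the first string containing size buttons."""
    if isinstance(node, str):
        return node if 'fl-product-size--item' in node else None
    if isinstance(node, dict):
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_size_html(child)
        if found is not None:
            return found
    return None


def parse_sizes(fragment):
    """Return (size, available) pairs from the size-button markup."""
    soup = BeautifulSoup(fragment, 'html.parser')
    sizes = []
    for button in soup.find_all('button', class_='fl-product-size--item'):
        label = button.find('span')
        if label is None:
            continue
        classes = button.get('class', [])
        available = 'fl-product-size--item__not-available' not in classes
        sizes.append((label.get_text(strip=True), available))
    return sizes


# Hypothetical payload: the "content" key is an assumption, not the real response shape.
markup = (
    '<button class="fl-product-size--item fl-product-size--item__not-available" '
    'type="button"\n\n>\n<span>20</span>\n</button>\n'
    '<button class="fl-product-size--item" type="button"\n\n'
    'data-testid="fl-size-316160178204-UE-21"\n\n>\n<span>21</span>\n</button>'
)
payload = json.loads(json.dumps({'content': markup}))
sizes = parse_sizes(find_size_html(payload))
print(sizes)
```

Against the live site, payload would instead come from something like requests.get(ajax_url).json(), assuming the response really is JSON as described.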

Upvotes: 1
