Reputation: 35
Hope you're all doing well. I am trying to scrape a specific product from https://footlocker.fr to get product data such as the available sizes. The problem is that each time I run my script, nothing is returned. Thank you in advance!
```
import requests
from bs4 import BeautifulSoup

url = 'https://www.footlocker.fr/fr/p/jordan-1-mid-bebe-chaussures-69677?v=316161155904'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')

name_box = soup.find_all('div', attrs={'class': 'fl-product-details--headline'})
size = soup.find_all('div', attrs={'class': 'fl-size-316161155904-UE-21'})

for name, size_tag in zip(name_box, size):
    name_proper = name.text.strip()
    size_proper = size_tag.text.strip()
    print(name_proper, '-', size_proper)
```
Upvotes: 0
Views: 458
Reputation: 36
`name_box` is empty because you search for a `<div>`, and the element that contains the class `fl-product-details--headline` is an `<h1>`.
`size` is empty because, as @Sri pointed out, there are some AJAX requests that insert that information into the page after the first request.
Upvotes: 0
Reputation: 2328
Okay. So I found a solution, but it is far from ideal. It is for the following link: https://www.footlocker.fr/fr/p/jordan-1-mid-bebe-chaussures-69677?v=316160178204. If you look at the resulting HTML in page.content, you will notice that the size details are not there. If you read through it a bit, you will see a number of references to AJAX, which leads me to believe the page makes an AJAX call to pull the information in and then parses it. (This is expected behaviour, as the stock of items can change over time.)
There are two ways to get your data.
First, you know the URL you are trying to fetch data from, and the value after `v=` is the SKU of the product. For example, if the SKU is 316160178204, you can directly make a request to https://www.footlocker.fr/INTERSHOP/web/FLE/Footlocker-Footlocker_FR-Site/fr_FR/-/EUR/ViewProduct-ProductVariationSelect?BaseSKU=316160178204&InventoryServerity=ProductDetail
Second, for each URL you request, you can locate the `<div>` with class `fl-load-animation` and read its `data-ajaxcontent-url` attribute. For this product, that attribute is https://www.footlocker.fr/INTERSHOP/web/FLE/Footlocker-Footlocker_FR-Site/fr_FR/-/EUR/ViewProduct-ProductVariationSelect?BaseSKU=316160178204&InventoryServerity=ProductDetail
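The first approach could be sketched roughly as follows. This is only a sketch: it assumes the `v` query parameter always carries the SKU and that the endpoint keeps the shape shown above. The helper names `sku_from_product_url` and `variation_url` are made up for illustration and are not part of any library.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# Endpoint copied from the answer above; the site may change it at any time.
BASE = ("https://www.footlocker.fr/INTERSHOP/web/FLE/"
        "Footlocker-Footlocker_FR-Site/fr_FR/-/EUR/"
        "ViewProduct-ProductVariationSelect")


def sku_from_product_url(product_url):
    """Return the value of the `v` query parameter (assumed to be the SKU)."""
    query = parse_qs(urlsplit(product_url).query)
    return query["v"][0]


def variation_url(sku):
    """Build the AJAX URL for a SKU (hypothetical helper)."""
    # "InventoryServerity" is spelled this way in the site's own URL.
    params = {"BaseSKU": sku, "InventoryServerity": "ProductDetail"}
    return BASE + "?" + urlencode(params)


product_url = ("https://www.footlocker.fr/fr/p/"
               "jordan-1-mid-bebe-chaussures-69677?v=316160178204")
ajax_url = variation_url(sku_from_product_url(product_url))
print(ajax_url)
```

The resulting `ajax_url` can then be passed to `requests.get` in the same way as the product page URL.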
Now make a request to this new URL. Somewhere in the JSON response, you will see values such as:
<button class=\"fl-product-size--item fl-product-size--item__not-available\" type=\"button\"\n\n>\n<span>20</span>\n</button>
<button class=\"fl-product-size--item\" type=\"button\"\n\ndata-form-field-target=\"SKU\"\ndata-form-field-base-css-name=\"fl-product-size--item\"\ndata-form-field-value=\"316160178204050\"\ndata-form-field-unselect-group\n\ndata-testid=\"fl-size-316160178204-UE-21\"\ndata-product-size-select-item=\"316160178204050\"\n\n>\n<span>21</span>\n</button>
You will have to parse this snippet of data (I think you can use BeautifulSoup for it). A button has the class `fl-product-size--item__not-available` if the size is not available, and the size value is in the `<span>`.
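Parsing the buttons might look something like the sketch below. To stay dependency-free it uses the standard library's `html.parser` rather than BeautifulSoup, which would work just as well. It assumes the HTML fragment has already been pulled out of the JSON response and unescaped (the JSON key that holds it is not shown above, so that step is left out). The sample markup is a shortened copy of the two buttons quoted earlier, and `SizeParser` is a made-up name.

```python
from html.parser import HTMLParser


class SizeParser(HTMLParser):
    """Collect (size, available) pairs from the size <button> elements."""

    def __init__(self):
        super().__init__()
        self.sizes = []
        self._in_button = False
        self._available = False

    def handle_starttag(self, tag, attrs):
        if tag != "button":
            return
        classes = (dict(attrs).get("class") or "").split()
        if "fl-product-size--item" in classes:
            self._in_button = True
            # The extra "__not-available" class marks sizes that are out of stock.
            self._available = (
                "fl-product-size--item__not-available" not in classes
            )

    def handle_endtag(self, tag):
        if tag == "button":
            self._in_button = False

    def handle_data(self, data):
        text = data.strip()
        # The only non-blank text inside a size button is the size in its <span>.
        if self._in_button and text:
            self.sizes.append((text, self._available))


# Shortened, unescaped copy of the snippet from the answer.
html = '''
<button class="fl-product-size--item fl-product-size--item__not-available" type="button"

>
<span>20</span>
</button>
<button class="fl-product-size--item" type="button"

data-form-field-target="SKU"
data-form-field-value="316160178204050"
data-form-field-unselect-group

data-testid="fl-size-316160178204-UE-21"
>
<span>21</span>
</button>
'''

parser = SizeParser()
parser.feed(html)
parser.close()
for size, available in parser.sizes:
    print(size, "available" if available else "not available")
```

With BeautifulSoup, the equivalent would be to select every `button` with class `fl-product-size--item` and check its class list in the same way.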
Upvotes: 1