Reputation: 154
I've written a Python script to fetch all the ASINs available in a certain node. There are around 1000 ASINs in it, but the approach below fetches only 146 of them. Although the page number changes accordingly when I hit the SHOW MORE button at the bottom of that page, I get the exact same ASINs when I change the page number within my script.
Here is what I've tried so far:
import re
import json
import requests
from bs4 import BeautifulSoup

node = '15529609011'

# Load the store page and read the id of the widget slot (class "stores-widget-btf")
r = requests.get(f'https://www.amazon.com/stores/node/{node}?productGridPageIndex=1')
soup = BeautifulSoup(r.content, 'lxml')
slot_num = soup.select_one('.stores-widget-btf')['id']

# Request that slot and parse the JSON object assigned to "var config"
res = requests.get(f'https://www.amazon.com/stores/slot/{slot_num}?node={node}')
p = re.compile(r'var config = (.*);')
data = json.loads(p.findall(res.text)[0])

asins = data['content']['ASINList']
print(len(asins))
How can I grab all the ASINs available there using requests?
Upvotes: 1
Views: 219
Reputation: 598
The data behind the Show More button is loaded via an AJAX request.
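One option is to use Selenium: drive a real browser, click the button until no more results load, and read the ASINs from the rendered page.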
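A rough sketch of the Selenium route, in case it helps. This is not tested against the live page and rests on a few assumptions: that the button can be found by its visible text "show more" (the XPath below is a guess, not taken from Amazon's markup), that the loaded products appear in the page source as links containing /dp/<ASIN>, and that a fixed pause is long enough for each AJAX response to render. The names extract_asins and collect_asins are made up for this example.

```python
import re
import time

# Assumption: product links in the rendered page contain "/dp/<ASIN>",
# where an ASIN is 10 uppercase alphanumeric characters.
ASIN_RE = re.compile(r"/dp/([A-Z0-9]{10})")

# Hypothetical locator: any button or link whose text contains "show more".
SHOW_MORE_XPATH = (
    "//*[self::button or self::a]"
    "[contains(translate(normalize-space(.), "
    "'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'show more')]"
)


def extract_asins(html):
    """Return the unique ASINs found in html, in order of first appearance."""
    return list(dict.fromkeys(ASIN_RE.findall(html)))


def collect_asins(url, max_clicks=100, pause=2.0):
    """Open url in Chrome, click "Show more" until it is gone, return the ASINs."""
    # Imported here so extract_asins stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        for _ in range(max_clicks):
            buttons = driver.find_elements(By.XPATH, SHOW_MORE_XPATH)
            visible = [b for b in buttons if b.is_displayed()]
            if not visible:
                break  # nothing left to expand
            # A JavaScript click avoids failures when an overlay covers the button.
            driver.execute_script("arguments[0].click();", visible[0])
            time.sleep(pause)  # crude wait for the AJAX response to render
        return extract_asins(driver.page_source)
    finally:
        driver.quit()
```

With the node from the question, the call would look like collect_asins(f'https://www.amazon.com/stores/node/{node}'). The max_clicks cap is only there so the loop cannot run forever if the button never disappears.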
Upvotes: 1