Reputation: 1
I am currently working on a web scraper that should extract every item's description from an entire category on Amazon. I am writing the script in Python with Selenium and the PhantomJS driver. How can I bypass the 400-page limit?
Upvotes: 0
Views: 838
Reputation: 36
Amazon doesn't offer access to this data in its API. It only exposes information to "Pro sellers" (not standard sellers), and only about their own sales, shipping, or products (you can find details on the Amazon marketplace Feed API page).
The only way I could find to do it is to iterate through the category pages. Start on the category page you're interested in, retrieve the description, price, and so on, and then have your web scraper look for an element with the id "pagnNextLink". Load the page that element links to and repeat the process until the element can no longer be found.
And remember that you must iterate through these pages one by one (you can't jump to a different page by altering the "sr_pg_" parameter in the link), because Amazon includes references to the session in its links, and the link is regenerated on every new page.
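A minimal sketch of that loop, in case it helps. It assumes the Selenium Python bindings; the function names, the `max_pages` safety cap, and the `.item-description` CSS selector are hypothetical placeholders, so substitute whatever selector matches the markup you actually see. Only the "pagnNextLink" id comes from the page itself. The next URL is always read from the page that was just loaded, never constructed by hand.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A loaded page is reduced to (item descriptions, URL of the next page or None).
Page = Tuple[List[str], Optional[str]]


def crawl_category(load_page: Callable[[str], Page],
                   start_url: str,
                   max_pages: int = 10000) -> Iterator[str]:
    """Yield descriptions page by page, following each page's own next link."""
    url: Optional[str] = start_url
    seen = set()   # guards against a next link that points back to a visited page
    pages = 0
    while url and url not in seen and pages < max_pages:
        seen.add(url)
        items, url = load_page(url)   # next URL comes from the freshly loaded page
        pages += 1
        yield from items


def make_selenium_loader(driver, item_selector: str = ".item-description"):
    """Build a load_page function around a Selenium driver.

    item_selector is a hypothetical placeholder, not Amazon's real markup.
    """
    # Imported here so crawl_category can be used without Selenium installed.
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    def load_page(url: str) -> Page:
        driver.get(url)
        items = [el.text for el in driver.find_elements(By.CSS_SELECTOR, item_selector)]
        try:
            next_url = driver.find_element(By.ID, "pagnNextLink").get_attribute("href")
        except NoSuchElementException:
            next_url = None   # no next link: this was the last page
        return items, next_url

    return load_page
```

You would then call something like `for text in crawl_category(make_selenium_loader(driver), category_url): ...`, where `driver` and `category_url` are your own objects. Keeping the loop separate from the driver also lets you try the pagination logic against canned pages before pointing it at the real site.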
Upvotes: 1