Mmelo

Reputation: 41

Web scraping when the URL doesn't change

I'm scraping an Amazon seller profile like this one: https://www.amazon.es/sp?_encoding=UTF8&asin=B07KS22WVT&isAmazonFulfilled=1&isCBA=&marketplaceID=A1RKKUPIHCS9HS&orderID=&seller=A1KD8FXP0BE5W2&tab=&vasStoreID=

I'm using PHP and Goutte. The problem is that in the comment section, when I click on "Siguiente" (Next), the URL doesn't change, so I can't scrape the next page of comments.

I saw that Goutte supports clicking on links, so I tried:

$link = $crawler->selectLink('Siguiente')->link();
$crawler = $client->click($link);

but it doesn't work. Is there any other solution?

Upvotes: 1

Views: 112

Answers (1)

PtrTon

Reputation: 3835

Goutte can only load pages that are rendered server-side (with PHP, for instance). Anything that changes without a new page load is probably done with JavaScript, which Goutte does not execute. You could look at this question. It's probably better to use something like PhantomJS for crawling, as a lot of pages depend on JavaScript.
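
One rough way to check whether a link falls into the "done with JavaScript" category is to look at its href: if it is empty, only a fragment, or a javascript: pseudo-URL, following it over plain HTTP cannot produce a new page. The sketch below uses PHP's built-in DOMDocument on made-up markup. The HTML snippet and the helper name isJavascriptDrivenHref are assumptions for illustration only, not Amazon's real markup and not part of Goutte.

```php
<?php
// Hypothetical markup: NOT Amazon's real HTML, just a stand-in for a
// pagination control whose "Siguiente" link is wired up by JavaScript.
$html = <<<'HTML'
<ul class="pagination">
  <li><a href="#" id="next-page">Siguiente</a></li>
  <li><a href="/reviews?page=2">Page 2</a></li>
</ul>
HTML;

/**
 * Hypothetical helper: returns true when an href cannot trigger a new
 * page load on its own (empty, fragment-only, or a javascript: pseudo-URL).
 */
function isJavascriptDrivenHref(string $href): bool
{
    $href = trim($href);
    return $href === ''
        || $href[0] === '#'
        || stripos($href, 'javascript:') === 0;
}

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

// Map each link's visible text to whether it looks JavaScript-driven.
$results = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $results[trim($a->textContent)] = isJavascriptDrivenHref($a->getAttribute('href'));
}

foreach ($results as $text => $jsDriven) {
    echo $text, ': ', $jsDriven ? 'needs JavaScript' : 'plain HTTP link', PHP_EOL;
}
```

If the "Siguiente" link on the real page turns out to look like the first case, that would explain why $client->click($link) ends up on the same content: there is no second URL for an HTTP-only client to fetch, and a browser-based tool is needed instead.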

Upvotes: 1
