Reputation: 3209
I have to scrape this page using PHP cURL. On this page, more items are loaded via AJAX when the user scrolls down. Can I call the URL that the AJAX script is calling? If so, how do I figure out that URL? I know a bit of AJAX, but the code there is rather complex for me. Here is the relevant JS code: pastebin
Alternatively, can someone suggest another method of scraping that page? PS: I am doing this for a good cause.
Edit: I figured it out using Live HTTP Headers. The question can be closed or downvoted to oblivion.
Upvotes: 0
Views: 186
Reputation: 15220
You can use Firebug for that. Switch to the Console tab and then trigger the AJAX request by scrolling down the page.
This is what you should see after scrolling to the bottom of the page: http://www.flipkart.com/computers/components/ram-20214?_l=m56QC%20tQahyMi46nTirnSA--&_r=11FxOYiYfpMxmANj4kGJzg--&_pop=flyout&response-type=json&inf-start=20
and if you scroll further: http://www.flipkart.com/computers/components/ram-20214?_l=m56QC%20tQahyMi46nTirnSA--&_r=11FxOYiYfpMxmANj4kGJzg--&_pop=flyout&response-type=json&inf-start=40
The tokens seem to always remain the same: _l=m56QC%20tQahyMi46nTirnSA-- and _r=11FxOYiYfpMxmANj4kGJzg--, and so does the _pop parameter: _pop=flyout
So let's have a look at the other parameters:
This one was for the main page:
//no additional parameters...
this one for the first 'reload':
&response-type=json&inf-start=20
and this one for the second 'reload':
&response-type=json&inf-start=40
So, apparently you just have to append &response-type=json&inf-start=$offset to your initial URI to get the results in JSON format, where $offset grew by 20 with each reload in the requests above. You can also inspect the response contents in Firebug, which should make them very easy to work with.
Here's a screenshot:
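Since the question asks about PHP cURL, here is a minimal sketch of how the pattern above might be used. The function names buildPageUrl and fetchBody are made up for illustration, the step of 20 is only inferred from the two requests shown, and the structure of the returned JSON is not known here, so the decoded result is left for you to inspect.

```php
<?php
// Hypothetical helper: appends the paging parameters observed in Firebug.
// An offset of 0 stands for the main page, which took no additional parameters.
function buildPageUrl(string $baseUrl, int $offset): string
{
    if ($offset <= 0) {
        return $baseUrl;
    }
    // Use '?' if the base URL has no query string yet, otherwise '&'.
    $separator = (strpos($baseUrl, '?') === false) ? '?' : '&';
    return $baseUrl . $separator . 'response-type=json&inf-start=' . $offset;
}

// Hypothetical helper: fetches one URL with cURL.
// Returns the raw response body, or null if the request failed.
function fetchBody(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, if any
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    $body = curl_exec($ch);
    return ($body === false) ? null : $body;
}

// Example usage (commented out so that loading this file makes no requests):
// $base = 'http://www.flipkart.com/computers/components/ram-20214'
//       . '?_l=m56QC%20tQahyMi46nTirnSA--&_r=11FxOYiYfpMxmANj4kGJzg--&_pop=flyout';
// foreach ([20, 40] as $offset) {
//     $body = fetchBody(buildPageUrl($base, $offset));
//     if ($body === null) {
//         break;
//     }
//     $data = json_decode($body, true); // inspect with var_dump($data)
// }
```

Keeping the URL construction separate from the request makes it easy to check the generated URLs against what Firebug shows before sending anything. The _l and _r tokens are copied from the requests above and may stop working if the site changes them.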
Upvotes: 3