user3342874

Reputation: 21

Scrapy spider not showing whole result

Hi all, I am trying to get all the results from the link given in the code, but my code does not return all of them. The page says it contains 2132 results, but the spider returns only 20:

from scrapy.spider import Spider
from scrapy.selector import Selector
from tutorial.items import Flipkart

class Test(Spider):
    name = "flip"
    allowed_domains = ["flipkart.com"]
    start_urls = [
        "http://www.flipkart.com/mobiles/pr?sid=tyy,4io&otracker=ch_vn_mobile_filter_Mobile%20Brands_All"
    ]

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath('//div[@class="pu-details lastUnit"]')
        items = []
        for site in sites:
            item = Flipkart()
            item['title'] = site.xpath('div[1]/a/text()').extract()
            items.append(item)
        return items

Upvotes: 2

Views: 421

Answers (1)

imiric

Reputation: 9020

That is because the site only shows 20 results at a time, and loading of more results is done with JavaScript when the user scrolls to the bottom of the page.

You have two options here:

  • Find a link on the site which shows all results on a single page (doubtful it exists, but some sites may do so when passed an optional query string, for example).
  • Handle JavaScript events in your spider. The default Scrapy downloader doesn't execute JavaScript, so you can either analyze the JS code and send the event signals yourself programmatically, or use something like Selenium with PhantomJS to let a browser deal with it. I'd recommend the latter, since it's more robust than interpreting the JS by hand. See this question for more information, and search around; there's plenty of information on this topic.
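As a rough illustration of the Selenium route, here is a minimal sketch that keeps scrolling to the bottom of the page until its height stops growing, then hands the rendered HTML to a Scrapy Selector. The names scroll_to_end and fetch_titles, the pause and max_rounds values, and the choice of Firefox are my own assumptions, not something the site or Scrapy prescribes. It also assumes that new results make the page taller; a site that needs a "show more" click instead would need a different loop.

```python
import time


def scroll_to_end(driver, pause=1.0, max_rounds=200):
    """Scroll down until the page height stops growing.

    Returns the number of scrolls that made the page taller.
    `driver` is any object with Selenium's execute_script(script) method.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page's JavaScript time to append results
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            break  # nothing new was loaded, so assume we reached the end
        last_height = new_height
        rounds += 1
    return rounds


def fetch_titles(url):
    """Usage sketch; needs selenium, scrapy and a browser driver installed."""
    from selenium import webdriver
    from scrapy.selector import Selector

    driver = webdriver.Firefox()  # assumption: use whichever browser you have
    try:
        driver.get(url)
        scroll_to_end(driver)
        sel = Selector(text=driver.page_source)
        # Same XPath as in the question, applied to the fully loaded page.
        return sel.xpath(
            '//div[@class="pu-details lastUnit"]/div[1]/a/text()'
        ).extract()
    finally:
        driver.quit()
```

Calling fetch_titles with the start URL from the question should then return titles from everything the page loaded, not just the first 20, provided the page really loads more on scroll. A fixed pause is the simplest thing that can work; an explicit wait on the number of result elements would be less fragile on a slow connection.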

Upvotes: 1
