gnemoug

Reputation: 467

How can I use Scrapy to build a distributed scraper with Celery?

I want to build a distributed scraper with Scrapy and Celery. My current idea is to use a master-slave architecture. Can someone tell me whether that is a good approach? Is there a good open-source project for this?

Upvotes: 1

Views: 797

Answers (1)

When I implemented a distributed crawling setup, I achieved it with the help of Redis. Here is how I did it.

I had a list of domains to be crawled and uploaded those domains to Redis. In my project, I had 30K domains to scrape data from.
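Seeding Redis with the domain list might look like the following. This is a minimal sketch, not my actual code: it assumes a local Redis instance, a list key named `crawl:start_urls`, and a file `domains.txt` with one domain per line (all hypothetical names).

```python
# Sketch: push each domain onto a Redis list so workers can pull from it.
# Assumes Redis running on localhost and a domains.txt with one domain per line.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

with open("domains.txt") as f:
    for line in f:
        domain = line.strip()
        if domain:
            # Store a crawlable URL for each domain.
            r.rpush("crawl:start_urls", "http://%s/" % domain)
```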

Use the redis-py client to talk to Redis and feed each URL to Scrapy, as sketched below.
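A minimal sketch of a spider that pulls its start URLs from the same Redis list; the key name and the parse logic are assumptions for illustration, not code from my project.

```python
# Sketch: a Scrapy spider that pops URLs from a shared Redis list.
# Key name "crawl:start_urls" is assumed to match the seeding step above.
import redis
import scrapy


class RedisFedSpider(scrapy.Spider):
    name = "redis_fed"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.r = redis.Redis(host="localhost", port=6379, db=0)

    def start_requests(self):
        # LPOP returns None once the list is empty, which ends the crawl.
        while True:
            url = self.r.lpop("crawl:start_urls")
            if url is None:
                break
            yield scrapy.Request(url.decode("utf-8"), callback=self.parse)

    def parse(self, response):
        # Placeholder extraction; replace with your own item logic.
        yield {"url": response.url, "title": response.css("title::text").get()}
```

Each worker machine can run the same spider (e.g. `scrapy runspider` on this file), and because LPOP is atomic in Redis, no two workers receive the same URL, which is what distributes the load.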

Upvotes: 2
