Tim

Reputation: 1289

Displaying Scrapy Results to User in Django

I'm VERY new to Python and I'm attempting to integrate Scrapy with Django.

Here is what I'm trying to make happen:

  1. User submits URL to be scraped
  2. URL is scraped
  3. Scraped data is displayed on screen to the user
  4. User assigns attributes (if necessary) then saves it to database.

What is the best way to accomplish this? I've played with Django Dynamic Scraper, but I think I'm better off maintaining control over Scrapy for this.

Upvotes: 0

Views: 767

Answers (1)

Guy Gavriely

Reputation: 11396

Holding a Django request open while scraping another website is probably not the best idea. This flow is better done asynchronously: release the Django request and have another process handle the scraping. I guess it's not an easy thing for newcomers to achieve, but bear with me.

The flow should look like this:

  1. The user submits a request to scrape some data from another website.
  2. The spider crawl starts in a different process from Django, and the user's request is released.
  3. The spider's item pipeline writes the items to some data store (a database).
  4. The user keeps polling for that data, and Django responds based on what has been inserted into the data store.
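To make the hand-off between the steps concrete, here is a minimal sketch of the data store contract, using only the standard library's sqlite3 instead of the Django ORM so it runs on its own. The `jobs` table and the function names `init_store`, `submit_job`, `store_items` and `poll_job` are made up for illustration; in a real project the first and last would sit behind Django views, and `store_items` would be called from the spider's item pipeline.

```python
import json
import sqlite3
import uuid


def init_store(conn):
    """Create the hypothetical jobs table shared by Django and the spider."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id TEXT PRIMARY KEY, "
        "url TEXT NOT NULL, "
        "status TEXT NOT NULL, "
        "items TEXT NOT NULL DEFAULT '[]')"
    )
    conn.commit()


def submit_job(conn, url):
    """Step 1: record the request and return right away with a job id."""
    job_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO jobs (id, url, status) VALUES (?, ?, 'pending')",
        (job_id, url),
    )
    conn.commit()
    return job_id


def store_items(conn, job_id, items):
    """Step 3: what the spider side would call once it has scraped items."""
    conn.execute(
        "UPDATE jobs SET status = 'done', items = ? WHERE id = ?",
        (json.dumps(items), job_id),
    )
    conn.commit()


def poll_job(conn, job_id):
    """Step 4: what a polling view would return to the browser."""
    row = conn.execute(
        "SELECT status, items FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
    if row is None:
        return {"status": "unknown", "items": []}
    status, items = row
    return {"status": status, "items": json.loads(items)}
```

The browser would call the polling endpoint every second or so until the status changes from `pending` to `done`, then show the items for the user to edit and save.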

Launching a Scrapy spider can be done straight from Python code, using a tool like Celery (also see Django and Celery), by starting it in a new process with Python's subprocess module, or, even better, by using scrapyd to manage those spiders.
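As a rough sketch of the subprocess option, assuming the `scrapy` command is on the PATH and the project defines a spider that accepts `url` and `job_id` arguments (the spider name, the argument names and `project_dir` are all assumptions here, not something from your project):

```python
import subprocess


def build_crawl_command(spider_name, url, job_id):
    """Build the `scrapy crawl` command line; each -a pair is passed
    to the spider as a spider argument."""
    return [
        "scrapy", "crawl", spider_name,
        "-a", f"url={url}",
        "-a", f"job_id={job_id}",
    ]


def launch_spider(spider_name, url, job_id, project_dir):
    """Start the crawl in a separate process and return without waiting.

    project_dir is assumed to be the directory containing scrapy.cfg.
    """
    return subprocess.Popen(
        build_crawl_command(spider_name, url, job_id),
        cwd=project_dir,
    )
```

`Popen` returns immediately, so a Django view can call `launch_spider` and respond to the user while the crawl continues in the background. Celery or scrapyd give you queuing and monitoring on top of this, which is why they are the better choice once it grows.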

Upvotes: 5
