Reputation: 315
I'm using Scrapy for recursive scraping. I wrote my spider so that it follows the "next page" button and scrapes each row on every page. However, the spider only scraped about 80% of the rows I originally expected. I want to view all of the error messages so I can identify the specific content it failed to scrape. I know errors are printed to the Windows cmd window as the spider runs, but there are far too many lines for me to read, and the window only lets me scroll back so far, so it's impossible to see every error message that way. Is there a way to display all error messages? Thanks very much!
Upvotes: 1
Views: 1140
Reputation: 935
There are command-line arguments you can use to log the output:
--logfile FILE : overrides LOG_FILE
--loglevel/-L LEVEL : overrides LOG_LEVEL
You can use them along with your spider with:
scrapy crawl my_spider --logfile myspider.log
Then you can simply search the log file for the errors afterwards.
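If the full log is still too long to scan, one option (my own addition, not something the flags above require) is to combine both flags so that only ERROR-level messages are written to the file:
scrapy crawl my_spider --logfile myspider.log --loglevel ERROR
A settings-based equivalent, as a minimal sketch assuming a standard Scrapy project layout, is to put the same two settings in the project's settings.py so you don't have to pass the flags on every run:
# settings.py -- settings equivalent of the --logfile / --loglevel flags
LOG_FILE = "myspider.log"   # hypothetical file name, pick your own
LOG_LEVEL = "ERROR"         # only ERROR and higher are written
On Windows you can then search the saved file with something like findstr ERROR myspider.log.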
Upvotes: 3