Pixelartist

Reputation: 378

Scrapy - 301 redirect in shell

I can't find a solution to the following problem. I am using Scrapy (latest version) and trying to debug a spider. When I run scrapy shell https://jigsaw.w3.org/HTTP/300/301.html, the shell does not follow the redirect (it uses a default spider to fetch the data). When I run my own spider, it follows the 301, but then I can't debug it.

How can I make the shell follow the 301 so that I can debug the final page?

Upvotes: 4

Views: 2086

Answers (1)

Granitosaurus

Reputation: 21406

Scrapy handles redirects with its redirect middleware (RedirectMiddleware), but that is not enabled in the shell. A quick fix:

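scrapy shell "https://jigsaw.w3.org/HTTP/300/301.html"
# then, inside the shell (header values are bytes on Python 3; urljoin also handles a relative Location):
fetch(response.urljoin(response.headers['Location'].decode()))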
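
If it helps to see what following the Location header by hand involves, here is a minimal standard-library sketch. The helper name redirect_target and the sample header value are assumptions made for illustration, not part of Scrapy or of the site's actual response. It decodes a bytes header value and resolves it against the URL of the redirecting response:

```python
from urllib.parse import urljoin


def redirect_target(base_url, location):
    """Return the absolute URL that a Location header value points to.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    # Header values may arrive as bytes; decode them before joining.
    if isinstance(location, bytes):
        location = location.decode("latin-1")
    # urljoin leaves absolute URLs untouched and resolves relative ones
    # against the URL of the response that carried the header.
    return urljoin(base_url, location)


# Illustrative values only; the real Location header may differ.
print(redirect_target("https://jigsaw.w3.org/HTTP/300/301.html",
                      b"/HTTP/300/Overview.html"))
```

Inside the shell, something like fetch(redirect_target(response.url, response.headers['Location'])) should then request the resolved URL. Scrapy responses also provide response.urljoin, which covers the joining step.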

Also, to debug your spider, you probably want to inspect the response it is actually receiving:

from scrapy.shell import inspect_response

# inside your spider class:
def parse(self, response):
    inspect_response(response, self)
    # the crawl pauses here and opens an interactive shell with this response

Upvotes: 10
