magexcustomer

Reputation: 35

Advanced Scrapy middleware usage

I want to develop several middlewares to make sure websites get parsed. This is the workflow I have in mind:

I'll create a custom middleware with a process_request function that contains all 5 of these methods. But I can't find how to save the connection type: for example, if Tor doesn't work but a direct connection does, I want to use that setting for all my other scrapes of the same website. How can I save this setting?

Another thing: I have a pipeline that downloads the items' images. Is there a way to use this middleware (ideally with the saved settings) for that pipeline as well?

Thanks in advance for your help.

Upvotes: 0

Views: 451

Answers (1)

R. Max

Reputation: 6710

I think you could use the retry middleware as a starting point:

  1. You could use request.meta["proxy_method"] to keep track of which one you are using

  2. You could reuse request.meta["retry_times"] in order to track how many times you have retried a given method, and then set the value to zero when you change the proxy method.

  3. You could use request.meta["proxy"] to select the proxy server you want via the existing HTTP proxy middleware. You may want to tweak the middleware ordering so that the retry middleware runs before the proxy middleware.
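
The three points above could be combined roughly as in the sketch below. This is only an illustration under several assumptions, not a tested production middleware: the class name FallbackProxyMiddleware, the PROXY_METHODS list, the placeholder proxy URLs and MAX_TRIES_PER_METHOD are all made up for the example. It only reacts to download exceptions (not to HTTP error statuses), and it remembers the working method per domain in memory, so the preference lasts for one crawl process only.

```python
from urllib.parse import urlparse

# Hypothetical ordered list of connection methods; the proxy URLs are placeholders.
PROXY_METHODS = [
    ("direct", None),
    ("tor", "http://127.0.0.1:8118"),  # assumes a local HTTP proxy in front of Tor
    ("paid_proxy", "http://proxy.example.com:8080"),
]
MAX_TRIES_PER_METHOD = 2  # assumed limit before moving on to the next method


class FallbackProxyMiddleware:
    """Downloader middleware sketch: try each method in turn, remember what worked."""

    def __init__(self):
        # domain -> index into PROXY_METHODS of the last method that worked
        self.preferred = {}

    def process_request(self, request, spider):
        domain = urlparse(request.url).netloc
        # New requests start from the method remembered for this domain (or the first).
        index = request.meta.setdefault("proxy_method", self.preferred.get(domain, 0))
        request.meta.setdefault("retry_times", 0)
        proxy_url = PROXY_METHODS[index][1]
        if proxy_url is None:
            request.meta.pop("proxy", None)  # direct connection: no proxy key
        else:
            request.meta["proxy"] = proxy_url
        return None  # let the request continue through the middleware chain

    def process_response(self, request, response, spider):
        # Treat any non-error status as proof that the current method works.
        if response.status < 400 and "proxy_method" in request.meta:
            domain = urlparse(request.url).netloc
            self.preferred[domain] = request.meta["proxy_method"]
        return response

    def process_exception(self, request, exception, spider):
        index = request.meta.get("proxy_method", 0)
        tries = request.meta.get("retry_times", 0) + 1
        if tries >= MAX_TRIES_PER_METHOD:
            # Switch to the next method and reset the counter, as in point 2.
            index, tries = index + 1, 0
        if index >= len(PROXY_METHODS):
            return None  # every method failed; fall back to normal error handling
        retry = request.replace(dont_filter=True)  # bypass the duplicate filter
        retry.meta["proxy_method"] = index
        retry.meta["retry_times"] = tries
        return retry
```

To try it, you would register the class in DOWNLOADER_MIDDLEWARES (for example under a placeholder path such as myproject.middlewares.FallbackProxyMiddleware) with a priority number lower than the HTTP proxy middleware's, which is 750 by default as far as I recall, so that request.meta["proxy"] is set before the proxy middleware sees the request. To keep the preference between runs, the self.preferred dictionary would have to be written to a file or database, which the sketch does not do.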

Upvotes: 1
