Scrapy timeout on certain sites

Question

On my own machine I tried

> scrapy fetch http://google.com/

or

> scrapy fetch http://stackoverflow.com/

worked perfectly, somehow www.flyertalk.com does not play well with scrapy. I keep getting timeout error (180s):

> scrapy fetch http://www.flyertalk.com/

however curl works fine without a hiccup

> curl -s http://www.flyertalk.com/

Very strange. here is the full dump:

2015-11-20 17:35:07 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
2015-11-20 17:35:07 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-11-20 17:35:07 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-11-20 17:35:07 [scrapy] INFO: Enabled item pipelines: 
2015-11-20 17:35:07 [scrapy] INFO: Spider opened
2015-11-20 17:35:07 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-11-20 17:35:07 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6037
2015-11-20 17:36:07 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-11-20 17:37:07 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-11-20 17:38:07 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-11-20 17:38:07 [scrapy] DEBUG: Retrying  (failed 1 times): User timeout caused connection failure: Getting http://www.flyertalk.com took longer than 180.0 seconds..
2015-11-20 17:39:07 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-11-20 17:40:07 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-11-20 17:41:07 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-11-20 17:41:07 [scrapy] DEBUG: Retrying  (failed 2 times): User timeout caused connection failure: Getting http://www.flyertalk.com took longer than 180.0 seconds..

Scrapy timeout on certain sites

Answers (1)

Related Questions