Scrapy is sending response of type None to my custom file pipeline

Question

Using Scrapy, I want to download files and save them under different filenames.

First of all, if I enable the default files pipeline, the files (which may be HTML or PDF) download perfectly fine.

To rename them, I wrote the following class and overrode the file_path method.

import re

from scrapy.pipelines.files import FilesPipeline


class MyCustomFilePipeline(FilesPipeline):
    def file_path(self, request, response=None, info=None, *, item=None):
        # extract 191148 from http://mywebsite.com/filedownload.asp?pn=191148&yr=2022
        pn = re.search(r'(?<=pn\=)\d+', request.url).group()
        print(f'{request.url} - {pn}')
        print(type(response))  # <-- this prints <class 'NoneType'>

        response_contentype = response.headers['Content-Type'].decode('ASCII')
        ext = 'html'
        if response_contentype == 'text/html':
            ext = 'html'
        elif response_contentype == 'application/pdf':
            ext = 'pdf'
        print(f'{pn}.{ext}')  # <-- this is not printed
        return f'{pn}.{ext}'

I enabled it in settings.py. In the console, for each request URL, I get the output of the first two print statements (added to the code above for debugging).

But the type of response is <class 'NoneType'>.

Surprisingly, print(f'{pn}.{ext}') isn't being printed.

No files are being downloaded, and the files field is not populated.

Why isn't Scrapy making the requests and getting responses? What am I missing?

Answers (1)
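
A likely cause, as far as I can tell from the FilesPipeline source: file_path is called more than once per request, and the first call happens before anything is downloaded (to check whether the file is already in the store), so response is still None at that point. Accessing response.headers on None then raises an AttributeError, which would explain why the last print never runs and no file is saved. Below is a minimal sketch of a naming helper that tolerates a missing response. The names build_filename and EXT_BY_TYPE are hypothetical (they are not part of Scrapy), and falling back to html when the type is unknown is an assumption carried over from the default in the question.

```python
import re
from typing import Optional

# Hypothetical mapping from MIME type to file extension (not a Scrapy API).
EXT_BY_TYPE = {"text/html": "html", "application/pdf": "pdf"}


def build_filename(url: str, content_type: Optional[bytes] = None) -> str:
    """Return '<pn>.<ext>' for a URL such as ...filedownload.asp?pn=191148&yr=2022.

    content_type is the raw Content-Type header value, or None when no
    response is available yet (the pre-download call).
    """
    match = re.search(r"(?<=pn=)\d+", url)
    if match is None:
        raise ValueError(f"no pn parameter found in {url!r}")
    pn = match.group()

    # No response yet: fall back to the default extension (assumption).
    if content_type is None:
        return f"{pn}.html"

    # Drop parameters such as '; charset=utf-8' before looking up the type.
    mime = content_type.decode("ascii").split(";")[0].strip().lower()
    return f"{pn}.{EXT_BY_TYPE.get(mime, 'html')}"
```

Inside file_path you could then pass request.url together with response.headers.get('Content-Type') when response is not None, and None otherwise. Note that the pre-download call cannot know the real extension, so the path it checks may differ from the one used when the file is actually saved.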
