tony.crete

Reputation: 229

Scrapy file download using custom path based on item

What I would like to do is pretty basic, I think, but I couldn't find a way to implement it.

I am trying to use the FilesPipeline in Scrapy to download a file (e.g. Image1.jpg) and save it under a path derived from the item that placed the request in the first place (e.g. item.name).


It is pretty similar to this question here, except that I want to pass the item.name (or item.something) field as an argument, so that each file is saved under a custom path depending on item.name.

The path is defined in the persist_file function, but that function does not have access to the item itself, only to the file request and response.

# from Scrapy's FilesPipeline source
def get_media_requests(self, item, info):
    return [Request(x) for x in item.get(self.FILES_URLS_FIELD, [])]

I can also see above that this is where the requests that feed the files into the pipeline are made. Is there a way to pass an extra argument here, so I can use it later in file_downloaded and, after that, in persist_file?

As a last resort, it would be pretty simple to rename/move the file after it has been downloaded, in one of the following pipelines, but that seems sloppy, doesn't it?
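For reference, that fallback could look like the sketch below: a plain post-processing pipeline (the class name MoveFilesPipeline is my own, and it assumes the default files result field plus a name field on the item) that moves each downloaded file from the files store into a per-item subfolder.

```python
import os
import shutil


class MoveFilesPipeline:
    """Hypothetical post-processing pipeline: after FilesPipeline has run,
    move each downloaded file into a subfolder named after the item."""

    FILES_STORE = '/tmp/files'  # assumed; normally read from settings

    def process_item(self, item, spider):
        for entry in item.get('files', []):  # 'files' is the default FILES_RESULT_FIELD
            src = os.path.join(self.FILES_STORE, entry['path'])
            dst = os.path.join(self.FILES_STORE, item['name'],
                               os.path.basename(entry['path']))
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.move(src, dst)
            # keep the item's bookkeeping in sync with the new location
            entry['path'] = os.path.relpath(dst, self.FILES_STORE)
        return item
```

This pipeline would go after the files pipeline in ITEM_PIPELINES, so the 'files' field is already populated when it runs.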

I am using the code implemented here as a custom pipeline.

Can anyone help please? Thank you in advance :)

Upvotes: 2

Views: 1046

Answers (1)

eLRuLL

Reputation: 18799

Create your own pipeline (inheriting from FilesPipeline) and override its process_item method, so that the current item gets passed along to the other functions:

from twisted.internet.defer import DeferredList
from scrapy.utils.misc import arg_to_iter

def process_item(self, item, spider):
    info = self.spiderinfo
    requests = arg_to_iter(self.get_media_requests(item, info))
    # unlike the stock implementation, pass the item as an extra argument
    dlist = [self._process_request(r, info, item) for r in requests]
    dfd = DeferredList(dlist, consumeErrors=1)
    return dfd.addCallback(self.item_completed, item, info)

Then you need to override _process_request as well, and keep passing the item argument down the chain so it is available when the file path is built.

Upvotes: 1
