Scrapy: don't follow links to images

Question

Is there a way in Scrapy to avoid following <a> tags that point to images?

For example:

My code at the moment:

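# Collect each distinct href once, then resolve it against the page URL.
for url in set(response.xpath('//a/@href').getall()):
    yield scrapy.Request(response.urljoin(url), callback=self.parse)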

Obviously I can add a hard-coded check, but I was wondering if there is a built-in option?

Guillaume · Accepted Answer

Use a LinkExtractor. By default it filters out links with common image, video, audio, and other file extensions.

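The ignored extensions are listed in IGNORED_EXTENSIONS in scrapy.linkextractors, which is the default value of the deny_extensions argument.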
