Reputation: 1481
Is there a way in Scrapy to not follow <a> tags pointing to images?
For example:
<a href="http://jamsphere.com/wp-content/uploads/2015/11/Franki-Dennull-PROFILE.jpg">
My code at the moment:
for url in set(response.xpath('//a/@href').extract()):
    yield scrapy.Request(response.urljoin(url), callback=self.parse)
Obviously I can add a hard-coded check, but I was wondering whether there is a built-in option.
Upvotes: 1
Views: 343
Reputation: 1879
Use a LinkExtractor; by default it filters out links with common image, video, audio, and other file extensions.
The ignored extensions are listed in scrapy.linkextractors.IGNORED_EXTENSIONS, which is the default value of the deny_extensions argument.
Upvotes: 2