How to use list of RegEx's when defining LxmlLinkExtractor rule

Question

I would like to know how I can define a list of RegEx's outside of my Scrapy spider, and then read the RegEx's into a LxmlLinkExtractor.

I'm currently using the following code:

def load_regexs():  # wrapper name assumed; the original shows only the body
    with open("myFile.txt") as file:
        regexs = [rule.strip() for rule in file.readlines()]
    return regexs

The returned value is then passed as a parameter as follows:

Rule(LinkExtractor(allow=(regexs, )), callback='parse_file')

This results in the following error:

TypeError: unhashable type: 'list'

advance512 · Accepted Answer

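Pass the list directly as allow instead of wrapping it in a tuple. Writing allow=(regexs, ) hands the extractor a tuple whose only element is a list, and that nested list is what triggers the TypeError. Also note that callback is an argument of Rule, not of LinkExtractor. This should work:

regexs = [rule.strip() for rule in file.readlines()]
Rule(LinkExtractor(allow=regexs), callback='parse_file')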
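
For keeping the patterns outside the spider, a small loader along these lines may help. This is only a sketch: the function name load_regexes is made up for illustration, and it assumes the file holds one pattern per line. It skips blank lines (an empty pattern would match every URL) and compiles each pattern once so that a malformed line fails at load time rather than during the crawl.

```python
import re
from pathlib import Path


def load_regexes(path):
    """Return the non-empty, stripped lines of `path` as a flat list of patterns.

    Hypothetical helper, not part of Scrapy.
    """
    patterns = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        pattern = line.strip()
        if not pattern:
            continue  # skip blank lines so no empty pattern is passed on
        re.compile(pattern)  # raises re.error early if the line is not a valid regex
        patterns.append(pattern)
    return patterns


# Intended use inside the spider (needs Scrapy, so left as a comment here):
#   rules = (
#       Rule(LinkExtractor(allow=load_regexes("myFile.txt")), callback="parse_file"),
#   )
```

The function returns a flat list of strings, so it can be handed to allow as-is, with no extra tuple or list around it.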
