user2728494

Reputation: 131

exceptions.TypeError: __init__() takes exactly 1 argument (3 given)

I'm creating a custom dupe filter by inheriting from RFPDupeFilter. Here is the link to the code I'm using: https://github.com/j4s0nh4ck/wiki-spider/blob/master/wiki/wiki/SeenURLFilter.py

Note: I have the above code in a custom file named custom_filters.py in the same directory as settings.py. Then in settings.py I have this:

DUPEFILTER_CLASS = 'myspider.custom_filters.SeenURLFilter'

But when I run the bot, I get this error:

exceptions.TypeError: __init__() takes exactly 1 argument (3 given)

Upvotes: 1

Views: 673

Answers (1)

alecxe

Reputation: 474001

As you can see in the traceback, the from_settings() method of your filter is called - it creates an instance of your custom dupe filter. But since you don't define your own from_settings() method, the one from the built-in RFPDupeFilter is used:

@classmethod
def from_settings(cls, settings):
    debug = settings.getbool('DUPEFILTER_DEBUG')
    return cls(job_dir(settings), debug)

which instantiates your custom dupe filter with path and debug constructor arguments, but your SeenURLFilter constructor does not accept a debug argument.

You need your dupe filter to accept the debug parameter as well:

from scrapy.dupefilter import RFPDupeFilter

class SeenURLFilter(RFPDupeFilter):
    """A dupe filter that considers the URL"""

    def __init__(self, path=None, debug=False):  # match the base class signature
        self.urls_seen = set()
        RFPDupeFilter.__init__(self, path, debug)  # pass both arguments through

    def request_seen(self, request):
        if request.url in self.urls_seen:
            return True
        else:
            self.urls_seen.add(request.url)
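
To see why the constructor signature matters, here is a minimal stand-in sketch that requires no Scrapy install. BaseDupeFilter, BrokenFilter, FixedFilter, and the seen() helper are all hypothetical names made up for this illustration; BaseDupeFilter only mimics the relevant parts of RFPDupeFilter (a two-argument constructor plus a from_settings() classmethod that always passes both arguments):

```python
# Hypothetical stand-in for RFPDupeFilter, reduced to the parts
# relevant to this error (not real Scrapy code).
class BaseDupeFilter:
    def __init__(self, path=None, debug=False):
        self.path = path
        self.debug = debug

    @classmethod
    def from_settings(cls, settings):
        # Like the built-in classmethod: always calls cls(path, debug)
        # with two positional arguments.
        return cls(settings.get('JOBDIR'), settings.get('DUPEFILTER_DEBUG', False))


class BrokenFilter(BaseDupeFilter):
    def __init__(self):  # takes no extra arguments
        self.urls_seen = set()


class FixedFilter(BaseDupeFilter):
    def __init__(self, path=None, debug=False):  # matches the base signature
        self.urls_seen = set()
        BaseDupeFilter.__init__(self, path, debug)

    def seen(self, url):
        # Same idea as request_seen(): True only for repeat URLs.
        if url in self.urls_seen:
            return True
        self.urls_seen.add(url)
        return False


settings = {'DUPEFILTER_DEBUG': False}

try:
    BrokenFilter.from_settings(settings)  # cls(path, debug) vs. __init__(self)
except TypeError as exc:
    print('TypeError:', exc)  # the same failure mode as in the question

f = FixedFilter.from_settings(settings)
print(f.seen('http://example.com'))  # False - first visit
print(f.seen('http://example.com'))  # True - duplicate
```

The two-argument call inside from_settings() is fixed by the framework, so any subclass it instantiates has to accept both path and debug, even if it ignores them.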

Upvotes: 1
