ihadanny

Reputation: 4483

super __str__ isn't getting called

I'm subclassing RandomForestClassifier (from sklearn.ensemble) and trying to print my new estimator:

class my_rf(RandomForestClassifier):
    def __str__(self):
        return "foo_" + RandomForestClassifier.__str__(self) 

Printing an instance gives foo_my_rf()

I also tried:

class my_rf(RandomForestClassifier):
    def __str__(self):
        return "foo_" + super(RandomForestClassifier, self).__str__() 

with the same result. I expected something pretty, like sklearn's default behaviour:

>>> a = RandomForestClassifier()
>>> print a
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
        max_depth=None, max_features='auto', max_leaf_nodes=None,
        min_samples_leaf=1, min_samples_split=2,
        min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
        oob_score=False, random_state=None, verbose=0,
        warm_start=False)
>>>

This is also the result when I use print a.__str__().

What am I missing? Thanks.

Related: How do I change the string representation of a Python class?

Upvotes: 0

Views: 310

Answers (1)

Eli Korvigo

Reputation: 10483

In RandomForestClassifier, both __repr__ and __str__ look up the name of the class of the instance they are called on (self), so calling the superclass method on a my_rf instance still prints my_rf. To show the superclass name, reference it directly.
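The lookup described above can be reproduced without scikit-learn. The sketch below is only an illustration: Base, Child and ChildFixed are hypothetical stand-ins, not sklearn's actual code, and it assumes the parent builds its string from the class of self.

```python
class Base(object):
    """Hypothetical stand-in for an estimator base class (not sklearn's code)."""

    def __repr__(self):
        # The name is taken from the instance's class, not from the class
        # that defines this method.
        return "%s()" % type(self).__name__


class Child(Base):
    def __str__(self):
        # Calling the parent's method explicitly still passes a Child
        # instance as self, so the parent reports "Child".
        return "foo_" + Base.__str__(self)


class ChildFixed(Base):
    def __str__(self):
        # Name the superclass explicitly and keep only the argument list.
        args = Base.__str__(self).split("(", 1)[1]
        return "foo_" + Base.__name__ + "(" + args


print(str(Child()))       # foo_Child()
print(str(ChildFixed()))  # foo_Base()
```

Base does not define __str__ here, so Base.__str__ falls back to object.__str__, which in turn calls __repr__ on the instance.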

Update: This is how you can get your desired output, though I don't see why you would want it. There is a reason why RandomForestClassifier's __str__ and __repr__ return the actual name of the class: that way you can eval the string to restore the object. Anyway,

In [1]: from sklearn.ensemble import RandomForestClassifier
In [2]: class my_rf(RandomForestClassifier):
    def __str__(self):
        superclass_name = RandomForestClassifier.__name__
        return "foo_" + superclass_name + "(" + RandomForestClassifier.__str__(self).split("(", 1)[1]

In [3]: forest = my_rf()
In [4]: print forest
foo_RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None,
   max_features='auto', max_leaf_nodes=None, min_samples_leaf=1,
   min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=10,
   n_jobs=1, oob_score=False, random_state=None, verbose=0,
   warm_start=False)

Update 2: You get no parameters when you override __init__ because the superclass's __str__ and __repr__ are implemented to inspect the parameter names in the signature of __init__. You can see this clearly by running this code:

In [5]: class my_rf(RandomForestClassifier):
    def __init__(self, *args, **kwargs):
        RandomForestClassifier.__init__(self, *args, **kwargs)
    def __str__(self):
        superclass_name = RandomForestClassifier.__name__
        return "foo_" + superclass_name + "(" + RandomForestClassifier.__str__(self).split("(", 1)[1]
In [6]: forest = my_rf()
In [7]: print forest
...
RuntimeError: scikit-learn estimators should always specify their parameters in the signature of their __init__ (no varargs). <class '__main__.my_rf'> with constructor (<self>, *args, **kwargs) doesn't  follow this convention.
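As a rough sketch of why explicit parameters matter, the following stand-in builds its repr by reading the __init__ signature. ParamsBase, Forest, MyForest and BadForest are hypothetical names used only for illustration; this is not scikit-learn's implementation, and it is written for Python 3 because it relies on inspect.signature.

```python
import inspect


class ParamsBase(object):
    """Hypothetical stand-in that builds its repr from the __init__ signature."""

    def __repr__(self):
        params = []
        signature = inspect.signature(type(self).__init__)
        for name, p in signature.parameters.items():
            if name == "self":
                continue
            # *args / **kwargs carry no parameter names to report.
            if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD):
                raise RuntimeError("parameters must be explicit in __init__ (no varargs)")
            params.append("%s=%r" % (name, getattr(self, name)))
        return "%s(%s)" % (type(self).__name__, ", ".join(params))


class Forest(ParamsBase):
    def __init__(self, n_estimators=10, verbose=0):
        self.n_estimators = n_estimators
        self.verbose = verbose


class MyForest(Forest):
    # Repeat the parameters explicitly instead of using *args/**kwargs.
    def __init__(self, n_estimators=10, verbose=0):
        Forest.__init__(self, n_estimators=n_estimators, verbose=verbose)


class BadForest(Forest):
    def __init__(self, *args, **kwargs):
        Forest.__init__(self, *args, **kwargs)


print(repr(MyForest(n_estimators=5)))  # MyForest(n_estimators=5, verbose=0)
try:
    repr(BadForest())
except RuntimeError as exc:
    print("RuntimeError:", exc)
```

Under these assumptions, a subclass that spells out its parameters in __init__ keeps a full repr, while the varargs version fails in the same way as the traceback above.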

Upvotes: 0
