Reputation: 53
I'm just trying out the sklearn Python library, and I repurposed some code I was using for linear regression to fit a regression tree model, based on an example I saw (here's the example code):
def fit(self, X, y):
    """
    Fit a Random Forest model to data `X` and targets `y`.

    Parameters
    ----------
    X : array-like
        Input values.
    y : array-like
        Target values.
    """
    self.X = X
    self.y = y
    self.n = self.X.shape[0]
    self.model = ExtraTreesRegressor(**self.params)
    self.model.fit(X, y)
Here's the code I've written/repurposed
data = pd.read_csv("rmsearch.csv", sep=",")
data = data[["price", "type", "number_bedrooms"]]
predict = "price"
X = np.array(data.drop([predict], 1))
y = np.array(data[predict])
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, test_size=0.2)
etr = ensemble.ExtraTreesRegressor
etr.fit(x_train, y_train)
acc = etr.score(x_test, y_test)
print("Accuracy; ", acc)
and I am getting this error:
etr.fit(x_train, y_train)
TypeError: fit() missing 1 required positional argument: 'y'
I know fit() takes 'X', 'y', and 'sample_weight' as input, but sample_weight defaults to None. The other examples haven't helped me much, but it could also be that I'm fairly new to Python and unable to spot a simple coding error.
fit() documentation:
Thanks for your help in advance.
Upvotes: 1
Views: 7625
Reputation: 19310
The problem is here
etr = ensemble.ExtraTreesRegressor
etr.fit(x_train, y_train)
You need to instantiate ensemble.ExtraTreesRegressor before calling fit on it. Change this code to
etr = ensemble.ExtraTreesRegressor()
etr.fit(x_train, y_train)
You get the seemingly strange error that y is missing because .fit is an instance method, so the first argument to this function is actually self. When you call .fit on an instance, self is passed automatically. If you call .fit on the class (as opposed to the instance), you would have to supply self yourself. So your code is equivalent to ensemble.ExtraTreesRegressor.fit(self=x_train, X=y_train), which leaves y with no value.
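To see the same mechanism without scikit-learn, here is a minimal sketch using a made-up class. ToyRegressor and its attributes are hypothetical stand-ins, not part of any library; the only assumption is that its fit has the same (self, X, y) shape as the real one.

```python
class ToyRegressor:
    """Hypothetical stand-in for an estimator; not part of scikit-learn."""

    def fit(self, X, y):
        # Store the training data and return self so calls can be chained.
        self.X_ = X
        self.y_ = y
        return self


x_train, y_train = [[0], [1]], [1, 2]

# Calling fit on the class itself: x_train fills `self`, y_train fills `X`,
# and nothing is left over for `y`, so Python raises a TypeError.
try:
    ToyRegressor.fit(x_train, y_train)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Calling fit on an instance: Python supplies the instance as `self`.
model = ToyRegressor().fit(x_train, y_train)
```

The printed message should complain about a missing positional argument 'y', just like the traceback in the question, although the exact prefix varies between Python versions.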
The example below shows the difference. The two forms are functionally equivalent, but the first form is clunky.
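from sklearn import ensemble

# Synthetic data.
x = [[0]]
y = [1]

# Form 1: call fit on the class and pass the instance explicitly as self.
myinstance = ensemble.ExtraTreesRegressor()
ensemble.ExtraTreesRegressor.fit(myinstance, x, y)

# Form 2: call fit on the instance; self is passed automatically.
etr = ensemble.ExtraTreesRegressor()
etr.fit(x, y)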
Upvotes: 2