Reputation: 4021
I am trying out scikit-learn's RandomForestClassifier, and when I pass sample weights while fitting the trees I get this error:
rf = RandomForestClassifier(n_estimators=10, n_jobs=1)
rf.fit(train, target, my_weights)
CONSOLE:
line 86, in _parallel_build_trees
curr_sample_weight = sample_weight.copy()
AttributeError: 'list' object has no attribute 'copy'
What am I doing wrong? Here is how I build the inputs:
dataset = genfromtxt(open('data/training_edited.csv','r'), delimiter=',',dtype=float)[1:]
print("Reading training.csv")
target = [x[32] for x in dataset]
my_weights = [x[31] for x in dataset]
train = [x[1:31] for x in dataset]
Upvotes: 1
Views: 479
Reputation: 14377
You need to provide np.ndarrays as input, at least for sample_weight, but ideally for all inputs.
Change what you have to:
import numpy as np
target = np.array([x[32] for x in dataset]) # dataset[:, 32]
my_weights = np.array([x[31] for x in dataset]) # dataset[:, 31]
train = np.array([x[1:31] for x in dataset]) # dataset[:, 1:31]
This can probably be done even more elegantly, since dataset will itself (probably) already be an array (if the file is homogeneous); see the commented code for suggestions.
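A minimal sketch of the slicing approach hinted at in the comments, assuming dataset is already a 2-D float array. The random 5x33 array here is a made-up stand-in for the parsed CSV, not the real data:

```python
import numpy as np

# Hypothetical stand-in for the array returned by genfromtxt: 5 rows, 33 columns.
rng = np.random.default_rng(0)
dataset = rng.random((5, 33))

# Column slices return ndarrays directly, so no list comprehension is needed.
train = dataset[:, 1:31]     # feature columns 1..30
my_weights = dataset[:, 31]  # per-sample weights
target = dataset[:, 32]      # labels

# The slices match the list-comprehension versions element for element.
assert np.array_equal(train, np.array([x[1:31] for x in dataset]))
assert np.array_equal(my_weights, np.array([x[31] for x in dataset]))
assert np.array_equal(target, np.array([x[32] for x in dataset]))

# An ndarray has .copy(), the method the traceback shows being called
# on sample_weight; a plain list on this Python version does not.
weights_copy = my_weights.copy()
```

With arrays like these, the fit call from the question should go through unchanged, e.g. rf.fit(train, target, sample_weight=my_weights).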
Upvotes: 4
Reputation: 8702
Instead of curr_sample_weight = sample_weight.copy(), try this line to copy the list:
curr_sample_weight = sample_weight[:]
Or use copy.copy():
import copy
curr_sample_weight = copy.copy(sample_weight)
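A small sketch of the two list-copy idioms above, using a made-up weights list. Both produce a shallow copy, so changing the copy leaves the original list untouched:

```python
import copy

# Hypothetical sample weights, stored as a plain list.
sample_weight = [0.5, 1.0, 2.0]

by_slice = sample_weight[:]          # full slice creates a new list
by_copy = copy.copy(sample_weight)   # shallow copy via the copy module

# Mutating either copy does not affect the original.
by_slice[0] = 9.0
by_copy[1] = 9.0

print(sample_weight)
print(by_slice)
print(by_copy)
```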
Upvotes: -2