Reputation: 127
I am working with sklearn and pandas, and my predictions come out as a plain array without the activity_id that I set as the index.
My code:
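import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# train and test are DataFrames loaded earlier
train = train.set_index('activity_id')
test = test.set_index('activity_id')
y_train = train['outcome']
# drop the target as well as people_id so the label does not leak into the features
x_train = train.drop(['people_id', 'outcome'], axis=1)
# keep the same feature columns in the test set (assumes test also has people_id)
x_test = test.drop(['people_id', 'outcome'], axis=1, errors='ignore')
model = DecisionTreeClassifier(min_samples_leaf=100)
model.fit(x_train, y_train)
scores = cross_val_score(model, x_train, y_train, cv=10)
print('mean: {:.3f} (std: {:.3f})'.format(scores.mean(), scores.std()), end='\n\n')
print(model.score(x_train, y_train))
# make predictions
y_pred = model.predict(x_test)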
Any thoughts on how I can get them to print out with the right activity_id list? Thanks!
Upvotes: 0
Views: 912
Reputation: 4744
From what you have written, I believe you want to show the index of x_test next to the y_pred values predicted from x_test.
This can be done by turning the NumPy array returned by model.predict(x_test) into a DataFrame and setting the index of the new DataFrame to be the same as that of x_test. This works because predict returns one value per row, in the same row order as x_test.
Here is an example:
df_pred = pd.DataFrame(y_pred, index=x_test.index, columns=['y_pred'])
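For a version you can run end to end, here is a minimal sketch of the same idea. The toy data (the `feature` column and the `act_*` ids) is made up purely for illustration and stands in for your real train and test frames; only the last three lines matter for your question.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the real train/test frames.
train = pd.DataFrame(
    {
        "activity_id": ["act_1", "act_2", "act_3", "act_4"],
        "feature": [0, 1, 0, 1],
        "outcome": [0, 1, 0, 1],
    }
).set_index("activity_id")
test = pd.DataFrame(
    {"activity_id": ["act_9", "act_8"], "feature": [1, 0]}
).set_index("activity_id")

x_train = train.drop(columns="outcome")
y_train = train["outcome"]
x_test = test

model = DecisionTreeClassifier(random_state=0).fit(x_train, y_train)

# predict() returns a plain NumPy array in the same row order as x_test,
# so reusing x_test.index re-attaches each activity_id to its prediction.
y_pred = model.predict(x_test)
df_pred = pd.DataFrame(y_pred, index=x_test.index, columns=["y_pred"])
print(df_pred)
```

If you only need a single column, `pd.Series(y_pred, index=x_test.index, name="y_pred")` should work the same way, and `df_pred.reset_index()` turns activity_id back into a regular column if you want to write it out to a file.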
Upvotes: 1