Reputation: 983
I have successfully produced predictions on the data set below, but I am trying to figure out how to map the prediction outputs from the model back to the TEAM labels. I am using Python 3, pandas and scikit-learn.
sample_data:
Team A B C Score
Red 5 7 15 100
Green 4 8 22 57
Blue 3 8 33 23
Yellow 6 8 44 122
This is an example of the simple linear regression I set up.
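from sklearn.linear_model import LinearRegression
#file input (sample_data is assumed to be a pandas DataFrame)
learning = sample_data
#features
feature_cols = ['A','B','C']
#feature matrix
X = learning.loc[:, feature_cols]
#target vector
Y = learning.Score
#model fit
model = LinearRegression()
model.fit(X, Y)
# predict from the feature columns only; passing learning.values would also include Team and Score
Xnew = learning.loc[:, feature_cols]
# predicted scores, in the same row order as Xnew
ynew = model.predict(Xnew)
print(ynew)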
Using this, I can produce a prediction array like below:
[108.3970182 181.02527571 230.70598661 120.18243645]
But I am trying to get something that looks like the output below, so that when I feed new data without SCORE into the model, I can predict SCORE for each team:
[Red:108.3970182 Green:181.02527571 Blue:230.70598661 Yellow:120.18243645]
I am flexible with the format; I just need to match each prediction to the specific team from the input.
Upvotes: 0
Views: 1273
Reputation: 15608
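You can add them to your dataset as a new column. The predictions come back in the same row order as the input, so they line up with the Team column.
# new_data is the new data (assumed to be a DataFrame)
# You don't have to pass new_data.values to scikit-learn
# scikit-learn accepts a DataFrame as it is
# select only the feature columns the model was trained on
predictions = model.predict(new_data[feature_cols])
new_data['predictions'] = predictions
print(new_data)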
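For reference, here is a minimal self-contained sketch of this approach using the sample data from the question. The `new_data` frame is a stand-in I made up by dropping `Score` from the sample; in practice it would be whatever unseen rows you want to score.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Sample data copied from the question
sample_data = pd.DataFrame({
    "Team": ["Red", "Green", "Blue", "Yellow"],
    "A": [5, 4, 3, 6],
    "B": [7, 8, 8, 8],
    "C": [15, 22, 33, 44],
    "Score": [100, 57, 23, 122],
})
feature_cols = ["A", "B", "C"]

# Fit on the feature columns only
model = LinearRegression()
model.fit(sample_data[feature_cols], sample_data["Score"])

# Hypothetical new data: same layout as the sample, but without Score
new_data = sample_data.drop(columns="Score").copy()

# predict() returns one value per input row, in row order,
# so assigning it as a column keeps each prediction next to its Team
new_data["predictions"] = model.predict(new_data[feature_cols])
print(new_data[["Team", "predictions"]])
```

Because the Team column never goes into the model, it stays untouched in the frame and each prediction sits on the same row as its label.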
Upvotes: 1
Reputation: 150815
Do you want a new column in your data:
learning['prediction'] = ynew
or do you want a dictionary:
d = {k:v for k,v in zip(learning['Team'], ynew)}
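A third option, sketched here under the assumption that `learning` is the DataFrame from the question, is a pandas Series indexed by team. It allows lookup by label and converts to a dictionary if needed. The names `by_team` and `as_dict` are only illustrative.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Sample data copied from the question
learning = pd.DataFrame({
    "Team": ["Red", "Green", "Blue", "Yellow"],
    "A": [5, 4, 3, 6],
    "B": [7, 8, 8, 8],
    "C": [15, 22, 33, 44],
    "Score": [100, 57, 23, 122],
})
feature_cols = ["A", "B", "C"]

model = LinearRegression()
model.fit(learning[feature_cols], learning["Score"])
ynew = model.predict(learning[feature_cols])

# Label each prediction with its team; row order of ynew matches learning
by_team = pd.Series(ynew, index=learning["Team"], name="prediction")
print(by_team)

# Look up one team by label, or convert the whole thing to a dict
red_prediction = by_team["Red"]
as_dict = by_team.to_dict()
```

Note that both the dictionary and the label lookup assume team names are unique; with repeated names, the extra column on the DataFrame is the safer choice.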
Upvotes: 1