hansolo

Reputation: 983

Include Labels from SciKit Learn Prediction

I have successfully produced predictions on the data set below, but I am trying to figure out how to map the model's prediction outputs back to the Team labels. I am using Python 3, pandas, and scikit-learn.

sample_data:

Team    A   B    C  Score
Red     5   7   15  100
Green   4   8   22  57
Blue    3   8   33  23
Yellow  6   8   44  122 

This is an example of the simple linear regression I set up.

#imports
from sklearn.linear_model import LinearRegression

#file input
learning = sample_data

#features
feature_cols = ['A','B','C']


#feature harness
X = learning.loc[:, feature_cols]

#target harness
Y = learning.Score

#model fit
model = LinearRegression()
model.fit(X, Y)

# set up model harness for X (feature columns only; Team and Score must be excluded)
Xnew = learning.loc[:, feature_cols]

# set up model harness for Y
ynew = model.predict(Xnew)

print(ynew)

Using this, I can produce a prediction array like below:

[108.3970182  181.02527571 230.70598661 120.18243645]

But I am trying to get something that looks like the output below, so that when I feed new data without Score into the model, I can predict Score for each team:

[Red:108.3970182  Green:181.02527571 Blue:230.70598661 Yellow:120.18243645]

I am flexible with the format; I just need to match each prediction to the specific team from the input.

Upvotes: 0

Views: 1273

Answers (2)

Prayson W. Daniel

Reputation: 15608

You can add the predictions to your dataset as a new column.


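# new_data is assumed to be a DataFrame holding the Team and feature columns
# You don’t have to pass new_data.values to scikit-learn;
# it accepts a DataFrame as it is

# predict on the feature columns only, so the Team strings are not fed to the model
predictions = model.predict(new_data[feature_cols])

new_data['predictions'] = predictions

print(new_data)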
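
For a complete picture, here is a minimal end-to-end sketch of the same idea. The training rows are copied from the question; the `new_data` values are invented purely for illustration, and the column name `prediction` is an arbitrary choice. It relies on `predict` returning one value per input row in the same row order, so assigning the result as a column keeps each prediction next to its team.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Training data copied from the question
learning = pd.DataFrame({
    "Team": ["Red", "Green", "Blue", "Yellow"],
    "A": [5, 4, 3, 6],
    "B": [7, 8, 8, 8],
    "C": [15, 22, 33, 44],
    "Score": [100, 57, 23, 122],
})
feature_cols = ["A", "B", "C"]

# Fit on the numeric feature columns only
model = LinearRegression()
model.fit(learning[feature_cols], learning["Score"])

# Hypothetical new data without a Score column (values invented for illustration)
new_data = pd.DataFrame({
    "Team": ["Red", "Green", "Blue", "Yellow"],
    "A": [6, 5, 4, 7],
    "B": [8, 7, 9, 8],
    "C": [20, 25, 30, 40],
})

# One prediction per row, in row order, so it lines up with the Team column
new_data["prediction"] = model.predict(new_data[feature_cols])

print(new_data[["Team", "prediction"]])
```

Selecting `new_data[feature_cols]` rather than the whole frame keeps the `Team` strings out of the model input while leaving them in the DataFrame for labelling.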

Upvotes: 1

Quang Hoang

Reputation: 150815

Do you want a new column in your data?

learning['prediction'] = ynew

or do you want a dictionary:

d = {k:v for k,v in zip(learning['Team'], ynew)}
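
A third option, sketched here as an assumption rather than something from the question, is a pandas Series indexed by team. The `ynew` values are the ones printed in the question, and the names `teams`, `by_team`, and `as_dict` are made up for this example. A Series supports lookup by label and converts to a dictionary when needed.

```python
import numpy as np
import pandas as pd

# Team labels and the prediction array as printed in the question
teams = pd.Series(["Red", "Green", "Blue", "Yellow"], name="Team")
ynew = np.array([108.3970182, 181.02527571, 230.70598661, 120.18243645])

# Series of predictions labelled by team (row order must match the model input)
by_team = pd.Series(ynew, index=teams, name="prediction")

# Look up a single team by label
print(by_team["Green"])

# Convert to a plain dictionary if that format is preferred
as_dict = by_team.to_dict()
print(as_dict)
```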

Upvotes: 1
