Reputation:
I have this rating prediction model using linear regression
status = pd.DataFrame({'rating': [10.5,20.30,30.12,40.24,50.55,60.6,70.2], 'B': ['Bad','Not bad','Good','I like it','Very good','The best','Deserve an oscar']})
x = status.iloc[:,:-1].values
y = status.iloc[:,-1].values
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,train_size=0.4,random_state=0)
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x,y)
input = 40.24
lr.predict([[input]])
So I have 40.24 as my input for X value I was expecting for 'I like it' as the output but it throws error instead because the expected output is a string, here's the error: ValueError: could not convert string to float: 'Bad'
. How do I make it capable of having string as output?
Upvotes: 0
Views: 1073
Reputation: 979
Hi thats because sckitlearn or rather machine learning labels require numbers as an input, i am not sure what the classes are in this case but you can use the onehotencoder from sckitlearn
Also do change it to logistic regression
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LogisticRegression
# 1. INSTANTIATE
enc = OneHotEncoder()
# 2. FIT
enc.fit(y)
# 3. Transform
onehotlabels = enc.transform(y).toarray()
onehotlabels.shape
clf = LogisticRegression(random_state=0).fit(x, onehotlabels)
or you can just manually map it out which ever way you prefer (e.g Bad -> 0, Good -> 1)
Upvotes: 1
Reputation: 412
Your labels are categorical where regression labels should be continuous numerical. You can consider to see it as a classification problem rather than regression.
Upvotes: 0
Reputation: 86
You cannot do a Linear Regression
if you have Target feature as a Categorical D-Type
.
That is the first rule of performing a Linear Regression that you should have Continuous Target feature as the y=mx+c
function only takes in numbers as input and tests the function against the numerical items and predicts the numerical item.
That is why it gets trained but fails to predict.
You need to encode your target feature. Please self-study these concepts.
Hope this helps.
Upvotes: 0