user9151863

Linear Regression could not convert string to float

I am trying to find a relationship between UCAS points and the final university mark (Final) using linear regression. I am following this tutorial.

I get the following error at this line:

plt.scatter(X_test, Y_test,  color='black') 

could not convert string to float:

I have checked the types: the values in the "Total UCAS Points" column are of class str, and the values in "Final" are of type numpy.float64.

I have tried to convert the str values to float as follows:

pd.to_numeric("Total UCAS Points")

But I keep getting the error message:

Unable to parse string "Total UCAS Points" at position 0

I have also tried ignoring the error, but then the values are not converted to float and remain str.

Here is a sample of my CSV file:

Total UCAS Points: 280 280 240 240 360 360 360 360 630

Final: 58 46 62 64 48 56 54 30

df = df.replace(np.nan, -1)

X = df['Total UCAS Points']
Y = df['Final']

pd.to_numeric("Total UCAS Points")

print(type(Y[2]))


X=X.reshape(len(X),1)
Y=Y.reshape(len(Y),1)

# Split the data into training/testing sets
X_train = X[:-2500]
X_test = X[-2500:]

# Split the targets into training/testing sets
Y_train = Y[:-2500]
Y_test = Y[-2500:]

# Plot outputs
plt.scatter(X_test, Y_test,  color='black')

Upvotes: 1

Views: 2646

Answers (1)

Bill the Lizard

Reputation: 405775

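You need to pass the data itself (a list or a Series, such as the column you selected) to to_numeric, not the column name as a string. When given the string "Total UCAS Points", to_numeric tries to parse that literal text as a number, which is why it fails at position 0. Try this: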

X = pd.to_numeric(X)  # in place of pd.to_numeric("Total UCAS Points")
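
As a fuller illustration, here is a minimal sketch of the conversion on a small made-up data frame. The sample values (including the "n/a" entry) are hypothetical and only mimic the question's columns. Using errors="coerce" and dropping the resulting NaN rows is one possible way to handle unparseable entries, not the only one. The sketch also converts to a NumPy array before reshaping, on the assumption that you want 2-D arrays for the regression step.

```python
import pandas as pd

# Hypothetical sample mirroring the question's columns; the points are stored as strings
df = pd.DataFrame({
    "Total UCAS Points": ["280", "240", "360", "n/a"],
    "Final": [58.0, 46.0, 62.0, 64.0],
})

# Pass the column itself (a Series), not the column name.
# errors="coerce" turns entries that cannot be parsed into NaN instead of raising.
X = pd.to_numeric(df["Total UCAS Points"], errors="coerce")

# Keep only the rows where the conversion succeeded, in both columns
mask = X.notna()

# Convert to NumPy arrays before reshaping into single-column 2-D arrays
X = X[mask].to_numpy().reshape(-1, 1)
Y = df.loc[mask, "Final"].to_numpy().reshape(-1, 1)

print(X.dtype, X.shape, Y.shape)
```

Note that to_numeric returns a new object rather than modifying its argument, so the result has to be assigned back, as in the line above.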

Upvotes: 3
