gopal gopi

Reputation: 11

ValueError: could not convert string to float: '30/01/20'

I keep getting this error. Any help is appreciated.

I have been trying this, but a ValueError shows up: https://colab.research.google.com/drive/1jEmsG9WWRpUmuU92URD0PxtzWkpETlY3?usp=sharing

from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

import datetime 




df = pd.read_csv('/content/covid_19_india.csv')
df['split'] = np.random.randn(df.shape[0], 1)

msk = np.random.rand(len(df)) <= 0.7

training = df[msk]
test = df[~msk]


xtrain = training.drop('Sno', axis=1)
ytrain = training.loc[:, 'Sno']
xtest = test.drop('Sno', axis=1)
ytest = test.loc[:, 'Sno']



model = GaussianNB()


model.fit(xtrain, ytrain)


pred = model.predict(xtest)


mat = confusion_matrix(pred, ytest)
names = np.unique(pred)
sns.heatmap(mat, square=True, annot=True, fmt='d', cbar=False,
            xticklabels=names, yticklabels=names)
plt.xlabel('Truth')
plt.ylabel('Predicted')

ValueError                                Traceback (most recent call last)
in ()
     31
     32 # Train the model
---> 33 model.fit(xtrain, ytrain)
     34
     35 # Predict Output

6 frames
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86
     87

ValueError: could not convert string to float: '30/01/20'

Any help?

Upvotes: 0

Views: 3099

Answers (2)

James Tollefson

Reputation: 893

As has been mentioned in both the comments and the other answer to this question, you have a column of dates formatted as strings in your dataset. You have a couple of options here.

Don't use the date column

For the sake of argument, let's say your dates are in a column named df['date']. You can simply drop the date column if you do not want to use it. Note that drop() returns a new DataFrame, so assign the result back.

df = df.drop('date', axis=1)

Convert dates to datetime format

Another option is to convert this column to datetime format. This can be done using apply() and datetime.datetime.strptime. If you're an aspiring data scientist you should read and then bookmark https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior. It's pretty handy, I promise.

from datetime import datetime
# '30/01/20' is day/month/year, so the format is '%d/%m/%y'
df['date'] = df['date'].apply(lambda d: datetime.strptime(d, '%d/%m/%y'))
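
A vectorized alternative, offered as a minimal sketch rather than the one right way: pd.to_datetime with an explicit format does the same conversion without apply(). Since scikit-learn estimators generally expect numeric features, the sketch also derives a plain integer column from the parsed dates. The sample values and the name days_since_start are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample in the same day/month/year layout as '30/01/20'.
dates = pd.Series(["30/01/20", "31/01/20", "01/02/20"])

# Parse the whole column at once with an explicit day-first format.
parsed = pd.to_datetime(dates, format="%d/%m/%y")

# One possible numeric feature: whole days elapsed since the earliest date.
days_since_start = (parsed - parsed.min()).dt.days

print(parsed.dt.month.tolist())
print(days_since_start.tolist())
```

Whether days since the start, the month, or the day of week is the more useful feature depends on the problem; the point is only that the model receives numbers rather than strings.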

Upvotes: 0

Valdi_Bo

Reputation: 30991

The error message says that "30/01/20" string could not be converted to float.

So it seems that your DataFrame contains a column with dates.

Note that when read_csv reads the source data, it attempts to convert numeric columns to either int or float. Columns that cannot be converted this way are left as strings, and these columns have the object data type.
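
One way to check which columns were left non-numeric is sketched below. The inline CSV is a made-up stand-in for the real file, and the column names Date and Confirmed are assumptions (only Sno appears in the question).

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical three-row stand-in for covid_19_india.csv.
csv_text = "Sno,Date,Confirmed\n1,30/01/20,1\n2,31/01/20,1\n3,01/02/20,2\n"
df = pd.read_csv(io.StringIO(csv_text))

# Columns that read_csv could not turn into int or float.
non_numeric = [col for col in df.columns if not is_numeric_dtype(df[col])]

print(df.dtypes)
print(non_numeric)
```

Any column that shows up in non_numeric will trigger the same ValueError if it is passed to model.fit() unchanged.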

Start by identifying which column contains dates. Then, to convert this column to datetime as early as the reading phase, pass the parse_dates parameter to read_csv with a list of the column names to be converted.

Then at least there should be no problem with conversion to float.
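
A minimal sketch of the parse_dates approach, assuming the date column is called Date (a hypothetical name) and holds day-first values such as 30/01/20. dayfirst=True is added so that a value like 01/02/20 is read as 1 February rather than 2 January.

```python
import io

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical three-row stand-in for the real CSV file.
csv_text = "Sno,Date,Confirmed\n1,30/01/20,1\n2,31/01/20,1\n3,01/02/20,2\n"

# Ask read_csv to parse the Date column while reading, treating the day as the first field.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], dayfirst=True)

print(df.dtypes)
print(df["Date"].dt.month.tolist())
```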

Upvotes: 0
