Reputation: 2257
I am reading a CSV with pandas to perform some analysis on it, and I am getting this error:
ValueError: could not convert string to float: 'none'
I checked, and the error comes from the shift_zip column. I opened the CSV file in OpenOffice and manually converted this column to numeric, but it still gives this error.
Data looks like this
I manually checked the shift_zip column but cannot find a none value in it.
I also tried printing this column's values and their data type, which gives <class 'int'>.
for val in data['nurse_zip']:
    # print(val)
    # Note: type(val) is compared with the string 'int', so this test is always True
    if type(val) != 'int':
        print(type(val))
output
<class 'int'>
<class 'int'>
<class 'int'>
How do I correctly identify which none value in this column is causing this issue?
Edit 1: Adding more code for better understanding:
import pandas as pd

dataset = pd.read_csv("model__newdata.csv", header=0)
# Data pre-processing: drop the columns that are not used as features
data = dataset.drop('shift_location_id', axis=1)
data = data.drop('status', axis=1)
data = data.drop('city', axis=1)
data = data.drop('open_positions', axis=1)
# data = data.drop('shift_id',1)
# data = data.drop('role_id',1)
# data = data.drop('specialty_id',1)
# data = data.drop('years_of_experience',1)
# data = data.drop('shifts_zip',1)
# data = data.drop('nurse_zip',1)
# data = data.drop('shift_department_id',1)
# data = data.drop('shift_organization_id',1)
# data = data.drop('user_id',1)
# Find the median for the features that contain NaN
median_role_id = data['role_id'].median()
median_shift_id = data['shift_id'].median()
median_specialty_id = data['specialty_id'].median()
# Assign the result back instead of relying on inplace=True on a column selection
data['shift_id'] = data['shift_id'].fillna(median_shift_id)
data['role_id'] = data['role_id'].fillna(median_role_id)
data['specialty_id'] = data['specialty_id'].fillna(median_specialty_id)
data['years_of_experience'] = data['years_of_experience'].fillna(0)
data['shifts_zip'] = data['shifts_zip'].fillna(0)  # Gives none value error
data['nurse_zip'] = data['nurse_zip'].fillna(0)
data['shift_department_id'] = data['shift_department_id'].fillna(0)
data['shift_organization_id'] = data['shift_organization_id'].fillna(0)
data['user_id'] = data['user_id'].fillna(0)
print (data[data['nurse_zip'] == 'none'])
Output
Empty DataFrame
Columns: [shift_id, user_id, shift_organization_id, shift_department_id, role_id, specialty_id, years_of_experience, nurse_zip, shifts_zip]
Index: []
Edit 2
Result of trying jezrael's answer:
It gives False or True as per the condition. I cannot check which particular row is none or empty.
Upvotes: 1
Views: 1181
Reputation: 863166
You can try:
# Check for the string 'none'
print(data[data['nurse_zip'] == 'none'])
# Check for non-integer values
print(data[data['nurse_zip'].apply(type) != int])
# Check for string values
print(data[data['nurse_zip'].apply(type) == str])
# Check for missing values
print(data[data['nurse_zip'].isnull()])
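If those checks all come back empty, another thing worth trying is to coerce the column to numbers and look at what fails to convert. This is only a minimal sketch on made-up data: the sample values and the helper name find_non_numeric are hypothetical and not taken from the CSV in the question.

```python
import pandas as pd


def find_non_numeric(frame, column):
    """Return the rows whose value in `column` is present but not numeric."""
    # errors='coerce' turns anything that cannot be parsed (e.g. 'none') into NaN
    converted = pd.to_numeric(frame[column], errors='coerce')
    # Keep rows that failed to convert but were not missing to begin with
    mask = converted.isna() & frame[column].notna()
    return frame[mask]


# Hypothetical data standing in for the real CSV
sample = pd.DataFrame({'shifts_zip': ['10001', 'none', '94105', None]})
bad_rows = find_non_numeric(sample, 'shifts_zip')
print(bad_rows)
```

Because the mask is used to index the frame, this prints the offending rows themselves (with their index labels) rather than a column of True/False values.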
Upvotes: 1
Reputation: 172
If finding the NaN or null values is the objective, then simply use
df.info()
and you will be able to see the data type of each column as well as its non-null count.
But I think the value causing trouble in your dataset is not stored as a null.
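To illustrate why a null check can come back clean while the conversion still fails, here is a small sketch. The column contents are hypothetical; the assumption is that the file holds the literal text 'none' rather than an empty cell.

```python
import pandas as pd

# Hypothetical column: three zip codes and one literal string 'none'
df = pd.DataFrame({'shifts_zip': ['10001', 'none', '94105', '30301']})

# The string 'none' is a real value, so nothing is reported as missing
null_count = int(df['shifts_zip'].isnull().sum())
print(null_count)

# The column is not numeric because of that one string
is_numeric = pd.api.types.is_numeric_dtype(df['shifts_zip'])
print(is_numeric)

# value_counts() lists every distinct entry, which exposes the odd one
counts = df['shifts_zip'].value_counts()
print(counts)
```

On a large column, value_counts() is usually quicker than scrolling through the file by eye, since the stray entry shows up as its own row.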
You can try the points below.
1: Visualize the particular column with a histogram or any other plot.
2: Use df[column].astype to force a change of the column's dtype.
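As a rough sketch of the second point: astype fails loudly when a value cannot be converted, and the error message names that value. The CSV text below is invented for illustration, and treating 'none' as a missing value through the na_values option of read_csv is an assumption about what the data means, not something the question confirms.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the real file
csv_text = "nurse_zip,shifts_zip\n10001,94105\n30301,none\n60601,73301\n"

raw = pd.read_csv(io.StringIO(csv_text))

# astype raises if any entry cannot be converted to a float
try:
    raw['shifts_zip'].astype(float)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Assumption: 'none' means "missing", so ask read_csv to parse it as NaN
clean = pd.read_csv(io.StringIO(csv_text), na_values=['none'])
clean['shifts_zip'] = clean['shifts_zip'].fillna(0)
print(clean.dtypes)
```

If 'none' really does mean "no zip code", handling it at read time keeps the rest of the fillna code unchanged; if it does not, it is safer to inspect those rows first.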
Upvotes: 2