Reputation: 608
I'm reading a .csv file in Python with the following command:
data = np.genfromtxt('home_data.csv', dtype=float, delimiter=',', names=True)
This CSV has a zipcode column whose values are numerals stored as quoted strings, e.g. "85281". After loading, this column contains only nan:
data['zipcode']
Output : array([ nan, nan, nan, ..., nan, nan, nan])
How can I convert these string values to integers, so that I get an array of actual values instead of nan?
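A minimal way to reproduce the behaviour without the original file is sketched below. The in-memory CSV text and the `price` column are made up for illustration; only the quoted `zipcode` column mirrors the real data.

```python
import io
import numpy as np

# Hypothetical stand-in for home_data.csv: zip codes are quoted in the file.
csv_text = 'zipcode,price\n"85281",250000\n"85282",310000\n'

data = np.genfromtxt(io.StringIO(csv_text), dtype=float,
                     delimiter=',', names=True)

# The field text still includes the quote characters, so it cannot be
# parsed as a float and genfromtxt fills in nan instead.
print(data['zipcode'])

# Unquoted numeric columns are parsed normally.
print(data['price'])
```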
Upvotes: 3
Views: 1861
Reputation: 18628
You must help genfromtxt a little:
data = np.genfromtxt('home_data.csv',
                     dtype=[int, float], delimiter=',', names=True,
                     encoding='bytes',  # hand each field to the converter as bytes
                     converters={0: lambda b: b.decode().strip('"')})
Each field is collected as bytes. float(b'1\n') returns 1.0, but float(b'"8210"') raises an error. The converters option lets you define, for each field (here field 0), a function that does the proper conversion: here it decodes the bytes to a string (decode) and removes the surrounding " characters (strip).
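The difference between the two cases can be checked in plain Python, independently of NumPy. This is only a small illustration of the parsing behaviour described above; the variable name `cleaned` is arbitrary.

```python
# Plain numeric bytes parse fine, even with a trailing newline.
print(float(b'1\n'))

# Bytes that still contain the quote characters are rejected.
try:
    float(b'"8210"')
except ValueError as exc:
    print('ValueError:', exc)

# Decoding and stripping the quotes first makes the value parseable.
cleaned = b'"8210"'.decode().strip('"')
print(int(cleaned))
```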
If home_data.csv contains:
zipcode,val
"8210",1
"8320",2
"14",3
you will obtain:
data -> array([(8210, 1.0), (8320, 2.0), (14, 3.0)], dtype=[('zipcode', '<i4'), ('val', '<f8')])
data['zipcode'] -> array([8210, 8320, 14])
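Whether a converter receives bytes or str can depend on the NumPy version and on the encoding argument, so a converter that accepts both is a bit more portable. The sketch below is one possible variant, not the only way to do it: the helper name `unquote_int` is hypothetical, and the sample file content is held in memory via io.StringIO rather than read from disk.

```python
import io
import numpy as np

def unquote_int(field):
    """Convert a quoted field such as '"8210"' (bytes or str) to int."""
    if isinstance(field, bytes):  # some setups hand converters bytes
        field = field.decode()
    return int(field.strip().strip('"'))

# Same sample content as above, held in memory instead of on disk.
csv_text = 'zipcode,val\n"8210",1\n"8320",2\n"14",3\n'

data = np.genfromtxt(io.StringIO(csv_text),
                     dtype=[int, float], delimiter=',', names=True,
                     converters={0: unquote_int})

print(data['zipcode'])
print(data['val'])
```

Returning an int directly from the converter avoids relying on NumPy to cast the cleaned string when the structured array is built.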
Upvotes: 1
Reputation: 249
Maybe not the most efficient solution, but read your data as string and convert it afterwards to float:
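# dtype=None lets genfromtxt infer each column; the quoted zipcode column stays a string
data = np.genfromtxt('home_data.csv', dtype=None, delimiter=',', names=True, encoding='utf-8')
# remove the surrounding quotes, then convert (np.float no longer exists; use the builtin float)
zipcode = np.char.strip(data['zipcode'], '"').astype(float)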
Btw., is there a reason you want to save a zipcode as a float?
Upvotes: 1