Reputation: 1221
Suppose I have the CSV data below:
col1,col2,col3,label
,1,2,label1
3,,4,label2
5,6,7,label3
What is the best way to read this data and convert col1 and col2, which are read as float because of the missing values, to int?
I can already convert a filtered dataframe that contains only the numeric columns (col1, col2, col3). How can I modify the main dataframe itself while ignoring the label column, which is a string?
On a related note, I could also use the command below. Any idea how I could run it in a loop so that the column name col%d is generated dynamically? I have 32 columns.
filter_df.col1 = filter_df.col1.fillna(0).astype(int)
Upvotes: 1
Views: 238
Reputation: 402323
You can use fillna with downcast='infer'.
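import pandas as pd
# Boolean mask: True for the numeric columns, False for the string label column.
m = df.dtypes.map(pd.api.types.is_numeric_dtype)
# Fill NaN with 0 and let pandas downcast whole-number floats to int.
# Note: the downcast argument is deprecated in recent pandas releases.
df[df.columns[m]] = df.loc[:, m].fillna(0, downcast='infer')
print(df)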
   col1  col2  col3   label
0     0     1     2     NaN
1     3     0     4  label2
2     5     6     7  label3
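For anyone on a newer pandas where downcast raises a deprecation warning, here is a minimal sketch of the same idea (pick the numeric columns, fill, convert) run end to end on the CSV from the question. It swaps downcast='infer' for an explicit .fillna(0).astype(int), which is a substitution on my part rather than what the answer above uses, and the names csv_text and num_cols are just illustrative.

```python
import io

import pandas as pd

# The CSV from the question, inlined so the example is self-contained.
csv_text = """col1,col2,col3,label
,1,2,label1
3,,4,label2
5,6,7,label3
"""

df = pd.read_csv(io.StringIO(csv_text))

# Names of the numeric columns; the string label column is left out.
num_cols = [c for c in df.columns if pd.api.types.is_numeric_dtype(df[c])]

# Replace missing values with 0, then cast the numeric columns to int.
df[num_cols] = df[num_cols].fillna(0).astype(int)

print(df)
print(df.dtypes)
```

Assigning with df[num_cols] = ... replaces the columns, so the integer dtype is kept instead of being cast back to float.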
Upvotes: 4
Reputation: 862511
Use select_dtypes with np.number:
print(filter_df)
   col1  col2  col3   label
0   NaN   1.0     2     NaN
1   3.0   NaN     4  label2
2   5.0   6.0     7  label3
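import numpy as np
# Select every numeric column by dtype, so the string label column is skipped.
cols = filter_df.select_dtypes(np.number).columns
# Replace NaN with 0, then cast all of those columns to int in one step.
filter_df[cols] = filter_df[cols].fillna(0).astype(int)
print(filter_df)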
   col1  col2  col3   label
0     0     1     2     NaN
1     3     0     4  label2
2     5     6     7  label3
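On the related question about generating the col%d names in a loop: a rough sketch is below, assuming the columns really are named col1 through col32. The dataframe here is made-up stand-in data, not the asker's file. Using df[name] instead of attribute access (df.col1) is what lets the name be built at runtime.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 32 float columns named col1..col32,
# each with one missing value, plus a string label column.
n_cols = 32
data = {f"col{i}": [np.nan, float(i), float(i + 1)] for i in range(1, n_cols + 1)}
data["label"] = ["label1", "label2", "label3"]
df = pd.DataFrame(data)

# Build each column name and convert that column on its own.
for i in range(1, n_cols + 1):
    name = f"col{i}"
    df[name] = df[name].fillna(0).astype(int)

print(df[["col1", "col2", "col32", "label"]])
```

The select_dtypes version above does the same thing without a loop and does not depend on the naming pattern, so the loop is mostly useful when only some of the numeric columns should be converted.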
Upvotes: 5