Reputation: 53
I have a DataFrame with n numeric-valued columns plus three columns that hold datetime and string values. I want to convert every column except those three to a numeric dtype, but I'm not sure what the best method is. Below is a simplified sample DataFrame:
import numpy as np
import pandas as pd
df2 = pd.DataFrame(
    np.array([[1, '5-4-2016', 10], [1, '5-5-2016', 5], [2, '5-4-2016', 10],
              [2, '5-5-2016', 7], [5, '5-4-2016', 8]]),
    columns=['ID', 'Date', 'Number'])
I tried something like the following, but it did not work:
exclude = ['Date']
df = df.drop(exclude, 1).apply(pd.to_numeric, errors='coerce').combine_first(df)
Expected output (the dtype of 'ID' and 'Number' changes to float, while 'Date' stays the same):
    ID      Date  Number
0  1.0  5-4-2016    10.0
1  1.0  5-5-2016     5.0
2  2.0  5-4-2016    10.0
3  2.0  5-5-2016     7.0
4  5.0  5-4-2016     8.0
Upvotes: 0
Views: 996
Reputation: 25239
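You need to call pd.to_numeric with the option downcast='float' if you want the result to be float (pandas then picks the smallest float dtype that fits, typically float32). Otherwise, whole-number values are converted to int. You also need to join the result back to the non-converted columns of the original df2 (with exclude = ['Date'] as in the question):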
df2[exclude].join(df2.drop(columns=exclude).apply(pd.to_numeric, downcast='float', errors='coerce'))
Out[1815]:
       Date   ID  Number
0  5-4-2016  1.0    10.0
1  5-5-2016  1.0     5.0
2  5-4-2016  2.0    10.0
3  5-5-2016  2.0     7.0
4  5-4-2016  5.0     8.0
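As a self-contained sketch of the same idea, the snippet below rebuilds the sample frame from the question and converts every column except 'Date' by assigning the result back to the selected columns, which keeps the original column order and avoids the join. The name numeric_cols is illustrative, and the trailing .astype('float64') is an assumption for the case where 64-bit floats are wanted rather than the float32 that downcast='float' tends to produce.

```python
import numpy as np
import pandas as pd

# Sample data from the question; np.array turns every value into a string.
df2 = pd.DataFrame(
    np.array([[1, '5-4-2016', 10], [1, '5-5-2016', 5], [2, '5-4-2016', 10],
              [2, '5-5-2016', 7], [5, '5-4-2016', 8]]),
    columns=['ID', 'Date', 'Number'],
)

exclude = ['Date']
# Illustrative helper name: every column that is not explicitly excluded.
numeric_cols = [c for c in df2.columns if c not in exclude]

# Convert column by column; with errors='coerce', unparseable values become NaN.
converted = df2[numeric_cols].apply(pd.to_numeric, errors='coerce').astype('float64')

# Assign back so that 'Date' is untouched and the column order is preserved.
df2[numeric_cols] = converted

print(df2.dtypes)
```

Checking df2.dtypes afterwards is a quick way to confirm that only the intended columns changed.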
Upvotes: 0
Reputation: 131
Have you tried Series.astype()?
df['ID'] = df['ID'].astype(float)
df['Number'] = df['Number'].astype(float)
Or, for all columns except 'Date':
for col in [x for x in df.columns if x != 'Date']:
    df[col] = df[col].astype(float)
or
cols = [x for x in df.columns if x != 'Date']
df[cols] = df[cols].astype(float)
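One caveat may be worth illustrating: astype(float) is strict and raises if any value cannot be parsed, whereas pd.to_numeric with errors='coerce' (as used in the question) turns such values into NaN. The sketch below uses a made-up Series; the value 'n/a' is purely hypothetical and does not appear in the question's data.

```python
import pandas as pd

# Hypothetical column containing one value that is not a number.
s = pd.Series(['10', '5', 'n/a'])

# astype(float) is strict: the unparseable value raises ValueError.
try:
    s.astype(float)
    strict_ok = True
except ValueError:
    strict_ok = False

# to_numeric with errors='coerce' replaces the unparseable value with NaN.
coerced = pd.to_numeric(s, errors='coerce')

print(strict_ok)
print(coerced)
```

If the columns are known to be clean, astype(float) is the simpler choice; if they may contain stray text, the coercing variant is probably safer.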
Upvotes: 1