Reputation: 771
I need to change the dtype of multiple columns (over 400), but the dataframe has several different dtypes. Some columns are float64, whereas others are int64 or object:
print(my_df.dtypes)
Output:
x1 int64
x2 int64
x3 object
x4 float64
x5 float64
x6 float64
x7 float64
...
x400 object
x401 object
x402 object
...
I need to change all int64 columns to int8 or int16, and all float64 columns to float32. I tried the snippet below, but it did not work:
my_df[my_df.dtypes == np.int64].astype(np.int16)
my_df[my_df.dtypes == np.float64].astype(np.float32)
Any help is appreciated.
Thanks in advance.
Upvotes: 9
Views: 8070
Reputation: 51185
Setup
df = pd.DataFrame({'a': np.arange(5, dtype='int64'), 'b': np.arange(5, dtype='float64')})
Use select_dtypes to get the columns that match your desired type:
df.select_dtypes(np.float64) # or df.select_dtypes(np.float64).columns to save for casting
b
0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
And cast as needed.
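For completeness, here is a minimal sketch of the cast step, assuming the toy df from the setup above. The names float_cols, int_cols and casts are just illustrative, and int16 is only one possible target:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(5, dtype='int64'),
                   'b': np.arange(5, dtype='float64')})

# Column labels for each dtype we want to shrink
float_cols = df.select_dtypes(include=[np.float64]).columns
int_cols = df.select_dtypes(include=[np.int64]).columns

# Build one {column: new dtype} mapping and cast in a single astype call;
# columns not listed in the mapping keep their current dtype
casts = {c: np.float32 for c in float_cols}
casts.update({c: np.int16 for c in int_cols})
df = df.astype(casts)

print(df.dtypes)
```

Passing a dict to astype means any column left out of it (for example the object ones) is not touched.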
Upvotes: 1
Reputation: 189
You almost got it!
my_df.loc[:, my_df.dtypes == 'float64'] = my_df.loc[:, my_df.dtypes == 'float64'].astype('float32')
my_df.loc[:, my_df.dtypes == 'int64'] = my_df.loc[:, my_df.dtypes == 'int64'].astype('int32')
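One caveat, as far as I know: in newer pandas versions, assigning through .loc tries to write into the existing columns first, so the original dtype may be kept. A hedged variant that uses the same dtype mask but plain [] assignment (the small my_df here is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame
my_df = pd.DataFrame({'x1': np.array([1, 2, 3], dtype='int64'),
                      'x3': ['a', 'b', 'c'],
                      'x4': np.array([0.5, 1.5, 2.5], dtype='float64')})

# Turn the boolean dtype mask into column labels
float_cols = my_df.dtypes[my_df.dtypes == 'float64'].index
int_cols = my_df.dtypes[my_df.dtypes == 'int64'].index

# Plain [] assignment replaces the columns, so the new dtype is kept
my_df[float_cols] = my_df[float_cols].astype('float32')
my_df[int_cols] = my_df[int_cols].astype('int32')

print(my_df.dtypes)
```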
Upvotes: 7
Reputation: 771
OK, I found a way :)
Find the columns that have dtype of float64
cols = my_df.select_dtypes(include=[np.float64]).columns
Then change the dtype of only those cols of the dataframe.
my_df[cols] = my_df[cols].astype(np.float32)
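For the int64 columns, note that astype(np.int16) does not check the value range, so values that do not fit can wrap around silently. A possible sketch that lets pandas pick the smallest integer dtype that fits, using pd.to_numeric with downcast='integer' (the sample my_df below is an assumption, not the real data):

```python
import numpy as np
import pandas as pd

# Hypothetical sample: x2 holds a value too large for int16
my_df = pd.DataFrame({'x1': np.array([1, 2, 3], dtype='int64'),
                      'x2': np.array([1, 2, 40000], dtype='int64'),
                      'x4': np.array([0.5, 1.5, 2.5], dtype='float64')})

int_cols = my_df.select_dtypes(include=[np.int64]).columns

# downcast='integer' chooses the smallest signed integer dtype
# that can hold every value in the column
for col in int_cols:
    my_df[col] = pd.to_numeric(my_df[col], downcast='integer')

print(my_df.dtypes)
```

This way each column ends up with its own dtype (int8, int16, ...) instead of one fixed type for all of them.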
Upvotes: 13
Reputation: 59304
You can build a mapping dictionary and use astype
new_types = {np.dtype(np.int64): np.int16,
             np.dtype(np.float64): np.float32}
df = df.astype(df.dtypes.map(new_types).to_dict())
Example:
df = pd.DataFrame({'col1': [1,2,3], 'col2': [1.0,2.0,3.0]})
col1 col2
0 1 1.0
1 2 2.0
2 3 3.0
>>> df.dtypes
col1 int64
col2 float64
dtype: object
Then
df.dtypes.map({np.dtype(np.int64): np.int16, np.dtype(np.float64): np.float32}).to_dict()
Gives a dict of the new types
{'col1': numpy.int16, 'col2': numpy.float32}
Then just use astype with this dict and check the resulting dtypes
>>> df.astype(df.dtypes.map(new_types).to_dict()).dtypes
col1 int16
col2 float32
dtype: object
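If the frame also has columns whose dtype is not in the mapping (the object columns in the question), df.dtypes.map(new_types) will, as far as I can tell, leave NaN for them, which astype will not accept. A hedged variant that builds the dict only for the dtypes present in new_types (col3 and casts are made-up names for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.array([1, 2, 3], dtype='int64'),
                   'col2': [1.0, 2.0, 3.0],
                   'col3': ['a', 'b', 'c']})

new_types = {np.dtype(np.int64): np.int16,
             np.dtype(np.float64): np.float32}

# Keep only the columns whose current dtype has an entry in new_types
casts = {col: new_types[dt] for col, dt in df.dtypes.items() if dt in new_types}

df = df.astype(casts)
print(df.dtypes)
```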
Upvotes: 2