ElRudi

Reputation: 2324

Change type of pandas series/dataframe column inplace

TL;DR: I'd like to change the data types of pandas dataframe columns in-place.


I have a pandas dataframe:

df = pd.DataFrame({'a': [1,2,3], 'b': [4,5,6.1]})

By default, its columns are assigned the dtypes 'int64' and 'float64' on my system:

df.dtypes
Out[172]: 
a      int64
b    float64
dtype: object

Because my dataframe will be very large, I'd like to set the column data types, after having created the dataframe, to int32 and float32. I know how I could do this:

df['a'] = df['a'].astype(np.int32)
df['b'] = df['b'].astype(np.float32)

or, in one step:

df = df.astype({'a':np.int32, 'b':np.float32})

and the dtypes of my dataframe are indeed:

df.dtypes
Out[180]: 
a      int32
b    float32
dtype: object
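To put a number on the saving, the per-column memory can be compared before and after the conversion. This is only a minimal sketch on the same small dataframe; the names `before`, `small` and `after` are illustrative and not part of the original code:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6.1]})

# Bytes used by each column, excluding the index.
before = df.memory_usage(index=False)

# astype returns a new dataframe; df itself is left untouched.
small = df.astype({'a': np.int32, 'b': np.float32})
after = small.memory_usage(index=False)

print(before)
print(after)
```

With three rows, each 64-bit column takes 24 bytes and each 32-bit column 12 bytes. Keep in mind that float32 stores a value such as 6.1 only approximately.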

However, having to reassign the series seems clunky, especially since many pandas methods have an inplace kwarg. Passing inplace=True to astype doesn't seem to work, though (starting out with the same dataframe as at the top):

df['a'].astype(np.int32, inplace=True)

df.dtypes
Out[187]: 
a      int64
b    float64
dtype: object

Is there something I'm overlooking here? Is this by design? The same behaviour is shown when working with Series instead of DataFrame objects.
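The behaviour can be reproduced with a small check. The sketch below assumes that older pandas releases silently ignore the unknown keyword while newer ones reject it with a TypeError, so it handles both cases; the variable `outcome` is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6.1]})

try:
    # astype has no documented inplace parameter.
    df['a'].astype(np.int32, inplace=True)
    outcome = 'ignored'
except TypeError:
    outcome = 'rejected'

print(outcome)
print(df.dtypes)  # column 'a' keeps its original dtype either way
```

In both cases the original column is left unchanged, because astype returns a new object instead of modifying the existing one.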

Many thanks,

Upvotes: 28

Views: 34657

Answers (4)

Anmol

Reputation: 681

Pass the column names and their dtypes as a dictionary to .astype():

col_types = {'col_1':'type_1', 'col_4':'type_4'}
df = df.astype( col_types)

This changes the dtype of only the columns listed in the dictionary. Note that astype still returns a new dataframe, which is why the result is assigned back to df.
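As a concrete instance of the dictionary form, here is a small sketch using the dataframe from the question and converting only column 'a'. The name `converted` is illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6.1]})

# Only 'a' is listed, so 'b' keeps its dtype.
col_types = {'a': np.int32}
converted = df.astype(col_types)

print(converted.dtypes)
print(df.dtypes)  # the original dataframe is not modified
```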

Upvotes: -1

keepAlive

Reputation: 6665

And what about

>>> df.__dict__.update(df.astype({'a': np.int32, 'b': np.float32}).__dict__)
>>> df.dtypes
a      int32
b    float32
dtype: object

?

Upvotes: 3

Philipp

Reputation: 4799

You can write your own (still clunky) inplace versions:

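from typing import Dict
import pandas as pd

def astype_inplace(df: pd.DataFrame, dct: Dict) -> None:
    # Overwrite only the listed columns with their converted versions.
    df[list(dct.keys())] = df.astype(dct)[list(dct.keys())]

def astype_per_column(df: pd.DataFrame, column: str, dtype) -> None:
    # Replace one column with its converted version.
    df[column] = df[column].astype(dtype)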

and use it like

astype_inplace(df, {'bool_col':'boolean'})

or

astype_per_column(df, 'bool_col', 'boolean')
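What such helpers buy is that the same dataframe object is updated, so other references to it see the new dtypes. The sketch below uses a hypothetical loop-based variant, `astype_columns_inplace`, which is not from the answer above, applied to the dataframe from the question:

```python
import numpy as np
import pandas as pd

def astype_columns_inplace(df: pd.DataFrame, dtypes: dict) -> None:
    # Hypothetical helper: reassign each listed column on the same object.
    for column, dtype in dtypes.items():
        df[column] = df[column].astype(dtype)

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6.1]})
alias = df  # second reference to the same dataframe

astype_columns_inplace(df, {'a': np.int32, 'b': np.float32})
print(alias.dtypes)
```

Each converted column is still a newly allocated array, so this is in-place only in the sense that no new dataframe has to be bound to the name.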

Upvotes: 6

user13422231

Reputation: 17

@ElRudi

Going by the documentation, copy=False might suit your needs?

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.astype.html?highlight=astype#pandas.DataFrame.astype
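Whether copy=False actually helps can be checked directly. The sketch below assumes a pandas version that still accepts the copy keyword (some releases may emit a deprecation warning for it, which is silenced here); the name `result` is illustrative:

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6.1]})

with warnings.catch_warnings():
    # Some pandas releases may warn that the copy keyword is deprecated.
    warnings.simplefilter('ignore')
    result = df.astype({'a': np.int32, 'b': np.float32}, copy=False)

print(result.dtypes)  # converted dtypes
print(df.dtypes)      # original dtypes, unchanged
```

In this check a new dataframe comes back and df keeps its dtypes, so copy=False does not appear to make the conversion in-place when the dtype really changes.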

Upvotes: -3
