Reputation: 96284
Consider a DataFrame df. I want to convert a set of columns, to_convert,
to categories.
I can certainly do the following:
for col in to_convert:
    df[col] = df[col].astype('category')
but I was surprised that the following does not return a dataframe:
df[to_convert].apply(lambda x: x.astype('category'), axis=0)
which of course makes the following not work:
df[to_convert] = df[to_convert].apply(lambda x: x.astype('category'), axis=0)
Why does apply(axis=0) return a Series even though it is supposed to act on the columns one by one?
Upvotes: 11
Views: 13519
Reputation: 48919
Note that since pandas 0.23.0 you no longer need apply to convert multiple columns to categorical data types. You can simply call df[to_convert].astype('category') instead (where to_convert is a set of columns as defined in the question).
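As a minimal sketch of that approach, assuming a recent pandas and a made-up frame (the column names A, B, C and the variable converted are illustrative, not from the question): astype returns a new DataFrame rather than modifying df, so the result still has to be written back. Here to_convert is a list, since indexing with a Python set is not reliable across versions.

```python
import pandas as pd

# Hypothetical data; only A and B should become categorical.
df = pd.DataFrame({"A": list("aabbcd"), "B": list("ffghhe"), "C": range(6)})
to_convert = ["A", "B"]

# astype on the column subset returns a new DataFrame; df itself is unchanged.
converted = df[to_convert].astype("category")

# One way to write the result back: pass a per-column dtype mapping to astype.
df = df.astype({col: "category" for col in to_convert})

print(converted.dtypes)
print(df.dtypes)
```

The dict form of astype leaves the remaining columns (C here) untouched, which avoids looping over the columns by hand.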
Upvotes: 4
Reputation: 128958
This was just fixed in master, so the fix will be in 0.17.0; see the issue here.
In [6]: from pandas import DataFrame
In [7]: df = DataFrame({'A' : list('aabbcd'), 'B' : list('ffghhe')})
In [8]: df
Out[8]:
   A  B
0  a  f
1  a  f
2  b  g
3  b  h
4  c  h
5  d  e
In [9]: df.dtypes
Out[9]:
A    object
B    object
dtype: object
In [10]: df.apply(lambda x: x.astype('category'))
Out[10]:
   A  B
0  a  f
1  a  f
2  b  g
3  b  h
4  c  h
5  d  e
In [11]: df.apply(lambda x: x.astype('category')).dtypes
Out[11]:
A    category
B    category
dtype: object
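With the fix in place, the assignment from the question should work as written. The following is a sketch under that assumption, run against a recent pandas; the extra column C and the name result are hypothetical and only there to show that untouched columns keep their dtype.

```python
import pandas as pd

# Same A and B columns as the session above, plus a hypothetical numeric column C.
df = pd.DataFrame({"A": list("aabbcd"), "B": list("ffghhe"), "C": range(6)})
to_convert = ["A", "B"]

# apply calls the lambda once per column (axis=0 is the default)
# and reassembles the converted columns into a DataFrame.
result = df[to_convert].apply(lambda x: x.astype("category"), axis=0)
print(type(result).__name__)

# Assign the converted columns back onto the original frame.
df[to_convert] = result
print(df.dtypes)
```

On versions before the fix, result would not come back as a DataFrame, which is why the column-by-column loop from the question is the safer fallback there.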
Upvotes: 10