Reputation: 20342
I know how to select the numeric fields of one DataFrame into another:
df1 = df.select_dtypes(include=[np.number])
I was thinking there should be a similar way to select categorical fields, like this:
df2 = df.select_dtypes(include=['category'])
Of course, that doesn't work. Is there a way to do this? My DataFrame has float64 and object dtypes.
Also, I'm trying to split these into continuous and discrete types, and hopefully bin the continuous data points. The line below seems to work fine.
df1['price_bins'] = pd.cut(df1.PRICE, bins=15)
Is this the preferred way to do this kind of thing?
Upvotes: 1
Views: 9549
Reputation: 11
To select the categorical (string) columns, just replace 'category' in your code with 'object':
df2 = df.select_dtypes(include=['object'])
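As a rough sketch of how this plays out, assuming a small made-up frame with one float64 column and one object column (the data and the CITY column are hypothetical; PRICE is borrowed from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: one float64 column and one object (string) column.
# dtype=object is set explicitly so the example does not depend on how
# a given pandas version infers string columns.
df = pd.DataFrame({
    "PRICE": [10.0, 25.5, 40.0, 55.25],
    "CITY": pd.Series(["Oslo", "Rome", "Oslo", "Lima"], dtype=object),
})

# Numeric columns, as in the question.
numeric_df = df.select_dtypes(include=[np.number])

# String-like columns stored with the object dtype.
object_df = df.select_dtypes(include=["object"])

print(list(numeric_df.columns))
print(list(object_df.columns))
```

Note that 'object' matches every object-dtype column, not only strings, so a column holding mixed Python objects would be selected as well.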
Upvotes: 1
Reputation: 291
I have version 0.23.4 of pandas installed, and this is the help documentation from select_dtypes():
Notes
- To select all numeric types, use np.number or 'number'
- To select strings you must use the object dtype, but note that this will return all object dtype columns
- See the numpy dtype hierarchy: http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html
- To select datetimes, use np.datetime64, 'datetime' or 'datetime64'
- To select timedeltas, use np.timedelta64, 'timedelta' or 'timedelta64'
- To select Pandas categorical dtypes, use 'category'
- To select Pandas datetimetz dtypes, use 'datetimetz' (new in 0.20.0) or 'datetime64[ns, tz]'
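The second-to-last is the one you're looking for: use 'category'. Passing it as a bare string or wrapped in a list, ['category'], should select the same columns; what matters is that the columns actually have the category dtype. Object columns are not matched until they are converted, for example with astype('category').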
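Here is a minimal sketch of selecting by 'category', assuming made-up data (the values and the CITY column are hypothetical). It also compares the bare-string and list forms of include; the bare-string form needs a pandas version that accepts a scalar there. In the versions I am aware of, the two forms select the same columns. The binning step reuses pd.cut from the question, with an arbitrary bins=3.

```python
import pandas as pd

# Hypothetical data; CITY starts out as an object column.
df = pd.DataFrame({
    "PRICE": [10.0, 25.5, 40.0, 55.25],
    "CITY": pd.Series(["Oslo", "Rome", "Oslo", "Lima"], dtype=object),
})

# Before any conversion, no column has the category dtype,
# so the selection comes back with zero columns.
before = df.select_dtypes(include="category")

# Convert the object column to category, and bin the continuous column.
# pd.cut returns a categorical, so price_bins is a category column too.
df["CITY"] = df["CITY"].astype("category")
df["price_bins"] = pd.cut(df["PRICE"], bins=3)

# Bare string versus list: compare what each form selects.
scalar_form = df.select_dtypes(include="category")
list_form = df.select_dtypes(include=["category"])

print(before.shape)
print(list(scalar_form.columns))
print(list(list_form.columns))
```

If equal-width bins leave some intervals nearly empty, pd.qcut is an alternative that bins by quantiles instead; which one fits better depends on how the prices are distributed.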
Upvotes: 0