Neil

Reputation: 8247

How to get numeric column names in a pandas DataFrame

I have a pandas DataFrame with object, int64 and float64 columns. I want to get the names of the int64 and float64 columns. I am using the following command, but it does not seem to work:

cat_num_prv_app = [num for num in list(df.columns) if isinstance(num, (np.int64,np.float64))]

These are my datatypes:

 df.info()
 <class 'pandas.core.frame.DataFrame'>
 RangeIndex: 1670214 entries, 0 to 1670213
 Data columns (total 37 columns):
 ID               1670214 non-null int64
 NAME             1670214 non-null object
 ANNUITY          1297979 non-null float64
 AMOUNT           1670214 non-null float64
 CREDIT           1670213 non-null float64

I want to store the column names ID, ANNUITY, AMOUNT and CREDIT in a variable that I can use later to subset the DataFrame.

Upvotes: 21

Views: 29638

Answers (2)

Teoretic

Reputation: 2533

An alternative solution using np.where (less readable than the accepted select_dtypes answer, though):

df.iloc[:, (np.where((df.dtypes == np.int64) | (df.dtypes == np.float64)))[0]].columns

Sample code:

import pandas as pd
import numpy as np

df = pd.DataFrame({"A": [1, 2, 3], "B": [1.0, 2.0, 3.0], "C": ["a", "b", "c"]})

print(df.iloc[:, (np.where((df.dtypes == np.int64) | 
                 (df.dtypes == np.float64)))[0]].columns)

> Index(['A', 'B'], dtype='object')
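The same dtype comparison can also be used directly as a boolean mask, without np.where or iloc. This is only a minimal sketch on the sample frame above; the names mask and numeric_cols are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [1.0, 2.0, 3.0], "C": ["a", "b", "c"]})

# Boolean Series indexed by column name: True where the dtype is int64 or float64.
mask = (df.dtypes == np.int64) | (df.dtypes == np.float64)

# Filter the column labels with the mask (converted to a plain boolean array).
numeric_cols = df.columns[mask.to_numpy()]
print(list(numeric_cols))
```

Like the np.where version, this matches int64 and float64 only, so columns with other numeric dtypes such as int32 would be left out.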

Upvotes: 0

jezrael

Reputation: 863301

Use select_dtypes with np.number to select all numeric columns:

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4.5,5,4,5,5,4],
                   'C':[7.4,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':list('aaabbb')})

print (df)
   A    B    C  D  E
0  a  4.5  7.4  1  a
1  b  5.0  8.0  3  a
2  c  4.0  9.0  5  a
3  d  5.0  4.0  7  b
4  e  5.0  2.0  1  b
5  f  4.0  3.0  0  b

print (df.dtypes)
A     object
B    float64
C    float64
D      int64
E     object
dtype: object

cols = df.select_dtypes([np.number]).columns
print (cols)
Index(['B', 'C', 'D'], dtype='object')
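Since the question asks for a variable that can be reused to subset the DataFrame, here is a short sketch of that step on the same sample data. The names col_names and numeric_df are illustrative only, and the string alias 'number' is used as an equivalent of np.number.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4.5, 5, 4, 5, 5, 4],
                   'C': [7.4, 8, 9, 4, 2, 3],
                   'D': [1, 3, 5, 7, 1, 0],
                   'E': list('aaabbb')})

# Index of numeric column labels, as in the answer above.
cols = df.select_dtypes(include=[np.number]).columns

# Plain Python list of names, if a list is preferred over an Index.
col_names = cols.tolist()

# Reuse the stored names to subset the original DataFrame.
numeric_df = df[cols]

# The string alias 'number' selects the same columns as np.number.
same_cols = df.select_dtypes(include='number').columns.tolist()
```

Here df[cols] and df[col_names] give the same result, so either form can be stored for later use.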

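It is also possible to restrict the selection to specific dtypes, here float64 and int64 (column D is cast to int32 below, so it is not selected):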

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4.5,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':list('aaabbb')})

df['D'] = df['D'].astype(np.int32)
print (df.dtypes)
A     object
B    float64
C      int64
D      int32
E     object
dtype: object

cols = df.select_dtypes([np.int64,np.float64]).columns
print (cols)
Index(['B', 'C'], dtype='object')
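As for why the list comprehension in the question returns nothing: it iterates over df.columns, so isinstance is applied to the column labels (strings such as 'ID'), not to the data in each column. If a comprehension is preferred, one possible fix is to test each column's dtype instead. The sketch below uses a small made-up frame with column names borrowed from the question, and pandas.api.types.is_numeric_dtype as the check.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Small made-up frame; only the column names come from the question.
df = pd.DataFrame({"ID": [1, 2],
                   "NAME": ["a", "b"],
                   "ANNUITY": [1.5, None],
                   "AMOUNT": [10.0, 20.0]})

# The question's approach: the labels are str, so nothing matches.
by_label = [c for c in df.columns if isinstance(c, (np.int64, np.float64))]

# Testing the dtype of each column instead of its label.
by_dtype = [c for c in df.columns if is_numeric_dtype(df[c])]

print(by_label)
print(by_dtype)
```

One difference to keep in mind: is_numeric_dtype also reports boolean columns as numeric, which select_dtypes with np.number does not.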

Upvotes: 31
