amit parajapat

Reputation: 1

Why can't I print the column names that have dtype == 'object'?

import pandas as pd

train = pd.read_csv("https://datahack.analyticsvidhya.com/media/workshop_train_file/train_gbW7HTd.csv")

train[train.dtypes=='object']
IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

Upvotes: 0

Views: 1410

Answers (2)

MaxU - stand with Ukraine

Reputation: 210872

You can use the DataFrame.select_dtypes() method:

train.select_dtypes(['object'])

To select all non-numeric columns (strings, datetimes, etc.):

train.select_dtypes(exclude='number')

Demo:

In [92]: train.select_dtypes(['object']).head(2)
Out[92]:
          Workclass  Education      Marital.Status       Occupation   Relationship   Race   Sex Native.Country  \
0         State-gov  Bachelors       Never-married     Adm-clerical  Not-in-family  White  Male  United-States
1  Self-emp-not-inc  Bachelors  Married-civ-spouse  Exec-managerial        Husband  White  Male  United-States

  Income.Group
0        <=50K
1        <=50K

In [93]: train.select_dtypes(exclude='number').head(2)
Out[93]:
          Workclass  Education      Marital.Status       Occupation   Relationship   Race   Sex Native.Country  \
0         State-gov  Bachelors       Never-married     Adm-clerical  Not-in-family  White  Male  United-States
1  Self-emp-not-inc  Bachelors  Married-civ-spouse  Exec-managerial        Husband  White  Male  United-States

  Income.Group
0        <=50K
1        <=50K
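
If you only need the column names rather than the data, the same method can be combined with .columns. This is a minimal sketch on a made-up frame (toy and its values are placeholders, not the real training file); the string columns are created with an explicit object dtype so the result does not depend on how your pandas version infers strings:

```python
import pandas as pd

# Hypothetical stand-in for the training data (not the real file)
toy = pd.DataFrame({
    "Age": [39, 50],
    "Workclass": pd.Series(["State-gov", "Self-emp-not-inc"], dtype=object),
    "Hours.Per.Week": [40, 13],
    "Sex": pd.Series(["Male", "Male"], dtype=object),
})

# Names of the object-dtype columns
object_cols = toy.select_dtypes(include=["object"]).columns.tolist()

# Names of the numeric columns, for comparison
numeric_cols = toy.select_dtypes(include="number").columns.tolist()

print(object_cols)
print(numeric_cols)
```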

Upvotes: 1

Ian Thompson

Reputation: 3295

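I think you are looking for .loc. Indexing a DataFrame directly with a boolean Series filters rows, so pandas tries to align the Series with the row index. The result of dtypes == 'object' is indexed by column name instead, which is why the alignment fails. With .loc you can apply the mask along the columns (df here is your DataFrame, train in the question). Try this: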

df.loc[:, df.dtypes == 'object'].head()

Or if you just want the column names:

df.columns[df.dtypes == 'object']
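
To see the difference between the two forms side by side, here is a small sketch on a made-up frame (toy, mask and the other names are only for illustration). It assumes that plain [] indexing with the mask raises, as in the question, and catches the exception generically rather than relying on a specific import path for the error class:

```python
import pandas as pd

# Hypothetical stand-in for the training data (not the real file)
toy = pd.DataFrame({
    "Age": [39, 50, 38],
    "Workclass": pd.Series(["State-gov", "Self-emp-not-inc", "Private"], dtype=object),
    "Sex": pd.Series(["Male", "Male", "Female"], dtype=object),
})

# Boolean Series indexed by column name, not by row label
mask = toy.dtypes == "object"

# toy[mask] is treated as a row filter, so the mask cannot be aligned
raised = False
try:
    toy[mask]
except Exception as exc:
    raised = True
    print(type(exc).__name__)

# .loc with ':' for the rows applies the mask along the columns
object_frame = toy.loc[:, mask]

# The column Index accepts the same mask when only the names are needed
object_names = toy.columns[mask].tolist()
print(object_names)
```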

Upvotes: 1
