user1318135

Reputation: 717

Pandas .describe() only returning 4 statistics on int dataframe (count, unique, top, freq)... no min, max, etc

Why could this be? My data seems simple: a one-column DataFrame of ints. Yet .describe() returns only count, unique, top, and freq, not min, max, and the other expected statistics.

(Note: .describe() works as expected in my other projects and datasets.)

Upvotes: 11

Views: 12193

Answers (3)

Adem Ben Chaabene

Reputation: 11

Try converting your features to numeric values so that .describe() returns all the statistics you need:

df1['age'] = pd.to_numeric(df1['age'], errors='coerce')
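As a minimal sketch of what that conversion does, assume a hypothetical 'age' column that was read in as strings and contains one entry that cannot be parsed. With errors='coerce', that entry becomes NaN instead of raising, so it is worth counting the missing values afterwards:

```python
import pandas as pd

# Hypothetical data (not from the question): ages stored as strings,
# with one value that cannot be parsed as a number.
df1 = pd.DataFrame({"age": ["25", "31", "n/a", "40"]}, dtype=object)

# errors='coerce' turns unparseable entries into NaN instead of raising.
df1["age"] = pd.to_numeric(df1["age"], errors="coerce")

# Count how many values were lost in the conversion.
n_missing = int(df1["age"].isna().sum())
print("coerced to NaN:", n_missing)

# The column is numeric now, so describe() reports mean, std, min, max, etc.
age_stats = df1["age"].describe()
print(age_stats)
```

Note that the NaN forces the column to a float dtype, and describe() excludes it from count.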

Upvotes: 1

miguelfg

Reputation: 1524

Try:

df.agg(['count', 'nunique', 'min', 'max'])

You can add or remove the different aggregation functions to that list. And when I have quite a few columns I personally like to transpose it:

df.agg(['count', 'nunique', 'min', 'max']).transpose()

There are different ways to restrict the aggregations to a subset of columns.

  • By column name containing a word, for example 'ID':

    df.filter(like='ID').agg(['count', 'nunique'])

  • By type of data:

    df.select_dtypes(include=['int']).agg(['count', 'nunique'])

    df.select_dtypes(exclude=['float64']).agg(['count', 'nunique'])
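The calls above can be tried end to end on a small frame. This is only a sketch: the column names 'user_ID' and 'score' are made up for illustration, and the integer column is selected with the explicit dtype name 'int64' (an assumption that the column has that dtype, which the code enforces) rather than 'int'.

```python
import pandas as pd

# Hypothetical frame: one integer ID column and one float column.
df = pd.DataFrame({
    "user_ID": [1, 2, 2, 3],
    "score": [0.5, 1.5, 1.5, 2.0],
}).astype({"user_ID": "int64", "score": "float64"})

# One row per column, one column per aggregation.
summary = df.agg(["count", "nunique", "min", "max"]).transpose()
print(summary)

# Only columns whose name contains 'ID'.
id_summary = df.filter(like="ID").agg(["count", "nunique"])
print(id_summary)

# Only columns of a given dtype.
int_summary = df.select_dtypes(include=["int64"]).agg(["count", "nunique"])
print(int_summary)
```

Unlike describe(), agg() applies exactly the functions listed, regardless of how pandas classifies the column.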

Upvotes: 2

frist

Reputation: 1958

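It seems pandas doesn't recognize your data as int. The column probably has dtype object (check df.dtypes), and for object columns .describe() reports only count, unique, top, and freq.

Try converting it explicitly: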

print(df.astype(int).describe())
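To see the difference, here is a small reproduction. It assumes the integers ended up stored as strings in an object column; the column name 'value' and the data are invented for the example.

```python
import pandas as pd

# Hypothetical data: integers stored as strings in an object column.
df = pd.DataFrame({"value": ["3", "1", "2", "3"]}, dtype=object)
print(df.dtypes)

# For an object column, describe() gives the categorical-style summary.
object_stats = df.describe()
print(list(object_stats.index))   # ['count', 'unique', 'top', 'freq']

# After casting to int, describe() gives the numeric summary.
numeric_stats = df.astype(int).describe()
print(list(numeric_stats.index))  # ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']
```

Keep in mind that astype(int) raises an error if any value cannot be converted, which is at least a quick way to find out whether the column contains something unexpected.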

Upvotes: 18
