LoveMeow

Reputation: 1181

Pandas reporting series to be an object when it's a decimal

I need an automated, reliable way to find the data type of each column in a pandas data frame. I have been using the .dtype attribute, but I have noticed something unexpected with it.

Consider this column from a 10-row data frame:

df['a']
Out[6]: 
0    250.00
1    750.00
2      0.00
3      0.00
4      0.00
5      0.00
6      0.00
7      0.00
8      0.00
9      0.00
Name: a, dtype: object

type(df['a'][0])
Out[9]: decimal.Decimal

Why is the dtype of the entire column an 'object' when each entry is a decimal? I really need it to say decimal or float or something numeric. Any help would be appreciated!

Upvotes: 4

Views: 2836

Answers (1)

EdChum

Reputation: 394101

This is not an error; it follows from how numpy represents dtypes: https://docs.scipy.org/doc/numpy/reference/arrays.scalars.html

Basically, because Decimal is not one of numpy's principal built-in scalar types, the column's dtype ends up being object, even though the actual type of each cell is still Decimal.
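A minimal sketch that reproduces this, assuming a small made-up Series (the values here are illustrative, not the asker's full data):

```python
from decimal import Decimal

import pandas as pd

# Hypothetical sample resembling the column in the question.
s = pd.Series([Decimal("250.00"), Decimal("750.00"), Decimal("0.00")], name="a")

# The column-level dtype is the generic "object" container type.
column_dtype = str(s.dtype)

# The individual cells are still Decimal instances.
element_types = {type(value) for value in s}

print(column_dtype)
print(element_types)
```

The dtype describes how the column is stored, not what Python type each cell happens to hold, so checking the elements is the only way to see that they are Decimal.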

Where possible, it is advisable to use the built-in scalar types, in this case float64, because arithmetic operations on an object column are unlikely to be vectorised even though the values themselves are numerical.
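One way to get a numeric dtype is to cast the column, sketched here on a made-up frame. Note that this assumes losing Decimal's exact arithmetic in favour of binary floating point is acceptable for your data:

```python
from decimal import Decimal

import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical frame with a Decimal column stored as object.
df = pd.DataFrame({"a": [Decimal("250.00"), Decimal("750.00"), Decimal("0.00")]})

# Cast each Decimal to a 64-bit float; the result is a native numeric column.
as_float = df["a"].astype("float64")

# An automated check for "is this column numeric?"
numeric_before = is_numeric_dtype(df["a"])
numeric_after = is_numeric_dtype(as_float)

print(as_float.dtype)
print(numeric_before, numeric_after)
```

After the cast, operations such as sum() or mean() run on a float64 array rather than looping over Python objects.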

The same is observed when you store str or datetime.date values: the dtype is object for these as well.
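A small sketch of the datetime.date case, using made-up dates. The string case is left out because I believe its default dtype depends on the pandas version, and the exact datetime64 resolution after conversion may vary too:

```python
import datetime

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical column of plain Python date objects.
dates = pd.Series([datetime.date(2020, 1, 1), datetime.date(2020, 1, 2)])

# Stored as generic Python objects, so the dtype is object.
dates_dtype = str(dates.dtype)

# Converting gives a native datetime64 column instead.
converted = pd.to_datetime(dates)
is_native_datetime = is_datetime64_any_dtype(converted)

print(dates_dtype)
print(converted.dtype)
```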

Upvotes: 8
