Reputation: 3720
I have a dataframe with many (But not all) Decimal128 columns (taken from a mongodb collection). I can't perform any math or comparisons on them (e.g. '<' not supported between instances of 'Decimal128' and 'float').
What is the quickest/easiest way to convert all these to float or some simpler built-in type that i can work with?
There is the Decimal128 to_decimal() method, and pandas astype(), but how can I do it for all (the decimal128) columns in one step/helper method?
Edit, I've tried:
testdf = my_df.apply(lambda x: x.astype(str).astype(float) if isinstance(x, Decimal128) else x)
testdf[testdf["MyCol"] > 80].head()
but I get:
TypeError: '>' not supported between instances of 'Decimal128' and 'int'
Converting a single column using .astype(str).astype(float) works.
Upvotes: 1
Views: 19395
Reputation: 4792
Just use:
df = df.astype(float)
You can also use apply or applymap(applying element wise operations), although these are inefficient compared to previous method.
df = df.applymap(float)
I can't reproduce a Decimal128 number in my system. Can you please check if the next line works for you?
df = df.apply(lambda x: x.astype(float) if isinstance(x, bson.decimal.Decimal128) else x)
It will check if a column is of type Decimal128 and then convert it to float.
Upvotes: 1
Reputation: 1824
Casting full DataFrame.
df = df.astype(str).astype(float)
For single column. IDs is the name of the column.
df["IDs"] = df.IDs.astype(str).astype(float)
Test implementation
from pprint import pprint
import bson
df = pd.DataFrame()
y = []
for i in range(1,6):
i = i *2/3.5
y.append(bson.decimal128.Decimal128(str(i)))
pprint(y)
df["D128"] = y
df["D128"] = df.D128.astype(str).astype(float)
print("\n", df)
Output:
[Decimal128('0.5714285714285714'),
Decimal128('1.1428571428571428'),
Decimal128('1.7142857142857142'),
Decimal128('2.2857142857142856'),
Decimal128('2.857142857142857')]
D128
0 0.571429
1 1.142857
2 1.714286
3 2.285714
4 2.857143
Upvotes: 5