Reputation: 1249
I have a DataFrame with a column of dtype float16, whose largest finite value is 65504. When I call sum() in pandas to add up all the values in this column, I get "inf" because the total exceeds that range.
This is a sample of the input data and output of sum:
Since the data type of the value returned by sum() automatically follows the data type of the column, is there any way to change the dtype used for the sum in pandas so that the result is not infinite?
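For reference, the overflow can be reproduced with NumPy alone. This is a minimal sketch, not the asker's data: the two values of 35000.0 are an assumption chosen only because their total exceeds the float16 range.

```python
import numpy as np

# Largest finite value representable in float16.
f16_max = float(np.finfo(np.float16).max)

# Illustrative values (assumed, not the original data).
values = np.array([35000.0, 35000.0], dtype=np.float16)

# Summing with a float16 result overflows; silence the overflow warning.
with np.errstate(over="ignore"):
    total_f16 = values.sum()

# Asking for a wider accumulator keeps the result finite.
total_f64 = values.sum(dtype=np.float64)

print(f16_max)
print(total_f16, total_f64)
```

The first sum comes back as float16 and is infinite, while the second is a finite float64.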
Upvotes: 1
Views: 8422
Reputation: 8816
There is no built-in solution so far; a possible workaround is the one suggested by @Anton vBR. This is a known bug: running reductions on DataFrame columns of dtype float16 runs into surprising behaviour.
[There is already a bug opened for this on GitHub](https://github.com/pandas-dev/pandas/issues/22841)
Upvotes: 2
Reputation: 18916
The first thing that comes to mind is to pass a dtype=np.float64 parameter:
df.sum(axis=1,dtype=np.float64)
However this returns a ValueError:
ValueError: the 'dtype' parameter is not supported in the pandas implementation of sum()
Possible workaround:
Use np.sum() from NumPy, the library that pandas is built on, instead, and pass dtype:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'col1': [35000.0, 35000.0],
'col2': [35000.0, 35000.0]
})
df['col1'] = df['col1'].astype(np.float16)
df['col2'] = df['col2'].astype(np.float16)
#print(df.sum(axis=1)) # --> results in inf
#print(df.sum(axis=1,dtype=np.float64)) # --> results in error message
print(np.sum(df.values, dtype=np.float64, axis=1)) # --> works
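Another option that stays within pandas is to upcast the data before summing, so the reduction itself runs in a wider dtype. This is a sketch using the same illustrative frame as above; it assumes the temporary float64 copy made by astype() fits in memory.

```python
import numpy as np
import pandas as pd

# Same illustrative data as above, stored as float16.
df = pd.DataFrame({
    "col1": [35000.0, 35000.0],
    "col2": [35000.0, 35000.0],
}).astype(np.float16)

# Upcast first, then reduce: the sum is computed and returned as float64.
row_sums = df.astype(np.float64).sum(axis=1)
col_sums = df.astype(np.float64).sum(axis=0)

print(row_sums)
print(col_sums)
```

Unlike np.sum(df.values, ...), this returns a pandas Series that keeps the index or column labels. Note that 35000.0 is not exactly representable in float16, so the totals are close to, but not exactly, 70000.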
Upvotes: 2