biendltb

Reputation: 1249

pandas: sum() return the infinite value

I have a DataFrame with a column of dtype float16, whose largest finite value is 65504. When I call sum() in pandas to add up all the values in this column, I get "inf" because the total exceeds that range.
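To make the overflow concrete, here is a minimal NumPy-only sketch. The two sample values are an assumption chosen for illustration, not the actual data from the question. Note that 35000 is not exactly representable in float16 and is stored as 35008.

```python
import numpy as np

# Largest finite value a float16 can hold.
info = np.finfo(np.float16)
print(info.max)  # 65504.0

# Hypothetical sample values; each is stored as 35008.0 in float16.
values = np.array([35000.0, 35000.0], dtype=np.float16)

# Accumulating in float16 overflows, because 70016 > 65504.
with np.errstate(over="ignore"):
    total16 = values.sum()

# Accumulating in float64 stays finite.
total64 = values.sum(dtype=np.float64)

print(total16, total64)
```

The result is inf whenever the accumulator itself is float16, regardless of how small the individual values are.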

This is a sample of the input data and output of sum:

[image: input sample and output]

Since the data type of the value returned by sum() automatically follows the data type of the column, is there any way to convert the result of sum() in pandas so that it does not come out as an infinite value?

Upvotes: 1

Views: 8422

Answers (2)

Karn Kumar

Reputation: 8816

There is no solution so far; a possible workaround is the one suggested by @Anton vBR. However, there is already a bug report about this: running reductions on DataFrame columns of dtype float16 runs into surprising behaviour:

[A bug is already open for this on GitHub](https://github.com/pandas-dev/pandas/issues/22841)

Upvotes: 2

Anton vBR

Reputation: 18916

The first thing that comes to mind is to pass a dtype=np.float64 parameter.

df.sum(axis=1,dtype=np.float64)

However, this raises a ValueError:

ValueError: the 'dtype' parameter is not supported in the pandas implementation of sum()


Possible workaround:

Use np.sum() from NumPy, the library that pandas is built on, and pass dtype to it.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [35000.0, 35000.0],
    'col2': [35000.0, 35000.0]
})

df['col1'] = df['col1'].astype(np.float16)
df['col2'] = df['col2'].astype(np.float16)

#print(df.sum(axis=1)) # --> results in inf 
#print(df.sum(axis=1,dtype=np.float64)) # --> results in error message
print(np.sum(df.values, dtype=np.float64, axis=1)) # --> works
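
Another option, sketched here as an alternative rather than anything the answers above propose, is to upcast to float64 before summing. This stays within pandas and keeps the index and labels on the result. The variable names row_sums and col_sum are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample frame as above, stored as float16.
df = pd.DataFrame({
    "col1": [35000.0, 35000.0],
    "col2": [35000.0, 35000.0],
}).astype(np.float16)

# Upcast first, then reduce: the sum is accumulated in float64.
row_sums = df.astype(np.float64).sum(axis=1)

# The same idea applied to a single column.
col_sum = df["col1"].astype(np.float64).sum()

print(row_sums)
print(col_sum)
```

The totals come out as 70016 rather than 70000 because float16 stores 35000 as 35008. Upcasting avoids the overflow but cannot restore precision already lost when the data was stored as float16.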

Upvotes: 2
