Pandas dataframe with int8 column showing inconsistent arithmetic (sum and product)

Question

I have a dataframe with an int8 column to keep memory usage low.

In [1]: df = pd.DataFrame({'a': [100, 50]}, dtype='int8')
        df
Out[1]:
     a
0  100
1   50

In [2]: df.dtypes
Out[2]: a    int8
        dtype: object

sum automatically promotes the result to int64 and returns the correct value.

In [3]: df.sum()
Out[3]:
a    150
dtype: int64

But the + and * operators do not: the result stays int8 and overflows.

In [4]: df.loc[0, 'a'] + df.loc[1, 'a']
C:\Users\bubai\AppData\Local\Temp\ipykernel_33164\1219674856.py:1: RuntimeWarning: overflow encountered in byte_scalars
  df.loc[0, 'a'] + df.loc[1, 'a']
Out[4]: -106

In [5]: df['a'] * 4
Out[5]: 0   -112
        1    -56
        Name: a, dtype: int8
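
For reference, the negative numbers above are what the exact results look like after wrapping into the int8 range of -128 to 127 (two's complement, i.e. arithmetic modulo 256). A minimal pure-Python sketch of that wraparound follows; `wrap_int8` is a hypothetical helper written for illustration, not a pandas or NumPy function.

```python
def wrap_int8(value):
    """Map an exact integer onto the value an int8 would store (modulo 256)."""
    # Shift into 0..255, reduce modulo 256, then shift back to -128..127.
    return (value + 128) % 256 - 128

print(wrap_int8(100 + 50))  # -106, matches Out[4]
print(wrap_int8(100 * 4))   # -112, matches row 0 of Out[5]
print(wrap_int8(50 * 4))    # -56, matches row 1 of Out[5]
```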

So in one place pandas automatically upcasts the result, whereas in others it does not. Is this an inconsistency in pandas, or non-standard coding on my end? If my code contains such arithmetic operations, how can I avoid the incorrect results?
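
One possible workaround, sketched here under the assumption that a temporary int64 copy of the column is affordable: widen the values with `astype` just before the arithmetic, so the stored column stays int8. The names `wide`, `total`, and `scaled` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'a': [100, 50]}, dtype='int8')

# Widen to int64 first so the arithmetic cannot overflow the int8 range.
wide = df['a'].astype('int64')

total = wide.iloc[0] + wide.iloc[1]  # scalar addition done in int64
scaled = wide * 4                    # element-wise product done in int64

print(total)            # 150
print(scaled.tolist())  # [400, 200]
print(df['a'].dtype)    # int8 (the original column is unchanged)
```

The trade-off is that the widened copy briefly uses eight times the memory of the int8 column, so this only helps where the intermediate result is short-lived or small.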
