pandas DataFrame shows ints as floats

Question

A pandas DataFrame displays my ints as floats, but I would like them displayed as ints.

X_train = train.iloc[:, 1:].values.astype('float32')
y_train = train.iloc[:, 0].values.astype('uint8')
X = test.values.astype('float32')

So the dtypes are 'float32', 'uint8' and 'float32'.

I show the min and max values of X_train, y_train and X in a DataFrame (in a Jupyter Notebook):

pd.DataFrame([[np.amin(X_train), np.amax(X_train)], 
              [np.amin(y_train), np.amax(y_train)], 
              [np.amin(X), np.amax(X)]], 
             columns = ['min', 'max'], 
             index = ['X_train', 'y_train', 'X'])

Output:

        min max
X_train 0.0 255.0
y_train 0.0 9.0
X       0.0 255.0

But I would expect:

        min max
X_train 0.0 255.0
y_train 0   9
X       0.0 255.0

But printing the value directly:

print(np.amax(y_train))

This prints 9 (not 9.0).

Any suggestions?

piRSquared · Accepted Answer

pandas assigns dtypes per column, so every value in a column shares one dtype. Here each column mixes floats and ints, and pandas up-casts the ints to float so that the whole column can be float rather than falling back to dtype object.

df = pd.DataFrame([
    [0., 255.],
    [0, 9],
    [0., 255.]
])

df

     0      1
0  0.0  255.0
1  0.0    9.0
2  0.0  255.0

df.dtypes

0    float64
1    float64
dtype: object

Use dtype=object to retain the individual types.

df = pd.DataFrame([
    [0., 255.],
    [0, 9],
    [0., 255.]
], dtype=object)

df

   0    1
0  0  255
1  0    9
2  0  255

df.dtypes

0    object
1    object
dtype: object

df.applymap(type)

                 0                1
0  <class 'float'>  <class 'float'>
1    <class 'int'>    <class 'int'>
2  <class 'float'>  <class 'float'>
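Applied to the frame from the question, the same idea might look like the sketch below. The three arrays are small hypothetical stand-ins, since the real train and test data are not shown; only the dtypes match the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's arrays (the real data is not shown).
X_train = np.array([[0, 255], [12, 34]], dtype='float32')
y_train = np.array([0, 9, 3], dtype='uint8')
X = np.array([[0, 255], [7, 8]], dtype='float32')

report = pd.DataFrame(
    [[np.amin(X_train), np.amax(X_train)],
     [np.amin(y_train), np.amax(y_train)],
     [np.amin(X), np.amax(X)]],
    columns=['min', 'max'],
    index=['X_train', 'y_train', 'X'],
    dtype=object,  # keep each cell as its own scalar instead of up-casting the column
)

print(report)
print(report.dtypes)  # both columns are reported as object
```

Each cell keeps the NumPy scalar type it was created with, so the y_train row holds integers while the other rows hold floats.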

I'd only use this for reporting purposes. Object columns lose many of pandas' efficiencies in further calculations, so for anything beyond display I'd spend time rearranging your data.
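One possible rearrangement, offered as a sketch rather than the only option, is to give each array its own column and use min and max as the index. Each column then holds a single dtype and nothing needs up-casting. The arrays are again hypothetical stand-ins for the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's arrays (the real data is not shown).
X_train = np.array([[0, 255], [12, 34]], dtype='float32')
y_train = np.array([0, 9, 3], dtype='uint8')
X = np.array([[0, 255], [7, 8]], dtype='float32')

# One column per array, so each column keeps a single dtype of its own.
summary = pd.DataFrame(
    {
        'X_train': np.array([X_train.min(), X_train.max()]),
        'y_train': np.array([y_train.min(), y_train.max()]),
        'X': np.array([X.min(), X.max()]),
    },
    index=['min', 'max'],
)

print(summary)
print(summary.dtypes)  # y_train stays an integer column; the others stay float
```

Note that transposing this frame back to the question's layout would put ints and floats in the same columns again, so the up-casting would be expected to return.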
