Quan Hoang

Reputation: 93

Unexpected type of values when using `df.loc`

Given a pandas DataFrame as follows:

# python 3.8.2
import pandas as pd # 1.0.5

df = pd.DataFrame({'x': [0.5], 'y': [1]})

When I check the types of the two columns, they are float64 and int64, as expected.

print(df.dtypes)
# x    float64
# y      int64
# dtype: object

However, when I extract the value of the y column from a row, I get two different types depending on how I use df.loc.

#1
_, y_val = df.loc[0, ['x', 'y']]
print(type(y_val)) # <class 'float'>, it is unexpected

#2
y_val = df.loc[0, 'y']
print(type(y_val)) # <class 'numpy.int64'>

I believe the x column in the DataFrame causes the difference, but I don't know why. Also, is it possible to use the #1 syntax and still get the y value as an integer?

Any help would be welcome. Thanks in advance.

Upvotes: 2

Views: 457

Answers (1)

user13893607


loc returns a pandas Series when you select a single row with a list of columns. A Series has exactly one dtype, so when the selected values mix types (int64 and float64 in this case), the integers are upcast to float64 to fit a common dtype.
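
The upcast can be checked directly by inspecting the dtype of the row selection. This is only a small sketch that reuses the DataFrame from the question; the variable name row is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'x': [0.5], 'y': [1]})

# Selecting one row with a list of columns gives a Series,
# and a Series can only hold a single dtype.
row = df.loc[0, ['x', 'y']]

print(type(row).__name__)  # Series
print(row.dtype)           # float64: the int64 value was upcast
print(df['y'].dtype)       # int64: the column itself is unchanged
```

The column in the DataFrame keeps its int64 dtype; only the temporary row Series is converted.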

I guess the simplest solution is just to cast y_val to int:

_, y_val = df.loc[0, ['x', 'y']]
y_val = int(y_val)

Or you could select only the y column, so that you would get an integer directly:

y_val = df.loc[0, "y"]
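
If you want to keep the unpacking style of #1 without casting afterwards, two possible approaches are sketched below. Both are suggestions under the assumption that the frame is small enough for the extra conversion not to matter; the names y_a and y_b are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'x': [0.5], 'y': [1]})

# Option A: convert the columns to object dtype first, so the row
# Series stores each value as-is instead of finding a common numeric dtype.
_, y_a = df.astype(object).loc[0, ['x', 'y']]
print(type(y_a))  # an integer type, no longer float

# Option B: look up each scalar separately with .at, which reads
# straight from the column and keeps the column's own dtype.
_, y_b = [df.at[0, col] for col in ['x', 'y']]
print(type(y_b))  # <class 'numpy.int64'>
```

Option A copies the whole frame, so Option B is probably the cheaper choice when only a few values are needed.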

Upvotes: 1
