Matt Camp

Reputation: 1528

Pandas DataFrame column dtype is int64 but a selected row returns float64

I'm trying to figure out why Pandas returns a float when the data field is an int. Is there a way around this? I'm trying to output some CQL commands and this keeps messing me up. Thanks

import pandas as pd

df = pd.DataFrame([[11001, 28154, 2457146.7149722599, 37.070666000000003],
                   [110, 28154, 2457146.7149722599, 37.070666000000003],
                   [1100, 28154, 2457146.7149722599, 37.070666000000003],
                   [110, 28, 2457146.7149722599, 37.070666000000003]])
print("\nNote: the first two fields are int64")
print(df.dtypes)
print("\nPrinting the first record of the first field returns an int... GOOD!")
print(df.iloc[0,0])
print("\nSaving the first row off and printing the first fields data returns a float... BAD!")
row1 = df.iloc[0]
print(row1[0])

Note: the first two fields are int64
0      int64
1      int64
2    float64
3    float64
dtype: object

Printing the first record of the first field returns an int... GOOD!
11001

Saving the first row off and printing the first fields data returns a float... BAD!
11001.0

Upvotes: 2

Views: 1887

Answers (1)

piRSquared

Reputation: 294536

A Series has a single dtype. A DataFrame is a collection of Series: each column is its own Series with its own dtype. df.iloc[0] (or df.loc[0]) grabs a row, which cuts across those columns and is not stored as a Series of its own. Pandas has to build a new Series for it and pick one dtype that can hold every value in the row. Since the other elements of this row are floats, the ints get upcast to float64.
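As for a way around it, here is a minimal sketch of a few options that avoid the upcast. It uses a shortened version of the question's frame, and the variable names (value, row_obj, first) are just illustrative. Which option fits best depends on how the CQL strings are being built, so treat this as a starting point rather than the one right answer.

```python
import pandas as pd

# Shortened version of the frame from the question.
df = pd.DataFrame([[11001, 28154, 2457146.7149722599, 37.070666000000003],
                   [110, 28154, 2457146.7149722599, 37.070666000000003]])

# The row Series needs one dtype for all four values, so it becomes float64.
row = df.iloc[0]
print(row.dtype)

# Option 1: index the scalar directly, so no mixed-dtype row Series is built.
value = df.iloc[0, 0]
print(value)

# Option 2: cast to object first; the row Series then keeps each value's own type.
row_obj = df.astype(object).iloc[0]
print(row_obj.iloc[0])

# Option 3: iterate rows as tuples; each field is taken from its own column.
first = next(df.itertuples(index=False))
print(first[0])
```

Options 2 and 3 are probably the most convenient when every field of a row is needed at once, for example to format one statement per row.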

Upvotes: 3
