skynyrd

Reputation: 982

Pandas converts int values to float in dataframe

I wrote a script that takes a csv file as an input, manipulates the data by using pandas and creates another csv file.

Everything works, but pandas converts integer values to floats by default, e.g.:

in csv before:

5f684ee8-7398-914d-9d87-7b44c37ef081,France,44,72000,No,isBool("true")

in csv after:

E84E685F-9873-4D91-9D87-7B44C37EF081,France,44.0,72000.0,No,True

Here 44 and 72000 are changed to 44.0 and 72000.0

I know how to turn them back into int using apply() on the dataframe, but this script is meant to be generic, so I would rather configure pandas up front.

Basically, I expect pandas not to append .0 when a value has no fractional part.

Thanks.

Upvotes: 6

Views: 8160

Answers (2)

elexis

Reputation: 812

Similar to B. M.'s answer, you can format your floats on output like this:

df.to_csv(float_format="%.10g")

This writes numbers without an exponent as long as they need at most 10 significant digits, so 2,147,483,647 is rendered as 2147483647 and 10^-2 is rendered as 0.01. You will run into issues with big integers (more than 10 digits), as these are rendered in exponent notation instead.
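To see what "%.10g" does before applying it to a real file, the same printf-style format string can be tried with Python's % operator. This is only a quick sketch; the sample values are made up for illustration, and it assumes pandas applies the format string to each float the same way.

```python
# Try the "%.10g" format on a few illustrative floats (values are made up).
fmt = "%.10g"
samples = [44.0, 72000.0, 27.5, 2147483647.0, 0.01, 12345678901.0, 0.00001]

# Map each sample to the text the format string produces for it.
rendered = {value: fmt % value for value in samples}

for value, text in rendered.items():
    print(repr(value), "->", text)
```

Whole-number floats lose the trailing .0 and fractions such as 27.5 are kept, but the 11-digit value and the very small value both come out in exponent notation, which is the limitation mentioned above.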

Upvotes: 2

B. M.

Reputation: 18628

As said in the comments, some operations in pandas can change dtypes. See for example this page.

One solution is:

df.to_csv(float_format="%.0f")

which writes every (false) float with no decimal places, i.e. in integer format. Note that it also rounds any genuinely fractional values.

An example:

In [355]: pd.DataFrame(
     ...:     columns=list(range(6)),
     ...:     data=[['E84E685F-9873-4D91-9D87-7B44C37EF081', 'France', 44.0, 72000, 'No', True]],
     ...: ).to_csv(float_format='%.0f')
Out[355]: ',0,1,2,3,4,5\n0,E84E685F-9873-4D91-9D87-7B44C37EF081,France,44,72000,No,True\n'
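
As an illustration of how a dtype can change, one common cause (not necessarily the one in the question) is a missing value in an integer column, which makes read_csv fall back to float64. The sketch below uses made-up column names and data, and assumes a pandas version that supports the nullable "Int64" dtype; converting to it is an alternative to float_format that does not round real fractions in other columns.

```python
import io

import pandas as pd

# Hypothetical input: the "age" column has one missing value.
raw = "country,age,salary\nFrance,44,72000\nSpain,,48000\n"
df = pd.read_csv(io.StringIO(raw))

# The missing value forces "age" to float64; "salary" stays int64.
dtype_before = df["age"].dtype
print(df.dtypes)

# Convert to the nullable integer dtype so 44.0 is stored as 44 again.
df["age"] = df["age"].astype("Int64")

# Missing values are written as empty fields, integers without ".0".
out = df.to_csv(index=False)
print(out)
```

This only works when every non-missing value in the column is a whole number; otherwise the astype call raises an error instead of silently rounding.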

Upvotes: 2
