user1179317

Reputation: 2903

Pandas to_csv with extra zeroes

When I read a CSV into a DataFrame and then write it back out with to_csv(), some of the values come out with extra decimal digits.

Currently using pandas 1.0.5 and python 3.7

Consider the simple example below:

from io import StringIO
import pandas as pd


d = """ticker,open,close
aapl,108.922,108.583
aapl,109.471,110.25
aapl,113.943,114.752
aapl,117.747,118.825
"""


df = pd.read_csv(StringIO(d), sep=",", header=0, index_col=0)
print(df)
print("\n", df.to_csv())

The output is:

           open    close
ticker                  
aapl    108.922  108.583
aapl    109.471  110.250
aapl    113.943  114.752
aapl    117.747  118.825

ticker,open,close
aapl,108.92200000000001,108.583
aapl,109.471,110.25
aapl,113.943,114.75200000000001
aapl,117.74700000000001,118.825

As you can see, extra digits (a run of zeroes followed by a trailing 1) are added to some values in the to_csv() output. If I pass dtype=str to read_csv, as in df = pd.read_csv(StringIO(d), sep=",", dtype=str, header=0, index_col=0), I get my desired output. However, I want pandas to decide the dtype (int64 or float, depending on the column values) instead of forcing every column to object/str.

Is there a way to eliminate these extra zeroes without forcing the dtype to str?

Upvotes: 0

Views: 795

Answers (1)

pakpe

Reputation: 5479

You can use the float_format argument of to_csv:

from io import StringIO
import pandas as pd

d = """ticker,open,close
aapl,108.922,108.583
aapl,109.471,110.25
aapl,113.943,114.752
aapl,117.747,118.825
"""

df = pd.read_csv(StringIO(d), sep=",", header=0, index_col=0)
# Write every float with exactly three decimal places
df.to_csv('output.csv', float_format='%.3f')

# This is how the output.csv file looks (note that 110.25 is written as 110.250):

ticker,open,close
aapl,108.922,108.583
aapl,109.471,110.250
aapl,113.943,114.752
aapl,117.747,118.825
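
If the padded 110.250 is not wanted, two alternatives may be worth trying. As far as I know, the extra digits come from the fast float converter that read_csv uses by default in pandas 1.0.5, which is not always correctly rounded. The sketch below is not from the original answer: it assumes the C parser engine (the default) and uses the float_precision option of read_csv and a '%.10g' format for to_csv. The names df_rt, csv_round_trip and csv_trimmed are just illustrative.

```python
from io import StringIO
import pandas as pd

d = """ticker,open,close
aapl,108.922,108.583
aapl,109.471,110.25
aapl,113.943,114.752
aapl,117.747,118.825
"""

# Option 1: parse with the round-trip converter, so each float is the
# closest double to the text and is written back as the same text.
df_rt = pd.read_csv(StringIO(d), index_col=0, float_precision="round_trip")
csv_round_trip = df_rt.to_csv()

# Option 2: keep the default parser and trim on write. '%.10g' keeps at
# most 10 significant digits and drops trailing zeroes.
# (10 is an assumption here; pick a precision that suits your data.)
df = pd.read_csv(StringIO(d), index_col=0)
csv_trimmed = df.to_csv(float_format="%.10g")

print(csv_round_trip)
print(csv_trimmed)
```

In both cases the columns stay float64, so pandas still infers the dtype. Option 2 rounds anything beyond the chosen number of significant digits, so it only suits data whose precision is known to fit.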

Upvotes: 2
