Reputation: 2903
I am having an issue: I read a CSV into a DataFrame, and when I write it back out to CSV the values have extra decimal digits.
I am currently using pandas 1.0.5 and Python 3.7.
Consider the simple example below:
from io import StringIO
import pandas as pd
d = """ticker,open,close
aapl,108.922,108.583
aapl,109.471,110.25
aapl,113.943,114.752
aapl,117.747,118.825
"""
df = pd.read_csv(StringIO(d), sep=",", header=0, index_col=0)
print(df)
print("\n", df.to_csv())
The output is:
          open    close
ticker
aapl    108.922  108.583
aapl    109.471  110.250
aapl    113.943  114.752
aapl    117.747  118.825
ticker,open,close
aapl,108.92200000000001,108.583
aapl,109.471,110.25
aapl,113.943,114.75200000000001
aapl,117.74700000000001,118.825
As you can see, the to_csv() output contains extra decimal digits (for example, 108.92200000000001 instead of 108.922). If I pass dtype=str to read_csv, as in df = pd.read_csv(StringIO(d), sep=",", dtype=str, header=0, index_col=0), I get my desired output. However, I want pandas to infer the dtype (int64 or float64, depending on the column values) instead of forcing every column to object/str.
Is there a way to eliminate these extra digits without forcing the dtype to str?
Upvotes: 0
Views: 795
Reputation: 5479
You can use the float_format argument of to_csv:
from io import StringIO
import pandas as pd

d = """ticker,open,close
aapl,108.922,108.583
aapl,109.471,110.25
aapl,113.943,114.752
aapl,117.747,118.825
"""
df = pd.read_csv(StringIO(d), sep=",", header=0, index_col=0)
df.to_csv('output.csv', float_format='%.3f')
# This is how the output.csv file looks:
ticker,open,close
aapl,108.922,108.583
aapl,109.471,110.250
aapl,113.943,114.752
aapl,117.747,118.825
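Note that '%.3f' pads every value to three decimals, so 110.25 is written as 110.250. If that matters, here is a minimal sketch of two alternatives. It assumes the stray digits come from the fast float conversion used by the default C parser in this pandas version, which I believe is the cause but have not verified against the pandas source. The variable names (df_rt, csv_round_trip, csv_general) and the choice of 10 significant digits are only for illustration.

```python
from io import StringIO

import pandas as pd

d = """ticker,open,close
aapl,108.922,108.583
aapl,109.471,110.25
aapl,113.943,114.752
aapl,117.747,118.825
"""

# Option 1: ask read_csv for the round-trip float converter, so each parsed
# value is the closest double to the text and is written back unchanged.
df_rt = pd.read_csv(StringIO(d), index_col=0, float_precision="round_trip")
csv_round_trip = df_rt.to_csv()

# Option 2: keep the default parser and limit the output to 10 significant
# digits (an arbitrary choice); "%g" also drops trailing zeros, so 110.25
# is not padded to 110.250.
df = pd.read_csv(StringIO(d), index_col=0)
csv_general = df.to_csv(float_format="%.10g")

print(csv_round_trip)
print(csv_general)
```

Option 1 is slower to parse on large files but needs no decision about how many digits to keep. Option 2 works regardless of how the values were parsed, at the cost of truncating anything beyond the chosen number of significant digits.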
Upvotes: 2