Reputation: 1677
I am trying to write numeric strings to a CSV and read them back in as a DataFrame later on. However, pandas automatically converts my strings from object type to int64 type on reading.
import pandas
df = pandas.DataFrame({'col1':['00123','00125']})
print(df['col1'].dtype)
df.to_csv('test.csv',index=False)
new_df = pandas.read_csv('test.csv')
print(new_df['col1'].dtype)
object #value of first print
int64 #value of second print
How do I either preserve the dtype on write or prevent the change on read?
EDIT: I noticed that if I use astype('|S') on df, new_df will be of object type, even though df.dtype does not change. This does not seem intuitive to me. If anyone can explain this, I would appreciate it.
df = pandas.DataFrame({'col1':['00123','00125']})
df['col1']=df['col1'].astype('|S')
print(df['col1'].dtype)
df.to_csv('test.csv',index=False)
new_df = pandas.read_csv('test.csv')
print(new_df['col1'].dtype)
object #value of first print
object #value of second print
Upvotes: 0
Views: 76
Reputation: 323226
I would recommend writing that kind of DataFrame to Excel instead:
df.to_excel('test.xlsx',index=False)
Or pass the column types when reading the file:
pd.read_csv('test.csv',dtype = {'col1': object})
Out[346]:
col1
0 00123
1 00125
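For completeness, here is a minimal sketch of the full round trip with and without the dtype argument. It assumes pandas is imported as pd and uses an in-memory buffer in place of test.csv (that substitution is mine, only so the example leaves no file behind).

```python
import io
import pandas as pd

df = pd.DataFrame({'col1': ['00123', '00125']})

# Write to an in-memory buffer (stand-in for test.csv in the question).
buf = io.StringIO()
df.to_csv(buf, index=False)

# Default read: pandas infers integers, so the leading zeros are lost.
buf.seek(0)
inferred = pd.read_csv(buf)

# Read with an explicit dtype for the column: the values stay strings.
buf.seek(0)
kept = pd.read_csv(buf, dtype={'col1': str})

print(inferred['col1'].tolist())  # [123, 125]
print(kept['col1'].tolist())      # ['00123', '00125']
```

A CSV file stores no type information, so the dtype cannot be preserved on write; it has to be stated on read. As for the EDIT in the question, a likely explanation (an assumption, not verified against the asker's setup) is that on Python 3 astype('|S') turns the values into bytes, which get written out as text such as b'00123', and that text can no longer be parsed as an integer.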
Upvotes: 1