Reputation: 3591
I have a pandas DataFrame of shape (455698, 62). I want to save it as a CSV file and load it again later with pandas. For now I do this:
df.to_csv("/path/to/file.csv", index=False, sep="\\", encoding="utf-8")  # saving
df = pd.read_csv("/path/to/file.csv", delimiter="\\", encoding="utf-8")  # loading
and I get a DataFrame with shape (455700, 62): two extra rows. When I check in detail (looking at the unique values in each column), I find that some values have moved to different columns in the process.
I've tried multiple separators and forcing dtype="object", but I can't figure out where the bug is. What should I try?
Upvotes: 1
Views: 2080
Reputation: 210832
Is it possible that some of your strings contain a newline (\n) character?
In that case, I would suggest using quoting when saving your CSV file:
import csv
df.to_csv("/path/to/file.csv", index=False, sep="\\", encoding="utf-8", quoting=csv.QUOTE_NONNUMERIC)
...
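To check whether quoting fixes the problem, here is a minimal sketch of a round trip on a tiny frame. The column names and sample values are made up for illustration, and an in-memory buffer stands in for the real file path; it only assumes that the troublesome cells are strings with an embedded newline.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data: one string cell contains an embedded newline.
df = pd.DataFrame({"id": [1, 2], "text": ["plain", "line one\nline two"]})

# Write to an in-memory buffer (stand-in for the real file),
# quoting every non-numeric field so the newline stays inside its cell.
buffer = io.StringIO()
df.to_csv(buffer, index=False, sep="\\", quoting=csv.QUOTE_NONNUMERIC)

# Read it back with the same separator.
buffer.seek(0)
loaded = pd.read_csv(buffer, sep="\\")

print(df.shape, loaded.shape)
```

If the two shapes match here but not on your real data, comparing the columns of the original and reloaded frames one by one should point to the cells that break the row structure.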
Upvotes: 5