Chan
Chan

Reputation: 4311

How to preserve the format when writing to csv using pandas?

I have a text file like this:

id,name,sex,
1,Sam,M,
2,Ann,F,
3,Peter,
4,Ben,M,

Then, I read the file: df = pd.read_csv('data.csv')

After that, I write it to another file: df.to_csv('new_data.csv', index = False)

Then, I get

id,name,sex,Unnamed: 3
1,Sam,M,
2,Ann,F,
3,Peter,,
4,Ben,M,

You see that there are two commas instead of one in the fourth line.

How to preserve the format when using pd.to_csv?

Upvotes: 0

Views: 2088

Answers (2)

Mohit Motwani
Mohit Motwani

Reputation: 4792

The problem in your code is that you have a comma after the sex column in your file. So read_csv thinks it's a new column, which has no name and data.

df= pd.read_csv('data.csv')
df

id  name    sex Unnamed: 3
0   1   Sam M   NaN
1   2   Ann F   NaN
2   3   Peter   NaN NaN
3   4   Ben M   NaN

Hence you have an extra Unnamed column. So when you write the to_csv, it adds two empty values in the 3rd row and hence why, two ,.

Try:

df = pd.read_csv('data.csv', use_cols = ['id', 'name', 'sex'])
df.to_csv('new_data.csv', index = False)

Upvotes: 1

kabanus
kabanus

Reputation: 26005

pandas is preserving the format - the 3d row has no sex, and as such the csv should have an empty column - that is why you get to commas, since you are separating an empty column.

Your original text file was not a valid csv file.

What you want to do is something else, which is not write a valid csv file - you will have to do this yourself, I do not know of any existing method to create your format.

Upvotes: 1

Related Questions