Oliver

Reputation: 572

Issue while writing/reading a dataframe to csv - in final shape - Python

I am reading a large database into multiple dataframes, which works every time, so I end up with individual dataframes. I then write each dataframe to a csv file. The dataframe initially has 34 columns, but after I read the csv file back into a new dataframe, it has 35 columns.

I did this for writing into the csv file:

df.to_csv(path + "file_01.csv")

And this for reading from it:

import pandas as ps
df = ps.read_csv(path + "file_01.csv")

I check the number of columns with this:

df.shape

Why is this happening, and how can I make it work properly?

Upvotes: 1

Views: 558

Answers (4)

Terry

Reputation: 2811

As the other answers have already explained, the index is being saved along with the data in the .csv file. If the index values are important and need to be kept, you can leave to_csv() unchanged and instead tell read_csv() to treat the first column as the index by adding the parameter index_col=0:

df = ps.read_csv(path + "file_01.csv", index_col=0)
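To see the effect without touching any files, here is a minimal sketch that round-trips a small dataframe through an in-memory buffer. The toy dataframe, the column names a and b, and the variable names are made up for illustration, and it uses the conventional pd alias rather than the ps alias from the question:

```python
import io

import pandas as pd

# Hypothetical toy data standing in for the 34-column dataframe.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Write with the default settings, so the index is saved as the first column.
buf = io.StringIO()
df.to_csv(buf)

# Reading back without index_col turns the saved index into a data column.
buf.seek(0)
naive = pd.read_csv(buf)

# Reading back with index_col=0 uses that first column as the index again.
buf.seek(0)
restored = pd.read_csv(buf, index_col=0)

print(df.shape, naive.shape, restored.shape)
```

The naive read should report one more column than the original, while the read with index_col=0 should match the original shape.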

Upvotes: 1

Andrew Mascillaro

Reputation: 948

When you write to csv in pandas, the index is written by default as an extra column to the left of the data columns. To leave the index out of the csv, you can use the index=False argument.

df.to_csv(path + "file_01.csv", index=False)
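As a quick check of this approach, the following sketch writes a small dataframe to an in-memory buffer with index=False and reads it back. The toy data and variable names are assumptions for illustration only, and pd is used as the alias instead of the question's ps:

```python
import io

import pandas as pd

# Hypothetical toy data; the real dataframe would have 34 columns.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# index=False keeps the index out of the written csv.
buf = io.StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()

# The header row now contains only the data columns.
header = csv_text.splitlines()[0]

# Reading the text back gives the same number of columns as the original.
round_trip = pd.read_csv(io.StringIO(csv_text))

print(header)
print(df.shape, round_trip.shape)
```

This only suits cases where the index carries no information worth keeping, such as a default integer index.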

Upvotes: 1

maede rayati

Reputation: 786

According to the documentation here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html

to_csv writes the index by default, and it shows up as a new column when the file is read back. To disable that, set index=False.

Upvotes: 1

Ash Ishh

Reputation: 578

The default value of the index argument of to_csv is True, which results in an additional index column being exported.

You can use df.to_csv(path + "file_01.csv", index=False) to exclude the index column from the output.

Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html

Upvotes: 1
