mbohde

Reputation: 139

How do I prevent pandas from writing a new column when I save to CSV?

I wrote this code just to show the problem I'm having. I need to save my data to a CSV file and reopen it later, but when I reload the data into a pandas DataFrame, it has an extra unnamed column at the front that I don't want. It breaks .drop_duplicates() because each row now has its own number, and every time I save and reopen the CSV it gains another column of numbers at the front, which makes things worse. How do I stop this column from being written?

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100,4), columns=list('ABCD'))

df.to_csv('data.csv')
print(df.head())
df1 = pd.read_csv('data.csv')

print(df1.head())

Upvotes: 1

Views: 1089

Answers (3)

mbohde

Reputation: 139

The solution was simple. I needed to pass index=False to to_csv:

df.to_csv('data.csv', index=False)

Upvotes: 0

tdelaney

Reputation: 77357

It's the DataFrame index. You can turn that off with

df.to_csv('data.csv', index=False)

The docs are the first stop for learning the options available when writing: pandas.DataFrame.to_csv
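
To see the difference, here is a minimal sketch of the round trip with and without the index. It assumes in-memory buffers (io.StringIO) in place of data.csv so it leaves no files behind; the buffer and variable names are made up for illustration.

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 4), columns=list("ABCD"))

# Default behaviour: the index is written as an unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False: only the four data columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_without_index = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))     # ['Unnamed: 0', 'A', 'B', 'C', 'D']
print(list(reloaded_without_index.columns))  # ['A', 'B', 'C', 'D']
```

The same calls should work with a file path such as 'data.csv' in place of the buffers.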

Upvotes: 1

wasif

Reputation: 15488

While reading, you can drop rows that contain missing values like this:

df = pd.read_csv("data.csv").dropna()
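
If a file has already been saved with the index column, another option on the reading side is to tell read_csv to use that first column as the index, so it no longer shows up as a data column. A minimal sketch, assuming the file was written by to_csv with its default settings; the csv_text sample is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content, shaped like the output of to_csv() with the
# default index=True: the header starts with an empty field for the index.
csv_text = ",A,B\n0,1.0,2.0\n1,3.0,4.0\n"

# index_col=0 treats the first column as the row index instead of data.
fixed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(fixed.columns))  # ['A', 'B']
```

Saving this frame again with index=False should then write only the data columns.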

Upvotes: 0
