David Koh

Reputation: 1

Dataframe exported to CSV turns out different, with data appearing in different columns than it originally was

I'm trying to read a CSV as a dataframe, sort it by a column, and then write the sorted dataframe to a new CSV. However, the output CSV looks nothing like the sorted dataframe: data ends up in the wrong columns. I suspect the problem lies with the data, as some columns contain long strings that might include special characters. When I stripped out certain columns, the steps below did work. I have also tried exporting and reimporting the dataframe in both dictionary and pickle format, and that works perfectly.

First I read in a CSV file and then sort by a column. (The CSV files I used can be downloaded from the comment below; they are under 100 KB in size.)

import pandas as pd

df = pd.read_csv("database.csv", encoding="ISO-8859-1")
sorteddf = df.sort_values(by="All Comment Score")

This shows how the dataframe looks after sorting (what I want)

Then I store my dataframe in a new CSV file and read that new CSV as a new dataframe:

sorteddf.to_csv("test.csv")
newdf = pd.read_csv("test.csv",encoding = "ISO-8859-1")

However, when I read the newly written CSV file as a new dataframe, the columns and the data appear to be a mess. This shows what the dataframe imported from the output CSV actually looks like

I would really appreciate it if someone could shed some light on my problem and point me in the right direction!

Upvotes: 0

Views: 2633

Answers (2)

A.Kot

Reputation: 7903

You have decoding/encoding issues. Your encoding is not "ISO"; it's 'latin-1'. It's hard to fix this unless you figure out why you are reading in your data like this.
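One encoding mismatch that may be relevant to the question's code: to_csv writes UTF-8 by default, while the file is then read back as ISO-8859-1. The sketch below uses made-up data and a temporary file (neither comes from the question) to show what that mismatch does to non-ASCII text. It garbles characters but would not by itself move data between columns, so treat it as one thing to rule out rather than a confirmed cause.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data containing non-ASCII characters.
df = pd.DataFrame({"comment": ["café", "naïve"], "score": [2, 1]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.csv")

    # to_csv writes UTF-8 unless an encoding is given,
    # so reading the file back as ISO-8859-1 mis-decodes it.
    df.to_csv(path, index=False)
    mismatched = pd.read_csv(path, encoding="ISO-8859-1")

    # Writing and reading with the same encoding round-trips cleanly.
    df.to_csv(path, index=False, encoding="ISO-8859-1")
    matched = pd.read_csv(path, encoding="ISO-8859-1")

print(mismatched["comment"].tolist())
print(matched["comment"].tolist())
```

In the mismatched case the text comes back garbled (for example "café" turns into "cafÃ©"), while the matched case returns the original strings.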

Upvotes: 0

Roger Thomas

Reputation: 881

Are you talking about the unnamed column?

Try using sorteddf.to_csv('test.csv', index=False). This tells pandas not to write the built-in index column (most of the time you don't care about this).
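A minimal sketch of the difference, assuming a small made-up dataframe in place of database.csv (the "title" column is hypothetical; "All Comment Score" is the column name from the question). It writes the CSV to an in-memory string instead of a file so it can be run as-is.

```python
import io

import pandas as pd

# Hypothetical stand-in for the question's data.
df = pd.DataFrame({"title": ["b", "a", "c"], "All Comment Score": [3, 1, 2]})
sorteddf = df.sort_values(by="All Comment Score")

# Default behaviour: the index is written as an extra, unnamed first column.
with_index = pd.read_csv(io.StringIO(sorteddf.to_csv()))

# With index=False only the real columns are written.
without_index = pd.read_csv(io.StringIO(sorteddf.to_csv(index=False)))

print(list(with_index.columns))
print(list(without_index.columns))
```

The first read picks up an extra "Unnamed: 0" column holding the old (now shuffled) index values, which shifts everything one column to the right compared with the original dataframe. The second read has only the original columns.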

Upvotes: 1
