Reputation: 143
I am working with dataframes in the pandas library. I have a table of data in Excel that I save as a CSV then I call
df = pd.read_csv("file.csv")
I expect the frame to look something like
Item1 Item2 Item3
0 12.00 3 2
1 4.00 8 4
2 3.14 2 8
But instead I get
Item1 Item2 Item3 Unnamed: 3 Unnamed: 4
0 12.00 3 2 NaN NaN
1 4.00 8 4 NaN NaN
2 3.14 2 8 NaN NaN
Or sometimes extra rows with all NaN
values. It appears that pandas
is not aware of the real size of the CSV. The data in Excel is organized perfectly fine, the data values are all nonempty and are entirely in a rectangle. How do I fix this? Is there an edit I can make to the CSV that will specify the its correct size?
As requested here is a snippet of the data. It goes down to about 2500 rows, and there are no more values to the right.
Upvotes: 3
Views: 2287
Reputation: 1549
You probably have a cell that is not empty (for example a space ) in the original Excel file. If you are getting 2 unnamed columns in pandas, try to delete 2 columns of the original Excel file.
Another way would be to keep all columns that are not unnamed
. You could do this with:
real_cols = [x for x in df.columns if not x.startswith("Unnamed: ")]
df = df[real_cols]
And then you can save the csv.
Upvotes: 1