Reputation: 111
The DataFrame has 906133 rows:
df.shape
(906133, 24)
I tried to save it as a CSV file:
df.to_csv('df.csv',encoding='utf-8-sig',index=False)
Then I read it back:
test_lines = pd.read_csv('df.csv')
However, it now has many more rows:
test_lines.shape
(16512050, 24)
On inspection, the extra lines mostly contain a run of dots (...........) or commas (,,,,,,,,,,,,,,,). If I pass sep='\t' to both the save and the read call, the number of extra lines decreases, but some still remain.
Upvotes: 3
Views: 1754
Reputation: 428
I ran into a similar problem, although I was building the CSV from scratch rather than importing it.
My blank lines disappeared after I used these parameters:
df.to_csv('df.csv', mode='w', encoding='utf-8', index=False, line_terminator='\n')
I suspect line_terminator was the culprit, but the index parameter was also responsible for some extra separators. I hope this helps on your side as well. As @Vishnudev wrote, we do not have your dataset, so we cannot test this. If you share it, we can confirm.
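In case it helps to check this without the real data, here is a minimal round-trip sketch. The sample frame is made up (one field with an embedded newline, one with commas), so it only shows the general idea and not necessarily what happens in your file. It writes through a file handle opened with newline="" so that line endings are not translated a second time, and quotes every field so embedded newlines and commas stay inside their cell. Note that in pandas 1.5 and later the keyword is spelled lineterminator; line_terminator was removed in 2.0.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical sample data, not the asker's dataset:
# one field holds an embedded newline, another holds commas.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "text": ["plain", "line one\nline two", "a,b,c"],
})

path = os.path.join(tempfile.mkdtemp(), "df.csv")

# newline="" prevents Python from translating line endings again on write.
# QUOTE_ALL wraps every field in quotes, so embedded newlines and commas
# are kept inside the field instead of being read as new rows or columns.
with open(path, "w", newline="", encoding="utf-8") as f:
    df.to_csv(f, index=False, quoting=csv.QUOTE_ALL)

# Count the physical lines in the file; this can exceed the number of
# records when a quoted field contains a newline.
with open(path, newline="", encoding="utf-8") as f:
    physical_lines = sum(1 for _ in f)

round_trip = pd.read_csv(path, encoding="utf-8")

print("original shape:  ", df.shape)
print("round-trip shape:", round_trip.shape)
print("physical lines:  ", physical_lines)
```

If the two shapes match while the physical line count is higher than rows plus header, the extra lines in the file are just multi-line fields and not extra records. If the shapes differ on your real data, the unquoted or doubly translated line endings would be the first thing I would look at.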
Upvotes: 1