How to remove duplicate lines without creating empty lines in CSV file?

Question

CSV file:

a,b
a,c
a,d
a,b
a,a

Code widely recommended for removing duplicates:

import fileinput
seen = set()
for line in fileinput.FileInput('1.csv', inplace=1):
    if line in seen: continue

    seen.add(line)
    print(line)

Result obtained:

a,b

a,c

a,d

a,a

Expected result:

a,b
a,c
a,d
a,a

How can I avoid creating these empty lines when removing the duplicates?

anarchy · Accepted Answer

Each line read from the file already ends with a newline, and print() appends another one by default, so every line is written followed by an empty line. To suppress the extra newline, pass end='' to print():

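import fileinput

seen = set()
# With inplace=True, print() output is written back into 1.csv.
for line in fileinput.FileInput('1.csv', inplace=True):
    if line in seen:
        continue  # duplicate: write nothing

    seen.add(line)
    print(line, end='')  # line already ends with '\n'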
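
If you want to see the difference for yourself, here is a small sketch that writes to an in-memory buffer instead of your file. The sample line and the variable names are just for illustration.

```python
import io

# A line as it comes from iterating over a file: the trailing "\n" is kept.
line = "a,b\n"

# Default print(): appends its own "\n" after the one already in the line.
buf_default = io.StringIO()
print(line, file=buf_default)

# print() with end='': nothing is appended, only the line's own "\n" is written.
buf_fixed = io.StringIO()
print(line, end='', file=buf_fixed)

double = buf_default.getvalue()
single = buf_fixed.getvalue()

print(repr(double))  # 'a,b\n\n'
print(repr(single))  # 'a,b\n'
```

The second newline in the first result is what shows up as the empty line in your output file.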

There are many other ways and libraries you can use to achieve this. This post, https://www.py4u.net/discuss/16763, covers the other methods quite well. You can go through them and see which one works best for you.
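
As one possible alternative (not taken from the linked post), here is a minimal sketch without fileinput. It reads the file, keeps the first occurrence of each line, and writes the result back. The names dedupe_lines and dedupe_file are made up for this example. Lines are compared without their line ending, which is an assumption on my part: it means a last line with no trailing newline still counts as a duplicate of an earlier identical line.

```python
def dedupe_lines(lines):
    """Yield each distinct line once, in first-seen order, ending in '\\n'."""
    seen = set()
    for line in lines:
        # Compare without the line ending so "a,b" and "a,b\n" are the same.
        key = line.rstrip("\r\n")
        if key in seen:
            continue
        seen.add(key)
        # Note: this normalizes every line ending to "\n".
        yield key + "\n"


def dedupe_file(path):
    """Rewrite the file at `path` with duplicate lines removed."""
    # Read everything first, because the same file is overwritten afterwards.
    with open(path, newline="") as f:
        unique = list(dedupe_lines(f))
    with open(path, "w", newline="") as f:
        f.writelines(unique)
```

For your file that would be dedupe_file('1.csv'). Since the lines are written with writelines() instead of print(), no extra newline gets added.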
