diffracteD

Reputation: 758

How to remove repeated entries from a column in Python

I have a file like:

q12j4
q12j4
fj45j
q12j4
fjmep
fj45j

Now all I want to do is remove the repeated entries, so that each one appears only once.

I was trying to do it with the defaultdict function, but I think it will not work for strings.
Please help.

Upvotes: 2

Views: 128

Answers (4)

NPE

Reputation: 500227

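def unique(seq):
    # Yield each item the first time it is seen, preserving input order.
    seen = set()
    for val in seq:
        if val not in seen:
            seen.add(val)
            yield val

with open('file.txt') as f:
    # Lines keep their trailing newlines, so join them with an empty string.
    print(''.join(unique(f)))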

As you can see, I've chosen to write a separate generator for removing duplicates from an iterable. This generator, unique(), can be used in lots of other contexts too.
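To illustrate the point about reuse, here is a minimal sketch that applies the same generator to an in-memory list and to a plain string instead of a file. The names entries, deduped and letters are only illustrative, and the sample values are the ones from the question.

```python
def unique(seq):
    """Yield each item of seq the first time it appears, keeping order."""
    seen = set()
    for val in seq:
        if val not in seen:
            seen.add(val)
            yield val

# Sample values taken from the question (no file needed here).
entries = ["q12j4", "q12j4", "fj45j", "q12j4", "fjmep", "fj45j"]
deduped = list(unique(entries))
print(deduped)  # ['q12j4', 'fj45j', 'fjmep']

# Works on any iterable of hashable items, e.g. the characters of a string.
letters = "".join(unique("mississippi"))
print(letters)  # misp
```

Because unique() is a generator, it never builds the full result itself; it only keeps the set of values seen so far.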

Upvotes: 3

sehe

Reputation: 392893

This should be roughly enough:

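with open('file.txt', 'r') as f:
    # A set drops duplicates but does not preserve the original order.
    for line in set(f):
        # Strip the trailing newline so print() does not add a blank line.
        print(line.rstrip('\n'))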
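One thing to keep in mind with a set is that the iteration order is arbitrary. If a predictable order matters, a possible variation (a sketch, not part of the original answer) is to sort the set before printing. The list named lines below stands in for the file contents from the question.

```python
# Stand-in for the lines read from the file in the question.
lines = ["q12j4\n", "q12j4\n", "fj45j\n", "q12j4\n", "fjmep\n", "fj45j\n"]

# Strip newlines, drop duplicates with a set, then sort for a stable order.
unique_sorted = sorted(set(line.rstrip("\n") for line in lines))

for entry in unique_sorted:
    print(entry)
# fj45j
# fjmep
# q12j4
```

This gives alphabetical order rather than the order of first appearance in the file.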

Upvotes: 3

pygabriel

Reputation: 10008

You should use the itertools.groupby function. For an example of its usage, look at the standard library documentation or this related question: How do I use Python's itertools.groupby()?

Assume that toorder is your list with repeated entries:

import itertools
toorder = ["a", "a", "b", "a", "b", "c"]

# groupby only merges adjacent equal items, so sort the list first.
for key, group in itertools.groupby(sorted(toorder)):
    print(key)

Should output:

a
b
c
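The sorted() call matters here: itertools.groupby only groups consecutive equal items. The short comparison below is a sketch using the same toorder list; unsorted_keys and sorted_keys are illustrative names.

```python
import itertools

toorder = ["a", "a", "b", "a", "b", "c"]

# Without sorting, only runs of adjacent duplicates are collapsed.
unsorted_keys = [key for key, _group in itertools.groupby(toorder)]
print(unsorted_keys)  # ['a', 'b', 'a', 'b', 'c']

# After sorting, equal items are adjacent, so each key appears once.
sorted_keys = [key for key, _group in itertools.groupby(sorted(toorder))]
print(sorted_keys)  # ['a', 'b', 'c']
```

Note that sorting changes the order of the entries, so this approach does not keep them in order of first appearance.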

Upvotes: 0

eumiro

Reputation: 212835

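seen = set()
# filename is the path to your input file.
with open(filename, 'r') as f:
    for line in f:
        if line not in seen:
            # Strip the trailing newline so print() does not add a blank line.
            print(line.rstrip('\n'))
            seen.add(line)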
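One edge case with comparing raw lines: if the last line of the file has no trailing newline, it will not compare equal to an earlier copy that does. A possible variation, sketched below, strips the newline before the membership test. The helper unique_lines is a hypothetical name, and io.StringIO is used only to simulate the file from the question.

```python
import io

def unique_lines(stream):
    """Return the distinct lines of stream in order of first appearance."""
    seen = set()
    result = []
    for line in stream:
        entry = line.rstrip("\n")  # normalise before comparing
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    return result

# Simulated file contents; note the final line has no trailing newline.
sample = io.StringIO("q12j4\nq12j4\nfj45j\nq12j4\nfjmep\nfj45j")
result = unique_lines(sample)
print(result)  # ['q12j4', 'fj45j', 'fjmep']
```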

Upvotes: 2
