user569548

Reputation: 61

Show tally after removing duplicates

I'm not sure if I should just break out a database for this, but it would be interesting to see another solution to this problem.

I have some lines of text in a text file like...

Bill
Bill
Pete
Mary
Mary
Mary

I didn't want duplicates and achieved it like so...

f = open('cgi/log/ipAddressList.log', 'r')
uniquelines = set(f.read().split("\n"))
for line in uniquelines:
    print line 

f.close()

Which gives me...

Bill
Mary 
Pete

But now I would like to tally how many times each name appears in the text file, like...

Bill (2)
Mary (3)
Pete (1)

Is there any kind of Python magic that would do this? Thanks in advance.

Edit: Cool, I looked into collections and came up with,

import collections

f = open('cgi/log/ipAddressList.log', 'r')
c = collections.Counter(f.read().split("\n"))
uniquelines = set(c)

for line in uniquelines:
    # format the name once, followed by its count
    print '%s (%d)' % (line, c[line])

f.close()

Just noticed the new comment about the readlines() so thanks for that too.

Here's my dictionary solution...

f = open('cgi/log/ipAddressList.log', 'r')
l = f.readlines()  # already a list; each entry keeps its trailing "\n"
f.close()
d = {}

for i in set(l):
    d[i] = l.count(i)

print d

Upvotes: 1

Views: 353

Answers (2)

juliomalegria

Reputation: 24921

When you think about counting in Python, you should (almost) always be thinking about dictionaries. Here's a possible solution:

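people = {}
for person in f:  # f is the open file object from the question
    person = person.strip()  # drop the trailing newline
    people[person] = people.get(person, 0) + 1
for person in people:
    print '%s (%d)' % (person, people[person])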

You probably won't need this here, but it's better to use f.readlines() instead of doing the splitting yourself (f.read().split("\n")).
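For reference, here is a minimal sketch of the dictionary approach as a self-contained Python 3 snippet. The function name tally_lines and the io.StringIO sample data are assumptions for illustration, standing in for the real log file. It strips the trailing newline that iterating over a file (or calling readlines()) leaves on each line, and skips blank lines.

```python
import io


def tally_lines(lines):
    """Count occurrences of each non-empty line using a plain dict."""
    counts = {}
    for line in lines:
        name = line.rstrip("\n")  # file iteration keeps the trailing newline
        if not name:
            continue  # ignore blank lines
        counts[name] = counts.get(name, 0) + 1
    return counts


# io.StringIO is a stand-in for the opened log file (hypothetical sample data)
sample = io.StringIO("Bill\nBill\nPete\nMary\nMary\nMary\n")
people = tally_lines(sample)
for person in sorted(people):
    print("%s (%d)" % (person, people[person]))
```

Sorting the keys is only there to make the output order predictable; drop it if the order does not matter.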

Upvotes: 0

Rik Poggi

Reputation: 29302

collections.Counter might do what you're looking for.

Example:

>>> from collections import Counter
>>> lst = ['Bill', 'Bill', 'Pete', 'Mary', 'Pete']
>>> c = Counter(lst)
>>> c
Counter({'Pete': 2, 'Bill': 2, 'Mary': 1})
>>> for k,v in c.items():
...     print(k,v)
...
Pete 2
Bill 2
Mary 1

You can apply this to your case with:

Counter(f.read().split("\n"))
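A sketch of how that could look end to end, in Python 3, producing the "Name (count)" format from the question. The helper format_tally and the sample text are hypothetical. It uses splitlines() rather than split("\n"), since split("\n") on text ending in a newline yields a final empty string that would otherwise be counted as an entry.

```python
from collections import Counter


def format_tally(text):
    """Return 'name (count)' strings for each distinct non-empty line, sorted by name."""
    counts = Counter(line for line in text.splitlines() if line)
    return ["%s (%d)" % (name, n) for name, n in sorted(counts.items())]


# Hypothetical sample standing in for the contents of the log file
sample_text = "Bill\nBill\nPete\nMary\nMary\nMary\n"
for row in format_tally(sample_text):
    print(row)
# Bill (2)
# Mary (3)
# Pete (1)
```

With the real log, the text passed in would come from f.read(). If you want the most frequent names first, Counter.most_common() returns the (name, count) pairs ordered by count.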

Upvotes: 3
