Reputation:
I'm creating a program that reads through a .txt file of names (lastname,firstname), one per line, and creates a dictionary that shows the number of times a specific first name repeats.
I have the following code so far, but it doesn't accurately count the number of times a first name repeats. I think the problem is that my variable "value" doesn't correspond to the actual value in the key-value pair. How can I fix that?
file = open('names.txt')
dict = {}
value = 1
for line in file:
    listOfNames = line.split(",")
    firstName = listOfNames[1]
    if dict.has_key(firstName):
        value += 1
    else:
        dict[firstName] = value
file.close()
Upvotes: 4
Views: 295
Reputation: 91132
import collections

with open('names.txt') as f:
    # Each line is "lastname,firstname", so index 1 is the first name.
    firstNames = [line.split(',')[1].strip() for line in f]
print(collections.Counter(firstNames))
Upvotes: 2
Reputation: 16080
Use a defaultdict like this:
from collections import defaultdict
d = defaultdict(int)
for name in open('names.txt'):
    _, first_name = name.split(",")
    d[first_name] += 1
You may want to normalize the names first by stripping whitespace (each line read from the file still ends with a newline) and standardizing capitalization.
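A minimal sketch of that normalization, assuming every non-blank line has the form lastname,firstname and that title-casing is an acceptable way to unify capitalization. The helper name count_first_names and the sample lines are made up for illustration.

```python
from collections import defaultdict

def count_first_names(lines):
    """Count first names in 'lastname,firstname' lines, ignoring case and stray whitespace."""
    counts = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first comma only; everything after it is the first name.
        _, first_name = line.split(",", 1)
        counts[first_name.strip().title()] += 1
    return counts

# Hypothetical sample data standing in for the contents of names.txt
sample = ["Smith,john\n", "Doe, John\n", "Brown,ALICE\n", "\n"]
print(dict(count_first_names(sample)))
```

A line with no comma would raise a ValueError here, so real input may need an extra check.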
Upvotes: 2
Reputation: 38980
As @Aurora mentions, Counter is perfect for this.
>>> names = ['foo bar', 'foo baz', 'foo car', 'doo bar', 'doo baz', 'boo paz']
>>> from collections import Counter
>>> Counter(name.split()[1] for name in names)
Counter({'baz': 2, 'bar': 2, 'paz': 1, 'car': 1})
Upvotes: 2
Reputation: 226564
You can replace the if-block with:
dict[firstName] = dict.get(firstName, 0) + 1
Alternatively, you can use collections.Counter instead of a dict. That simplifies the counting code to just:
c[firstName] += 1
where c is a Counter instance.
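To show that the two approaches give the same counts, here is a small side-by-side sketch. The list first_names is hypothetical and stands in for the names extracted inside the question's loop.

```python
from collections import Counter

# Hypothetical first names, as they would come out of the question's loop
first_names = ["John", "Alice", "John", "Bob", "Alice", "John"]

# Plain dict: get() returns 0 for a name that has not been seen yet
counts = {}
for first_name in first_names:
    counts[first_name] = counts.get(first_name, 0) + 1

# Counter: missing keys are treated as 0, so += 1 works directly
c = Counter()
for first_name in first_names:
    c[first_name] += 1

print(counts == dict(c))
```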
Upvotes: 2
Reputation: 4414
You may be interested in collections.Counter, which is a specialized dictionary for exactly this kind of task.
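As a rough illustration, a Counter can be built straight from the lines of the file and then queried for the most frequent names. In this sketch io.StringIO stands in for open('names.txt'), and its contents are invented.

```python
import io
from collections import Counter

# io.StringIO stands in for open('names.txt'); the contents are made up
fake_file = io.StringIO("Smith,John\nDoe,Jane\nBrown,John\n")

# Each line is "lastname,firstname"; index 1 after the split is the first name
first_name_counts = Counter(line.strip().split(",")[1] for line in fake_file)

# most_common(n) returns the n most frequent names with their counts
top = first_name_counts.most_common(1)
print(top)
```

Looking up a name that never appeared returns 0 instead of raising a KeyError.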
Upvotes: 6
Reputation: 994101
It looks like you want something like:
if dict.has_key(firstName):
    dict[firstName] += 1
else:
    dict[firstName] = 1
Also, I would strongly recommend you choose a name other than dict, such as names. The reason is that dict is the name of the standard Python dictionary type (just like you usually don't want to create Python variables called str, int, or list).
There are other solutions such as using collections.defaultdict that will be more succinct.
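Note that dict.has_key() was removed in Python 3, where the in operator performs the same membership test. A sketch of the same counting logic under that assumption, with the dictionary renamed to names; the sample lines are hypothetical and stand in for the contents of names.txt.

```python
# Hypothetical lines standing in for the contents of names.txt
lines = ["Smith,John\n", "Doe,Jane\n", "Brown,John\n"]

names = {}  # named "names" rather than "dict" to avoid shadowing the built-in
for line in lines:
    # strip() removes the trailing newline before splitting on the comma
    first_name = line.strip().split(",")[1]
    if first_name in names:      # "in" replaces dict.has_key()
        names[first_name] += 1
    else:
        names[first_name] = 1

print(names)
```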
Upvotes: 2