Reputation: 347
I have a question about my homework problem. Here is the problem: Write a program which reads a text file called input.txt containing an arbitrary number of lines of the form "city, country", records this information using a dictionary, and finally prints to the screen a list of the countries represented in the file and the number of cities each one contains.
For example, if input.txt contained the following:
New York, US
Angers, France
Los Angeles, US
Pau, France
Dunkerque, France
Mecca, Saudi Arabia
The program would output the following (in some order):
Saudi Arabia : 1
US : 2
France : 3
Here is my code:
def addword(w,wcDict):
    if w in wcDict:
        wcDict[w] +=1
    else:
        wcDict[w]= 1
import string
def processLine(line, wcDict):
    wordlist= line.strip().split(",")
    for word in wordlist:
        word= word.lower().strip()
        word=word.strip(string.punctuation)
        addword(wordlist[1], wcDict)
def prettyprint(wcDict):
    valkeylist= [(val,key) for key,val in wcDict.items()]
    valkeylist.sort(reverse = True)
    for val,key in valkeylist:
        print '%-12s %3d'%(key,val)
def main():
    wcDict={}
    fobj= open('prob1.txt','r')
    for line in fobj:
        processLine(line, wcDict)
    prettyprint (wcDict)
main()
My code counts each country twice. Can you please help me?
Thank you
Upvotes: 4
Views: 177
Reputation: 1631
from collections import Counter

# Count the country (the text after the last comma) on every "city, country" line.
with open("file.txt") as f:
    counts = Counter(line.rsplit(",", 1)[1].strip() for line in f if "," in line)

for country, n in counts.items():
    print("%s : %d" % (country, n))
I hope this answers your question.
Upvotes: 0
Reputation: 49395
In the processLine function, you have an extraneous for loop. wordlist will always contain two entries, the city and the country, so the code inside your for loop (including the addword call) is executed twice, once per entry. You can delete the for loop entirely, keeping a single addword call, and it should work as you expect.
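To make that concrete, here is a minimal sketch of what processLine might look like with the loop removed. It rests on a few assumptions of mine that are not in the original code: the country is taken as the last comma-separated field, it is stripped of whitespace but not lowercased (so the output reads "US" rather than "us"), and blank or malformed lines are skipped. The sample list is only a stand-in for the lines of the input file.

```python
def addword(w, wcDict):
    """Increment the count for key w, starting at 1 if it is new."""
    if w in wcDict:
        wcDict[w] += 1
    else:
        wcDict[w] = 1


def processLine(line, wcDict):
    """Count the country on a single 'city, country' line."""
    wordlist = line.strip().split(",")
    if len(wordlist) < 2:
        return  # assumption: skip blank or malformed lines
    # Assumption: the country is the last field; strip the space after the comma.
    country = wordlist[-1].strip()
    addword(country, wcDict)  # called exactly once per line, no loop


# Stand-in for the file contents from the question.
sample = [
    "New York, US",
    "Angers, France",
    "Los Angeles, US",
    "Pau, France",
    "Dunkerque, France",
    "Mecca, Saudi Arabia",
]

counts = {}
for line in sample:
    processLine(line, counts)

for country, n in counts.items():
    print("%s : %d" % (country, n))
```

Note that in the original loop, the cleaned-up value assigned to word was never used, since addword was always passed wordlist[1] directly. That is why the stripping happens on the country itself here.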
Upvotes: 2