Reputation: 1197
I have the sentence "The quick brown fox jumps over the lazy dog", and I have counted the number of times each word occurs in it. The output should look like this:
brown:1,dog:1,fox:1,jumps:1,lazy:1,over:1,quick:1,the:2
There should be no spaces between the characters in this output, and there should be commas between the words/numbers. The output from my program looks like this:
,brown:1,dog:1,fox:1,jumps:1,lazy:1,over:1,quick:1,the:2
There is an extra comma before 'brown'. Is there an easier way to print this?
import os
import sys

filename = os.path.basename(path)
with open(filename, 'r+') as f:
    fline = f.read()
fwords = fline.split()
allwords = [word.lower() for word in fwords]
sortwords = list(set(allwords))
r = sorted(sortwords, key=str.lower)
finalwords = ','.join(r)
sys.stdout.write(str(finalwords))
print '\n'
countlist = {}
for word in allwords:
    try: countlist[word] += 1
    except KeyError: countlist[word] = 1
for c,num in sorted(countlist.items()):
    sys.stdout.write(",{:}:{:}".format(c, num))
Upvotes: 0
Views: 122
Reputation: 23251
A couple of alternative ways of building the word counts. First, a one-liner:
countlist = {word:allwords.count(word) for word in allwords}
As pointed out by DSM, that method can be slow with long lists, because allwords.count(word) rescans the whole list for every word. An alternative is to use defaultdict:
from collections import defaultdict
countlist = defaultdict(int)
for word in allwords:
    countlist[word] += 1
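Another standard-library option, not part of the approach above, is collections.Counter, which does the same tallying in one call. A minimal sketch, assuming the sample sentence from the question stands in for the file contents:

```python
from collections import Counter

# Sample input taken from the question; in the real program this
# would be the text read from the file.
sentence = "The quick brown fox jumps over the lazy dog"
allwords = [word.lower() for word in sentence.split()]

# Counter tallies each distinct word in a single pass over the list.
countlist = Counter(allwords)
```

Counter is a dict subclass, so countlist.items() can be used the same way as with a plain dict or a defaultdict.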
For output, join the individual word counts with a comma, which avoids having one at the beginning. Sorting the items keeps the words in alphabetical order, as in the expected output:
sys.stdout.write(",".join(["{:}:{:}".format(key, value) for key, value in sorted(countlist.items())]))
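Putting the counting and the join together, the whole task might be sketched as below. The function name format_counts is hypothetical, introduced only for illustration, and the input is the sample sentence from the question rather than a file:

```python
from collections import defaultdict


def format_counts(text):
    """Return 'word:count' pairs, sorted by word and joined with commas."""
    counts = defaultdict(int)
    for word in text.lower().split():
        counts[word] += 1
    # str.join puts the separator only between items, so there is
    # no leading or trailing comma to strip afterwards.
    return ",".join("{}:{}".format(word, n) for word, n in sorted(counts.items()))


print(format_counts("The quick brown fox jumps over the lazy dog"))
# prints: brown:1,dog:1,fox:1,jumps:1,lazy:1,over:1,quick:1,the:2
```

Building the pieces first and joining them once avoids the stray comma that comes from writing the separator in front of every item inside the loop.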
Upvotes: 1