Reputation: 141
I have a data.txt file like this:
{'wood', 'iron', 'gold', 'silver'}
{'tungsten', 'iron', 'gold', 'timber'}
I want to get two types of result, like below:
#FIRST TYPE: sorted by item
gold: 33.3%
iron: 33.3%
silver: 16.7%
timber: 16.7%
tungsten: 16.7%
#SECOND TYPE: sorted by percentage
silver: 16.7%
timber: 16.7%
tungsten: 16.7%
gold: 33.3%
iron: 33.3%
Here is my code for this result:
import collections
counter = collections.Counter()
keywords = []
with open("data.txt") as f:
    for line in f:
        if line.strip():
            for keyword in line.split(','):
                keywords.append(keyword.strip())
counter.update(keywords)
for key in counter:
    print "%s: %.1f%s" %(key, (counter[key]*1.0 / len(counter))*100, '%')
However, my result looks like this:
'silver'}: 16.7%
'iron': 33.3%
....
I want to get rid of the curly brackets and apostrophes in the result.
How do I change or rewrite my code to get the result I want? Thanks in advance for your help!
Upvotes: 1
Views: 447
Reputation: 101959
Dictionaries, Counters and sets are not ordered. You must first convert the items to a list and sort that list.
For example:
for key, val in sorted(counter.items()):  # or with key=lambda x: x[0]
    print "%s: %.1f%s" % (key, float(val) * 100 / len(counter), "%")
Prints the values sorted by key, while:
for key, val in sorted(counter.items(), key=lambda x: (x[1], x[0])):
    print "%s: %.1f%s" % (key, float(val) * 100 / len(counter), "%")
Sorts them by percentage (if two items have the same percentage, they are also sorted by name).
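As a self-contained illustration of the two sort orders, here is a minimal sketch. It is written in Python 3 syntax (the snippets above use Python 2 print statements), the counts are hard-coded from the question's sample data, and the helper name format_rows is made up for this example. The percentage follows the question's convention of dividing by the number of distinct items.

```python
from collections import Counter

# Counts taken from the question's two sample lines (hard-coded for illustration).
counter = Counter({"wood": 1, "iron": 2, "gold": 2,
                   "silver": 1, "tungsten": 1, "timber": 1})


def format_rows(items, total):
    """Hypothetical helper: render (key, count) pairs as 'key: 12.3%' strings."""
    return ["%s: %.1f%%" % (key, count * 100.0 / total) for key, count in items]


total = len(counter)  # number of distinct items, as in the question

# Tuples compare element by element, so plain sorted() orders by key first.
by_name = format_rows(sorted(counter.items()), total)

# Sort by count first, then by name to break ties between equal counts.
by_percentage = format_rows(
    sorted(counter.items(), key=lambda kv: (kv[1], kv[0])), total)

print("\n".join(by_name))
print()
print("\n".join(by_percentage))
```

Passing reverse=True to sorted() would put the largest percentages first instead.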
Update
Regarding your parsing problem, you also have to strip the { and }:
for line in f:
    if line.strip():
        for keyword in line.strip().strip('{}').split(','):
            # strip the surrounding whitespace first, then the quotes
            keyword = keyword.strip().strip("'")
            keywords.append(keyword)
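To see what each stripping step does on one line of the sample data, here is a small sketch in Python 3 syntax. The function name parse_line is hypothetical and only exists for this example; it assumes every line has the shape shown in the question.

```python
def parse_line(line):
    """Hypothetical helper: turn "{'a', 'b'}" into ['a', 'b'] using only str methods."""
    # Drop the trailing newline and whitespace, then the curly braces.
    inner = line.strip().strip("{}")
    # Each piece still carries spaces and quotes, e.g. " 'iron'".
    # Whitespace must go first: strip("'") alone stops at the leading space.
    return [piece.strip().strip("'") for piece in inner.split(",")]


keywords = parse_line("{'wood', 'iron', 'gold', 'silver'}\n")
print(keywords)
```

The resulting list can be passed straight to counter.update().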
If you are using a recent Python version you can use ast.literal_eval instead (set literals such as {'a', 'b'} are accepted from Python 3.2 on):
import ast
...
for line in f:
    stripped = line.strip()
    if stripped:
        for keyword in ast.literal_eval(stripped):
            keywords.append(keyword)
Note however that this will remove duplicate keys on the same line! (From your example this seems okay...)
Otherwise you could do:
import ast
...
for line in f:
    stripped = line.strip()
    if stripped:
        for keyword in ast.literal_eval('[' + stripped[1:-1] + ']'):
            keywords.append(keyword)
Which will preserve duplicates.
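The difference between the two variants can be checked on a single line. This is a minimal sketch assuming Python 3.2 or later (where ast.literal_eval accepts set literals); the input line with a repeated item is made up for the demonstration.

```python
import ast

# Made-up input line that repeats 'iron' (not from the question's data).
line = "{'iron', 'gold', 'iron'}"

# Parsed as a set literal: the repeated 'iron' collapses into one element.
as_set = ast.literal_eval(line)

# Swap the braces for brackets so the same text parses as a list literal,
# which keeps every occurrence and the original order.
as_list = ast.literal_eval("[" + line[1:-1] + "]")

print(sorted(as_set))
print(as_list)
```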
Upvotes: 2
Reputation: 19416
The reason for the stray { and } is that you are not getting rid of them.
To do that just change your for loop to something like:
for line in f:
    line = line.strip().strip('{}')  # get rid of curly braces
    if line:
        ....
As far as printing is concerned:
print "Sorted by Percentage"
for k, v in sorted(c.items(), key=lambda x: x[1]):
    print '{0}: {1:.2%}'.format(k, float(v)/len(c))
print
print "Sorted by Name"
for k, v in sorted(c.items(), key=lambda x: x[0]):
    print '{0}: {1:.2%}'.format(k, float(v)/len(c))
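A note on the % format type used above: it multiplies the value by 100 and appends the percent sign itself, so the ratio is passed in unscaled. The digit after the dot sets the number of decimals, so .2% gives two decimals while the question's expected output has one. A quick sketch (Python 3 syntax, with a hard-coded ratio for illustration):

```python
# 2 occurrences out of 6 distinct items, as for 'gold' in the question.
ratio = 2 / 6.0

two_decimals = "{0}: {1:.2%}".format("gold", ratio)
one_decimal = "{0}: {1:.1%}".format("gold", ratio)

print(two_decimals)  # gold: 33.33%
print(one_decimal)   # gold: 33.3%
```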
Upvotes: 1
Reputation: 250951
Use sorted to sort the items based on keys/percentage, because dicts don't have any order.
from collections import Counter
counter = Counter()
with open("abc") as f:
    for line in f:
        # strip {} and the newline, then split the line at ", "
        line = line.strip("{}\n").split(", ")
        # strip the single quotes around each item before counting
        counter += Counter(x.strip("'") for x in line)
le = len(counter)
for key, val in sorted(counter.items()):
    print "%s: %.1f%s" % (key, (val*1.0 / le)*100, '%')
print
for key, val in sorted(counter.items(), key=lambda x: (x[1], x[0])):
    print "%s: %.1f%s" % (key, (val*1.0 / le)*100, '%')
output:
gold: 33.3%
iron: 33.3%
silver: 16.7%
timber: 16.7%
tungsten: 16.7%
wood: 16.7%

silver: 16.7%
timber: 16.7%
tungsten: 16.7%
wood: 16.7%
gold: 33.3%
iron: 33.3%
Upvotes: 1