Reputation: 57
I'm not even sure I am wording this correctly, but I'm having a hard time wrapping my head around this. I have a data set of groups, descriptions, individuals and numbers. Some individuals can be in different groups. Some can have the same description. An example may look like this:
GROUP A DESCRIPTION A PERSON A NUMBER
GROUP A DESCRIPTION A PERSON B NUMBER
GROUP B DESCRIPTION A PERSON C NUMBER
GROUP C DESCRIPTION B PERSON A NUMBER
What I am attempting to accomplish is getting a certain percentage for each person in a group/description. So first, I loop through the data and add to an array. I then use that to create a defaultdict.
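from collections import defaultdict

l = []
for row in data:
    # group, description, person and number are the fields read from each row
    l.append([group, description, person, number])
d = defaultdict(int)
for item in l:
    # key on (group, description); item[3] is the number, item[2] is the person
    d[item[0], item[1]] += item[3]
for k, v in d.iteritems():
    print k, v
>>(group, description) (sum of numbers)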
What I need to do from here is where I get confused. Here's an actual example I am using:
GROUP A DESCRIPTION A PERSON A 1.14
GROUP A DESCRIPTION A PERSON B 1.14
GROUP A DESCRIPTION A PERSON C 0.36
GROUP A DESCRIPTION A PERSON D 1.07
So I get the sum of those figures, 3.71. My next step is to take a single person in that group and divide their number by the group's total. Using PERSON C from the group above as an example, I would get 0.36/3.71 = 0.097. I am not sure how to put this into my code. It seems like it shouldn't be difficult at all, but I'm just not seeing it. I have several other steps after this, but I think once I know how to obtain this particular percentage, I can figure the rest out.
Upvotes: 0
Views: 136
Reputation: 12208
from collections import namedtuple
personEntry = namedtuple('entry', ['group', 'description', 'person', 'data'])
# allEntries is a list of personEntry tuples
groupSum = lambda groupKey: sum([i.data for i in allEntries if i.group == groupKey])
groupTotals = {}
# the keys must match the values stored in each entry's group field
for key in ['Group A', 'Group B', 'Group C']:
    groupTotals[key] = groupSum(key)
percentage = lambda entry: entry.data / groupTotals[entry.group]
for eachEntry in allEntries:
    print eachEntry.person, percentage(eachEntry)
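The same idea can be sketched as a self-contained script that keys the totals on (group, description), as the question asks, rather than on group alone. This is only a sketch under assumptions: it is written for Python 3, the names Entry, all_entries, totals and shares are illustrative, and the GROUP B row is a hypothetical addition to the question's four sample rows, included only to show that totals stay separate per pair.

```python
from collections import defaultdict, namedtuple

# Field names follow the namedtuple idea; "Entry" itself is an illustrative name.
Entry = namedtuple('Entry', ['group', 'description', 'person', 'data'])

all_entries = [
    Entry('GROUP A', 'DESCRIPTION A', 'PERSON A', 1.14),
    Entry('GROUP A', 'DESCRIPTION A', 'PERSON B', 1.14),
    Entry('GROUP A', 'DESCRIPTION A', 'PERSON C', 0.36),
    Entry('GROUP A', 'DESCRIPTION A', 'PERSON D', 1.07),
    Entry('GROUP B', 'DESCRIPTION A', 'PERSON C', 2.0),  # hypothetical row
]

# First pass: one running total per (group, description) pair.
totals = defaultdict(float)
for e in all_entries:
    totals[e.group, e.description] += e.data

# Second pass: each entry's share of its own pair's total.
shares = {}
for e in all_entries:
    shares[e.group, e.description, e.person] = e.data / totals[e.group, e.description]

for key in sorted(shares):
    print(key, round(shares[key], 3))
```

Computing the totals in a single pass avoids rescanning the whole list once per group, which matters only if the data set is large.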
Upvotes: 0
Reputation: 43235
data = [
['GROUP A' , 'DESCRIPTION A' , 'PERSON A' , 1.14],
['GROUP A' , 'DESCRIPTION A', 'PERSON B', 1.14],
['GROUP A' , 'DESCRIPTION A' , 'PERSON C', 0.36],
['GROUP A' , 'DESCRIPTION A' , 'PERSON D', 1.07],
]
total_score = sum([x[3] for x in data])
target_person = 'PERSON C'
the_score = [ x[3]/total_score for x in data if x[2] == target_person]
print(the_score)
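Note that total_score above sums every row, which gives the right answer only because all four rows belong to the same group and description. A possible variation, sketched here under assumptions, filters the rows first. The function name share_within_group and the extra GROUP B row are hypothetical, and the code is written for Python 3.

```python
def share_within_group(rows, group, description, person):
    """Return the person's number(s) divided by the (group, description) total."""
    # Keep only the rows that belong to the requested group and description.
    members = [r for r in rows if r[0] == group and r[1] == description]
    total = sum(r[3] for r in members)
    # A list is returned, mirroring the_score, in case the person appears twice.
    return [r[3] / total for r in members if r[2] == person]


rows = [
    ['GROUP A', 'DESCRIPTION A', 'PERSON A', 1.14],
    ['GROUP A', 'DESCRIPTION A', 'PERSON B', 1.14],
    ['GROUP A', 'DESCRIPTION A', 'PERSON C', 0.36],
    ['GROUP A', 'DESCRIPTION A', 'PERSON D', 1.07],
    ['GROUP B', 'DESCRIPTION A', 'PERSON C', 2.0],  # hypothetical row
]

result = share_within_group(rows, 'GROUP A', 'DESCRIPTION A', 'PERSON C')
print(result)
```

An empty list comes back when no row matches the requested person, so the caller can tell "not in this group" apart from a share of zero.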
Upvotes: 1