Chris

Reputation: 57

Grouping items in Python dictionary, then singling out those items within the groups

I'm not even sure I am wording this correctly, but I'm having a hard time wrapping my head around this. I have a data set of groups, descriptions, individuals and numbers. Some individuals can be in different groups. Some can have the same description. An example may look like this:

GROUP A       DESCRIPTION A       PERSON A       NUMBER
GROUP A       DESCRIPTION A       PERSON B       NUMBER
GROUP B       DESCRIPTION A       PERSON C       NUMBER
GROUP C       DESCRIPTION B       PERSON A       NUMBER

What I am trying to do is compute each person's share of the total for their group/description. First, I loop through the data and append each row to a list. I then use that list to build a defaultdict keyed on (group, description).

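from collections import defaultdict

l = []
for row in data:
    # assumes each row holds these four fields in this order
    group, description, person, number = row
    l.append([group, description, person, number])

d = defaultdict(float)
for item in l:
    # key on (group, description); item[3] is the number
    d[item[0], item[1]] += item[3]

for k, v in d.items():
    print(k, v)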

>>(group, description) (sum of numbers)

What I need to do from here is where I get confused. Here's an actual example I am using:

GROUP A       DESCRIPTION A       PERSON A       1.14
GROUP A       DESCRIPTION A       PERSON B       1.14
GROUP A       DESCRIPTION A       PERSON C       0.36
GROUP A       DESCRIPTION A       PERSON D       1.07

So I get the sum of those figures, 3.71. My next step is to take a single person in that group and divide their number by the group total. Using PERSON C from the group above, I would get 0.36/3.71 = 0.097. I am not sure how to put this into my code. It seems like it shouldn't be difficult, but I'm just not seeing it. I have several other steps after this, but I think once I know how to obtain this particular percentage, I can figure the rest out.

Upvotes: 0

Views: 136

Answers (2)

theodox

Reputation: 12208

from collections import namedtuple
personEntry = namedtuple('entry', ['group', 'description', 'person', 'data'])

# allEntries is a list of personEntry tuples
groupSum = lambda groupKey: sum(i.data for i in allEntries if i.group == groupKey)

# total for each group, keyed by the group names used in the question
groupTotals = {}
for key in ['GROUP A', 'GROUP B', 'GROUP C']:
    groupTotals[key] = groupSum(key)

# a person's number divided by the total of their group
percentage = lambda entry: entry.data / groupTotals[entry.group]

for eachEntry in allEntries:
    print(eachEntry.person, percentage(eachEntry))
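
For anyone who wants to run this end to end, here is a minimal Python 3 sketch of the same namedtuple idea. It assumes the four sample rows from the question as input. The names PersonEntry, all_entries and group_totals are illustrative, and the group keys are collected from the data rather than hard-coded.

```python
from collections import namedtuple

PersonEntry = namedtuple('PersonEntry', ['group', 'description', 'person', 'data'])

# The four sample rows from the question.
all_entries = [
    PersonEntry('GROUP A', 'DESCRIPTION A', 'PERSON A', 1.14),
    PersonEntry('GROUP A', 'DESCRIPTION A', 'PERSON B', 1.14),
    PersonEntry('GROUP A', 'DESCRIPTION A', 'PERSON C', 0.36),
    PersonEntry('GROUP A', 'DESCRIPTION A', 'PERSON D', 1.07),
]

# Accumulate one total per group, discovering the keys from the data.
group_totals = {}
for entry in all_entries:
    group_totals[entry.group] = group_totals.get(entry.group, 0.0) + entry.data


def percentage(entry):
    """Return the entry's number divided by the total of its group."""
    return entry.data / group_totals[entry.group]


for entry in all_entries:
    print(entry.person, round(percentage(entry), 3))
```

Note that this keys on the group alone. If the description should also be part of the grouping, as the question suggests, the dictionary key could be the tuple (entry.group, entry.description) instead.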

Upvotes: 0

DhruvPathak

Reputation: 43235

data = [
    ['GROUP A', 'DESCRIPTION A', 'PERSON A', 1.14],
    ['GROUP A', 'DESCRIPTION A', 'PERSON B', 1.14],
    ['GROUP A', 'DESCRIPTION A', 'PERSON C', 0.36],
    ['GROUP A', 'DESCRIPTION A', 'PERSON D', 1.07],
]


total_score = sum([x[3] for x in data])
target_person = 'PERSON C'
the_score = [ x[3]/total_score for x in data if x[2] == target_person]
print(the_score)
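
The snippet above sums every row, so it only works when the data holds a single group. A minimal sketch of how the same division might be extended to every (group, description) pair follows, assuming rows shaped like the question's sample. The names rows, totals and shares are illustrative, and the GROUP B row uses a made-up value purely to show a second group.

```python
from collections import defaultdict

# Sample rows from the question, plus one hypothetical GROUP B row
# (its value 2.0 is made up purely to show a second group).
rows = [
    ('GROUP A', 'DESCRIPTION A', 'PERSON A', 1.14),
    ('GROUP A', 'DESCRIPTION A', 'PERSON B', 1.14),
    ('GROUP A', 'DESCRIPTION A', 'PERSON C', 0.36),
    ('GROUP A', 'DESCRIPTION A', 'PERSON D', 1.07),
    ('GROUP B', 'DESCRIPTION A', 'PERSON C', 2.0),
]

# First pass: total per (group, description).
totals = defaultdict(float)
for group, description, person, number in rows:
    totals[group, description] += number

# Second pass: each person's share of their own group's total.
shares = {}
for group, description, person, number in rows:
    shares[group, description, person] = number / totals[group, description]

for key, share in shares.items():
    print(key, round(share, 3))
```

Two passes are needed because a person's share cannot be computed until the full total for their group is known. A group whose numbers sum to zero would raise ZeroDivisionError, so real data may need a guard for that case.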

Upvotes: 1
