Group data by one field of a list

Question

I have input data like this:

input = ((1,'MCA', 'Science'),(2,'physic', 'Science'),(3,'chemsitry', 'Science'),(4,'punjabi', 'arts'),(5,'hindi', 'arts'))

I want to group this data by the third field (Science/arts), like this:

result = {"arts":[{"id":"4","name":"punjabi"},{"id":"5","name":"hindi"}],"Science":[{"id":"1","name":"MCA"},{"id":"2","name":"physics"},{"id":"3","name":"chemistry"}]}

How can I achieve this in an efficient way?

TigerhawkT3 · Accepted Answer

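I would recommend a collections.defaultdict(list). Iterate over your original data and, for each record, append a new dictionary to the list stored under that record's subject. A missing key starts out as an empty list, so there is no need to check whether the subject has been seen before, and the data is traversed only once.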

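import collections

# Any key that is missing starts out as an empty list.
result = collections.defaultdict(list)
data = ((1,'MCA', 'Science'),(2,'physic', 'Science'),(3,'chemsitry', 'Science'),(4,'punjabi', 'arts'),(5,'hindi', 'arts'))
for id_, name, subject in data:
    # Group on the third field; the id is stored as a string to match the desired output.
    result[subject].append({'id': str(id_), 'name': name})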

You end up with the following result:

>>> result
defaultdict(<class 'list'>, {'Science': [{'name': 'MCA', 'id': '1'}, {'name': 'physic', 'id': '2'}, {'name': 'chemsitry', 'id': '3'}], 'arts': [{'name': 'punjabi', 'id': '4'}, {'name': 'hindi', 'id': '5'}]})

It doesn't affect the algorithm, but remember to double-check the spelling of your data before putting it into the program (e.g. 'physic' and 'chemsitry').
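If you need a plain dict in the shape shown in the question (for example, to serialize it), the same loop can be wrapped in a function and the defaultdict converted at the end. This is only a minimal sketch: the function name group_by_subject and the variable names are hypothetical, and the JSON step is an assumption about how the result might be used.

```python
import collections
import json


def group_by_subject(rows):
    """Group (id, name, subject) tuples by subject in a single pass."""
    grouped = collections.defaultdict(list)
    for id_, name, subject in rows:
        grouped[subject].append({'id': str(id_), 'name': name})
    # Converting to dict drops the default-factory behaviour,
    # so looking up a missing subject raises KeyError again.
    return dict(grouped)


data = ((1, 'MCA', 'Science'), (2, 'physic', 'Science'), (3, 'chemsitry', 'Science'),
        (4, 'punjabi', 'arts'), (5, 'hindi', 'arts'))
result = group_by_subject(data)

# Serialize the grouped data; a plain dict of lists of dicts is JSON-compatible.
print(json.dumps(result))
```

Returning a plain dict is a design choice: it keeps later typos in a subject name from silently creating empty groups, which a defaultdict would do on lookup.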
