Craig Ellis

Reputation: 35

Python list of lists manipulation

I have a Python problem that can be solved with multiple nested for loops, but I was wondering whether there is an easier way, perhaps by adding list items together and dropping duplicates.

My list looks like this:

main_list = [["[email protected]", "Administration", "100"],
             ["[email protected]", "Testing", "30"],
             ["[email protected]", "Development", "45"],
             ["[email protected]", "Development", "90"],
             ["[email protected]", "Development", "35"],
             ["[email protected]", "Development", "400"],
             ["[email protected]", "Administration", "95"],
             ["[email protected]", "Testing", "200"]]

I need to group the rows by email address and category (the first two list elements) and sum the third element across rows that share the same pair.

So [user2, development] goes from:

["[email protected]", "Development", "45"],
["[email protected]", "Development", "90"],
["[email protected]", "Development", "35"],

to:

["[email protected]", "Development", "170"]

Is this possible with list manipulation?

Thank you!

Upvotes: 2

Views: 117

Answers (4)

VegaS

Reputation: 1

Here is a step-by-step example.

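# Accumulate totals keyed by the (email, category) pair.
main_dict = {}
for email, category, value in main_list:
    token = (email, category)
    if token in main_dict:
        main_dict[token] += int(value)
    else:
        main_dict[token] = int(value)

# Convert back to a list of [email, category, total] rows.
main_list_converted = []
for k, v in main_dict.items():  # items() on Python 3; iteritems() is Python 2 only
    main_list_converted.append(list(k) + [v])

main_list_converted.sort()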

"""
for item in main_list_converted:
    print(item)

['[email protected]', 'Administration', 100]
['[email protected]', 'Development', 170]
['[email protected]', 'Testing', 30]
['[email protected]', 'Administration', 95]
['[email protected]', 'Development', 400]
['[email protected]', 'Testing', 200]
"""

Upvotes: 0

Alex

Reputation: 1126

With the pandas module:

import pandas as pd
out_d = (pd.DataFrame(main_list).set_index([0,1])[2].astype(int).groupby(level=[0,1]).sum()).to_dict()
out_d

Out[1]:
{('[email protected]', 'Administration'): 100,
 ('[email protected]', 'Development'): 170,
 ('[email protected]', 'Testing'): 30,
 ('[email protected]', 'Administration'): 95,
 ('[email protected]', 'Development'): 400,
 ('[email protected]', 'Testing'): 200}

# as a list of lists
[[u[0],u[1],v] for u,v in out_d.items()]

Out[2]:
[['[email protected]', 'Administration', 100],
 ['[email protected]', 'Development', 170],
 ['[email protected]', 'Testing', 30],
 ['[email protected]', 'Administration', 95],
 ['[email protected]', 'Development', 400],
 ['[email protected]', 'Testing', 200]]

Upvotes: 0

Mykola Zotko

Reputation: 17911

You can use itertools.groupby() on the sorted list:

from itertools import groupby
from operator import itemgetter

iget = itemgetter(0, 1)
[[*k, sum(int(i[2]) for i in g)] for k, g in groupby(sorted(main_list), key=iget)]

Result:

[['[email protected]', 'Administration', 100],
 ['[email protected]', 'Development', 170],
 ['[email protected]', 'Testing', 30],
 ['[email protected]', 'Administration', 95],
 ['[email protected]', 'Development', 400],
 ['[email protected]', 'Testing', 200]]
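
One caveat worth spelling out: groupby() only merges adjacent items that share a key, which is why the list is sorted before grouping. A minimal sketch of the difference follows. The addresses are placeholders and the helper name merge_rows is hypothetical; neither comes from the original post.

```python
from itertools import groupby
from operator import itemgetter

# Placeholder rows; the addresses are made up for illustration.
rows = [
    ["user1@example.com", "Development", "45"],
    ["user2@example.com", "Testing", "30"],
    ["user1@example.com", "Development", "90"],
]

key = itemgetter(0, 1)  # group on the (email, category) pair


def merge_rows(data, presort=True):
    """Sum the third column per (email, category) pair."""
    if presort:
        data = sorted(data, key=key)
    return [[*k, sum(int(r[2]) for r in g)] for k, g in groupby(data, key=key)]


merged = merge_rows(rows)
unsorted_merged = merge_rows(rows, presort=False)
print(merged)           # one row per pair
print(unsorted_merged)  # the user1 pair shows up twice, unmerged
```

Sorting by the grouping key alone is enough here; sorting the whole rows, as in the answer, also works because the key fields come first.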

Upvotes: 1

Rakesh

Reputation: 82815

You can use collections.defaultdict:

from collections import defaultdict


main_list = [["[email protected]", "Administration", "100"],
             ["[email protected]", "Testing", "30"],
             ["[email protected]", "Development", "45"],
             ["[email protected]", "Development", "90"],
             ["[email protected]", "Development", "35"],
             ["[email protected]", "Development", "400"],
             ["[email protected]", "Administration", "95"],
             ["[email protected]", "Testing", "200"]]
result = defaultdict(int)
for k, v, n in main_list:
    result[(k, v)] += int(n)
result = [list(k) + [v] for k, v in result.items()]
print(result)

Output:

[['[email protected]', 'Administration', 100],
 ['[email protected]', 'Testing', 30],
 ['[email protected]', 'Development', 170],
 ['[email protected]', 'Development', 400],
 ['[email protected]', 'Administration', 95],
 ['[email protected]', 'Testing', 200]]
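
The question's expected output keeps the total as a string ("170"), whereas the code above returns integers. If the string form matters, a small variation can convert the sums back. On Python 3.7+ a dict also keeps insertion order, which is why the output above follows the order in which each pair first appears. A sketch, assuming placeholder addresses and a hypothetical function name sum_by_pair:

```python
from collections import defaultdict


def sum_by_pair(rows):
    """Sum the third field per (email, category); return totals as strings."""
    totals = defaultdict(int)
    for email, category, value in rows:
        totals[(email, category)] += int(value)
    # dicts preserve insertion order (Python 3.7+), so rows come out
    # in the order each pair was first seen.
    return [[email, category, str(total)]
            for (email, category), total in totals.items()]


# Placeholder data; the addresses are made up for illustration.
sample = [
    ["user2@example.com", "Development", "45"],
    ["user1@example.com", "Testing", "30"],
    ["user2@example.com", "Development", "90"],
    ["user2@example.com", "Development", "35"],
]
result = sum_by_pair(sample)
print(result)
```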

Upvotes: 4
