osman

Reputation: 43

Remove duplicates from one list while summing the corresponding values from another list

I could not find whether this has been asked before.

I have two lists:

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]

value_list = [5, 6, 7, 8, 14, 47]

I want to remove duplicates from string_list while summing, for each distinct string, the values at the matching positions in value_list.

Here is what I want to see:

string_result = ["o1", "o2", "o3"]

value_result = [20, 53, 14] 

Thanks for your help.

Upvotes: 0

Views: 94

Answers (7)

Eugene Yarmash

Reputation: 149823

You could use a defaultdict:

from collections import defaultdict

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]
value_list = [5, 6, 7, 8, 14, 47]

d = defaultdict(int)
for s, v in zip(string_list, value_list):
    d[s] += v

string_result = sorted(d)  # ["o1", "o2", "o3"]
value_result = [d[s] for s in string_result]  # [20, 53, 14]

Upvotes: 0

kederrac

Reputation: 17322

If the order is not important, you can use:

from itertools import groupby

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]
value_list = [5, 6, 7, 8, 14, 47]

string_list, value_list = [list(o) for o in zip(*{k : sum(e[1] for e in v) for k, v in groupby(sorted(zip(string_list, value_list)), key=lambda x: x[0])}.items())]
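The one-liner is dense, so here is the same idea spelled out step by step, as a sketch. The intermediate names (`pairs`, `totals`) are only illustrative. Note that `groupby` merges adjacent items only, which is why the pairs are sorted first; as a result the output is ordered by key rather than by first appearance.

```python
from itertools import groupby
from operator import itemgetter

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]
value_list = [5, 6, 7, 8, 14, 47]

# Sort the (string, value) pairs so equal strings sit next to each other;
# groupby only groups consecutive items that share a key.
pairs = sorted(zip(string_list, value_list), key=itemgetter(0))

# Sum the values within each run of identical strings.
totals = {
    key: sum(value for _, value in group)
    for key, group in groupby(pairs, key=itemgetter(0))
}

string_result = list(totals)
value_result = list(totals.values())
```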

Upvotes: 0

Maciek Runo

Reputation: 1

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]

value_list = [5, 6, 7, 8, 14, 47]

result = {k:0 for k in string_list}
for k, v in zip(string_list, value_list):
    result[k] += v

print(result)

Upvotes: 0

bexi

Reputation: 1216

Just two lines using numpy and a list comprehension. First we create the given data:

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]
value_list = [5, 6, 7, 8, 14, 47]

And now the solution, which uses boolean indexing in numpy arrays:

import numpy as np # ok, three lines if you count the import :)

string_result = sorted(set(string_list))  # sorted, because set iteration order is arbitrary
value_result = [sum(np.array(value_list)[(np.array(string_list) == val)]) for val in string_result]
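The list comprehension above rebuilds both arrays and rescans the strings once per distinct value. A possible alternative, offered as a sketch rather than a benchmarked recommendation, is to let `np.unique` label each string and `np.bincount` add up the values per label. The unique strings come back sorted, and `bincount` returns floats when `weights` is given, hence the cast back to int.

```python
import numpy as np

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]
value_list = [5, 6, 7, 8, 14, 47]

# uniques: sorted distinct strings; inverse: index into uniques for each item.
uniques, inverse = np.unique(string_list, return_inverse=True)

# Sum value_list per label; a weighted bincount yields float64, so cast back.
sums = np.bincount(inverse, weights=value_list).astype(int)

string_result = uniques.tolist()
value_result = sums.tolist()
```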

Upvotes: 0

Ron Serruya

Reputation: 4426

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]
value_list = [5, 6, 7, 8, 14, 47]

no_dups = list(set(string_list))
values = [0] * len(no_dups)

for string, number in zip(string_list, value_list):
    values[no_dups.index(string)] += number 

print(no_dups, values)

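Output (the element order can differ between runs, because sets are unordered; the sums stay paired with their strings): ['o1', 'o2', 'o3'] [20, 53, 14]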

Upvotes: 0

Robert Price

Reputation: 641

You can collect the totals in a dict, but using an OrderedDict will allow you to preserve the elements of string_list in order of first appearance.

from collections import OrderedDict

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]
value_list = [5, 6, 7, 8, 14, 47]

dict_result = OrderedDict()
for (string, value) in zip(string_list, value_list):
    dict_result[string] = dict_result.get(string, 0) + value

string_result = list(dict_result.keys())
value_result = list(dict_result.values())
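On Python 3.7 and later, a plain dict also keeps insertion order, so the same approach should work without the import. A minimal sketch, where the helper name `sum_by_key` is hypothetical and the sample input is chosen so that first-appearance order differs from sorted order:

```python
def sum_by_key(keys, values):
    """Return (unique_keys, totals), keeping keys in order of first appearance."""
    totals = {}  # plain dict: insertion-ordered on Python 3.7+
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0) + value
    return list(totals), list(totals.values())


# "b" appears before "a", so it stays first in the result.
names, sums = sum_by_key(["b", "a", "b"], [1, 2, 3])
print(names, sums)  # ['b', 'a'] [4, 2]
```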

Upvotes: 0

Kostas Charitidis

Reputation: 3103

string_list = ["o1", "o2", "o1", "o1", "o3", "o2"]
value_list = [5, 6, 7, 8, 14, 47]

dct = {}
for s, v in zip(string_list, value_list):
    if s in dct:  # string already seen: add to its running total
        dct[s] += v
    else:  # first occurrence: start the total
        dct[s] = v

Upvotes: 1
