user3758443

Reputation: 149

Python: Adding list values to each other in a list of lists

I have a list of lists like this:

[[12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.15, 0.1, 0.15, 0.2, 0.1, 0.15, 0.15, 0.15, 0.15], [12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], ...]

If the first and second elements of an inner list are the same as the first and second elements of another inner list (as in the example above), I want a function that adds up the remaining values and merges the two into one list. The example output would be:

[12411.0, 31937, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.25, 0.2, 0.25, 0.3, 0.2, 0.25, 0.25, 0.25, 0.25]

I'm having trouble working out how to tell Python to first recognize and compare the first two elements of each list before merging them together. Here is my best attempt so far:

def group(A):
    for i in range(len(A)):
        for j in range(len(A[i])):
            if A[i][0:1] == A[i: ][0:1]:
                return [A[i][0], A[i][1], sum(A[i][j+2], A[i: ][j+2])]

I get an index error, I believe because of the A[i: ] and A[i: ][j+2] parts of the code. I don't know how to express in Python that the function should add up any other lists that meet the criteria.

Upvotes: 0

Views: 2488

Answers (4)

Abhijit

Reputation: 63737

If you are fond of itertools, this can be solved with a little effort by combining groupby, islice, izip, imap and chain.

And of course you should also remember to use operator.itemgetter.

Implementation

# Python 2: izip and imap live in itertools
from itertools import groupby, islice, izip, imap, chain
from operator import itemgetter

# Create a group of lists where the key (the first two elements of the lists) matches
groups = groupby(sorted(l, key = itemgetter(0, 1)), key = itemgetter(0, 1))
# zip the lists and then chop off the first two elements. Sum the elements of the resultant list
# Remember to add the newly accumulated list with the first two elements
groups_sum = ([k, imap(sum, islice(izip(*g), 2, None))] for k, g in groups )
# Reformat the final list to match the output format
[list(chain.from_iterable(elem)) for elem in groups_sum]

Implementation (if you are a fan of one-liners)

[list(chain.from_iterable([k, imap(sum, islice(izip(*g), 2, None))]))
  for k, g in groupby(sorted(l, key = itemgetter(0, 1)), key = itemgetter(0, 1))]

Sample Input

l = [[10,20,0.1,0.2,0.3,0.4],
     [11,22,0.1,0.2,0.3,0.4],
     [10,20,0.1,0.2,0.3,0.4],
     [11,22,0.1,0.2,0.3,0.4],
     [20,30,0.1,0.2,0.3,0.4],
     [10,20,0.1,0.2,0.3,0.4]]

Sample Output

[[10, 20, 0.3, 0.6, 0.9, 1.2],
 [11, 22, 0.2, 0.4, 0.6, 0.8],
 [20, 30, 0.1, 0.2, 0.3, 0.4]]

Dissection

groups = groupby(sorted(l, key = itemgetter(0, 1)), key = itemgetter(0, 1))
# After grouping, lists with the same key get clustered together
[((10, 20),
  [[10, 20, 0.1, 0.2, 0.3, 0.4],
   [10, 20, 0.1, 0.2, 0.3, 0.4],
   [10, 20, 0.1, 0.2, 0.3, 0.4]]),
 ((11, 22), [[11, 22, 0.1, 0.2, 0.3, 0.4], [11, 22, 0.1, 0.2, 0.3, 0.4]]),
 ((20, 30), [[20, 30, 0.1, 0.2, 0.3, 0.4]])]

groups_sum = ([k, imap(sum, islice(izip(*g), 2, None))] for k, g in groups )
# Each group is summed column by column from the third element (index 2) onwards
[[(10, 20), [0.3, 0.6, 0.9, 1.2]],
 [(11, 22), [0.2, 0.4, 0.6, 0.8]],
 [(20, 30), [0.1, 0.2, 0.3, 0.4]]]

[list(chain.from_iterable(elem)) for elem in groups_sum]
# Now it's just a matter of flattening into the output format
[[10, 20, 0.3, 0.6, 0.9, 1.2],
 [11, 22, 0.2, 0.4, 0.6, 0.8],
 [20, 30, 0.1, 0.2, 0.3, 0.4]]
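
For readers on Python 3, where izip and imap no longer exist, the same sort-then-groupby idea might look roughly like the sketch below. The function name merge_by_key and the sample rows are my own, not part of the answer above, and integers are used only to keep the arithmetic easy to follow.

```python
from itertools import groupby
from operator import itemgetter

def merge_by_key(rows):
    """Sum columns 2+ of rows that share the same first two elements."""
    key = itemgetter(0, 1)
    merged = []
    # groupby only groups adjacent items, so sort by the same key first
    for k, group in groupby(sorted(rows, key=key), key=key):
        columns = list(zip(*group))              # transpose the rows of this group
        sums = [sum(col) for col in columns[2:]]  # skip the two key columns
        merged.append(list(k) + sums)
    return merged

rows = [[10, 20, 1, 2], [11, 22, 1, 2], [10, 20, 3, 4]]
print(merge_by_key(rows))  # [[10, 20, 4, 6], [11, 22, 1, 2]]
```

Note that, like the original, this returns the groups in sorted key order rather than in input order, and it assumes all rows in a group have the same length because zip truncates to the shortest one.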

Upvotes: 1

Gabriel

Reputation: 10884

This is a function that will take a list of lists A and check internal list i and j using your criteria. It will then either return the summed list you want or None if the first two elements don't match.

def check_internal_ij(A,i,j):
    """ checks internal list i against internal list j """ 
    if A[i][0:2] == A[j][0:2]:
        new = [x+y for x,y in zip( A[i], A[j] )]
        new[0:2] = A[i][0:2]
        return new
    else:
        return None

Then you can run the function over all combinations of internal lists you want to check.
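
One way to do that, as a rough sketch, is itertools.combinations over the indices. The sample list A and the pair_sums dict are made up for illustration, and the function is repeated only so the snippet runs on its own. Keep in mind this reports pairwise sums only; it does not collapse three or more matching lists into one.

```python
from itertools import combinations

def check_internal_ij(A, i, j):
    """Same logic as the function above, repeated so the sketch is self-contained."""
    if A[i][0:2] == A[j][0:2]:
        new = [x + y for x, y in zip(A[i], A[j])]
        new[0:2] = A[i][0:2]
        return new
    return None

A = [[1, 2, 10, 20], [3, 4, 1, 1], [1, 2, 5, 5]]

# Map each matching index pair (i, j) to its summed list
pair_sums = {}
for i, j in combinations(range(len(A)), 2):
    merged = check_internal_ij(A, i, j)
    if merged is not None:
        pair_sums[(i, j)] = merged

print(pair_sums)  # {(0, 2): [1, 2, 15, 25]}
```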

Upvotes: 1

shaktimaan

Reputation: 12092

This is one way to do it:

>>> a_list = [[12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.15, 0.1, 0.15, 0.2, 0.1, 0.15, 0.15, 0.15, 0.15], [12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]]
>>> result = [a + b for a, b in zip(*a_list)]
>>> result[:2] = a_list[0][:2]
>>> result
[12411.0, 31937.0, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.25, 0.2, 0.25, 0.30000000000000004, 0.2, 0.25, 0.25, 0.25, 0.25]

This works by blindly adding up corresponding elements of the two sub-lists (the a, b unpacking assumes there are exactly two) by doing:

[a + b for a, b in zip(*a_list)]

And then overwriting the first two elements of the result, which according to the question should not change, by doing:

result[:2] = a_list[0][:2]

It is not evident from your question what the behavior should be if the first two elements of the sub-lists do not match. The following snippet will help you check whether they match. Let's assume a_list contains sub-lists whose first two elements do not match:

>>> a_list = [[12411.0, 31937.0, 0.1, 0.1], [12411.3, 31937.0, 0.1, 0.1]]

then, this condition:

all([True if list(a)[1:] == list(a)[:-1] else False for a in list(zip(*a_list))[:2]])

will return False; otherwise it returns True. The code collects the first elements and the second elements of all the sub-lists and then checks that the values in each collection are equal.

You can include the above check in your code and modify your code accordingly for the expected behavior.

To sum it up:

a_list = [[12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.15, 0.1, 0.15, 0.2, 0.1, 0.15, 0.15, 0.15, 0.15], [12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]]
check = all([True if list(a)[1:] == list(a)[:-1] else False for a in list(zip(*a_list))[:2]])
result = []
if check:
    result = [a + b for a, b in zip(*a_list)]
    result[:2] = a_list[0][:2]
else:
    # whatever the behavior should be.
    pass
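
If there may be more than two sub-lists, a variation along these lines might be easier to read. This is only a sketch: same_key and merge_if_same_key are names I made up, and returning None on a mismatch is an assumption, since the question does not say what should happen in that case.

```python
def same_key(rows, key_len=2):
    """Return True if every row starts with the same key_len values."""
    if not rows:
        return True
    first = rows[0][:key_len]
    return all(row[:key_len] == first for row in rows)

def merge_if_same_key(rows, key_len=2):
    """Sum the rows column by column, keeping the shared key; None on mismatch."""
    if not rows or not same_key(rows, key_len):
        return None
    totals = [sum(col) for col in zip(*rows)]  # works for any number of rows
    totals[:key_len] = rows[0][:key_len]       # restore the unsummed key
    return totals

print(merge_if_same_key([[1, 2, 3, 4], [1, 2, 5, 6], [1, 2, 1, 1]]))  # [1, 2, 9, 11]
print(merge_if_same_key([[1, 2, 3, 4], [9, 2, 5, 6]]))                # None
```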

Upvotes: 3

dano

Reputation: 94881

Here's a function that will merge all sublists where the first two entries match. It also handles cases where the sub-lists are not the same length:

from itertools import izip_longest

l = [[1,3,4,5,6], [1,3,2,2,2], [2,3,5,6,6], [1,1,1,1,1], [1,1,2,2,2], [1,3,6,2,1,1,2]]
l2 = [[12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.15, 0.1, 0.15, 0.2, 0.1,  0.15, 0.15, 0.15, 0.15], [12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]]

def merge(l):
    d = {}
    for ent in l:
        key = tuple(ent[0:2])
        merged = d.get(key, None)
        if merged is None:
            d[key] = list(ent)  # store a copy so the input sub-list is not modified in place
        else:
            merged[2:] = [a+b for a,b in izip_longest(merged[2:], ent[2:], fillvalue=0)]
    return d.values()

print merge(l)
print merge(l2)

Output:

[[1, 3, 12, 9, 9, 1, 2], [2, 3, 5, 6, 6], [1, 1, 3, 3, 3]]
[[12411.0, 31937.0, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.25, 0.2, 0.25, 0.30000000000000004, 0.2, 0.25, 0.25, 0.25, 0.25]]

It's implemented by maintaining a dict where the keys are the first two entries of a sub-list (stored as a tuple). As we iterate over the sub-lists, we check to see if there's an entry in the dict for that key. If there isn't, we store the current sub-list in the dict. If there already is an entry, we add up the values from index 2 onward, padding the shorter list with zeros, and update the dict. Once we're done iterating, we just return all the values from the dict.
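
The code above is Python 2 (izip_longest, print statement). A Python 3 rendering of the same dict-based idea could look like the following sketch; it copies each first-seen sub-list so the input is left untouched, and it relies on dicts preserving insertion order (Python 3.7+) to return the groups in first-seen order.

```python
from itertools import zip_longest

def merge(rows):
    """Merge rows sharing the same first two entries, summing the rest."""
    merged = {}
    for row in rows:
        key = tuple(row[:2])
        if key not in merged:
            merged[key] = list(row)  # copy so the caller's list is not mutated
        else:
            current = merged[key]
            # Pad the shorter tail with zeros so unequal lengths still add up
            current[2:] = [a + b for a, b in
                           zip_longest(current[2:], row[2:], fillvalue=0)]
    return list(merged.values())

rows = [[1, 3, 4, 5, 6], [1, 3, 2, 2, 2], [2, 3, 5, 6, 6],
        [1, 1, 1, 1, 1], [1, 1, 2, 2, 2], [1, 3, 6, 2, 1, 1, 2]]
print(merge(rows))
# [[1, 3, 12, 9, 9, 1, 2], [2, 3, 5, 6, 6], [1, 1, 3, 3, 3]]
```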

Upvotes: 3
