spatialaustin

Reputation: 706

Counting and appending sublist elements in python

I'm trying to count how many times each distinct sublist occurs in a list, then write each distinct sublist to a new list with its count appended. Each sublist in list_1 has exactly two elements, and order does not matter.

So:

list_1 = [['a', 'b'], ['a', 'c'], ['a', 'c'], ['a', 'c'], ['b', 'e'], ['d', 'q'], ['d', 'q']]

becomes:

new_list = [['a', 'b', 1], ['a', 'c', 3], ['b', 'e', 1], ['d', 'q', 2]]

I'm thinking that I'll need to use sets, but I'd appreciate anyone pointing me in the right direction.

Upvotes: 2

Views: 1567

Answers (3)

Martijn Pieters

Reputation: 1122332

You want to look at collections.Counter(); Counter objects are multi-sets (also known as bags); they map keys to their counts.

Lists are not hashable, though, so you will have to turn your sublists into tuples before they can be used as keys:

from collections import Counter

counts = Counter(tuple(e) for e in list_1)

new_list = [list(e) + [count] for e, count in counts.most_common()]

which gives you a new_list sorted by counts (descending):

>>> from collections import Counter
>>> list_1 = [['a', 'b'], ['a', 'c'], ['a', 'c'], ['a', 'c'], ['b', 'e'], ['d', 'q'], ['d', 'q']]
>>> counts = Counter(tuple(e) for e in list_1)
>>> [list(e) + [count] for e, count in counts.most_common()]
[['a', 'c', 3], ['d', 'q', 2], ['a', 'b', 1], ['b', 'e', 1]]
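If the first-seen order of the original list matters more than sorting by count, iterating over counts.items() instead of most_common() should keep it, since Counter is a dict subclass and dicts preserve insertion order on Python 3.7+. The sketch below also covers one possible reading of "order does not matter" in the question: if ['a', 'b'] and ['b', 'a'] should count as the same pair, sorting each sublist before turning it into a tuple is one option. The helper name count_pairs and its ignore_order flag are made up for illustration.

```python
from collections import Counter

def count_pairs(pairs, ignore_order=False):
    """Return [x, y, count] entries in first-seen order.

    count_pairs and ignore_order are hypothetical names for this sketch.
    """
    if ignore_order:
        # Assumption: elements are mutually comparable, so sorted() works.
        keys = (tuple(sorted(p)) for p in pairs)
    else:
        keys = (tuple(p) for p in pairs)
    counts = Counter(keys)
    # Counter is a dict subclass; on Python 3.7+ items() follows insertion order.
    return [list(key) + [n] for key, n in counts.items()]

list_1 = [['a', 'b'], ['a', 'c'], ['a', 'c'], ['a', 'c'], ['b', 'e'], ['d', 'q'], ['d', 'q']]
print(count_pairs(list_1))
# [['a', 'b', 1], ['a', 'c', 3], ['b', 'e', 1], ['d', 'q', 2]]
print(count_pairs([['a', 'b'], ['b', 'a']], ignore_order=True))
# [['a', 'b', 2]]
```

Without ignore_order=True, ['a', 'b'] and ['b', 'a'] are counted as two different entries.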

If your occurrences are always consecutive, then you could also use itertools.groupby():

from itertools import groupby

def counted_groups(it):
    for entry, group in groupby(it, key=lambda x: x):
        yield entry + [sum(1 for _ in group)]

new_list = [entry for entry in counted_groups(list_1)]

I used a separate generator function here, but you could inline the loop into the list comprehension instead.

This gives:

>>> from itertools import groupby
>>> def counted_groups(it):
...     for entry, group in groupby(it, key=lambda x: x):
...         yield entry + [sum(1 for _ in group)]
... 
>>> [entry for entry in counted_groups(list_1)]
[['a', 'b', 1], ['a', 'c', 3], ['b', 'e', 1], ['d', 'q', 2]]

and retains the original ordering.
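The "consecutive" condition matters: groupby() starts a new group every time the key changes, so a sublist that reappears later in the list is counted a second time. Here is a small sketch of the difference, using sample data invented for illustration. Sorting first is one way around it, at the cost of the original ordering:

```python
from itertools import groupby

def counted_groups(it):
    # With no key function, groupby() groups runs of equal items.
    for entry, group in groupby(it):
        yield entry + [sum(1 for _ in group)]

# Made-up input where ['a', 'c'] is not consecutive.
scattered = [['a', 'c'], ['a', 'b'], ['a', 'c']]

print(list(counted_groups(scattered)))
# [['a', 'c', 1], ['a', 'b', 1], ['a', 'c', 1]]

print(list(counted_groups(sorted(scattered))))
# [['a', 'b', 1], ['a', 'c', 2]]
```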

Upvotes: 6

jfs

Reputation: 414335

If identical sublists are consecutive:

from itertools import groupby

new_list = [sublist + [sum(1 for _ in g)] for sublist, g in groupby(list_1)]
# -> [['a', 'b', 1], ['a', 'c', 3], ['b', 'e', 1], ['d', 'q', 2]]

Upvotes: 1

Pedro del Sol

Reputation: 2841

A bit of a 'round the houses' solution:

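list_1 = [['a', 'b'], ['a', 'c'], ['a', 'c'], ['a', 'c'], ['b', 'e'], ['d', 'q'], ['d', 'q']]
new_dict = {}
new_list = []
# Tally each sublist; lists are unhashable, so use a tuple as the dict key.
for sublist in list_1:
    key = tuple(sublist)
    if key in new_dict:
        new_dict[key] += 1
    else:
        new_dict[key] = 1
# Turn each (key, count) pair back into a list with the count appended.
for key in new_dict:
    entry = list(key)
    entry.append(new_dict[key])
    new_list.append(entry)
print(new_list)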
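For what it's worth, the same bookkeeping can be written more compactly with dict.get(), which removes the if/else branch. This is only a sketch; the function name count_sublists is invented here, and the output order shown relies on dicts keeping insertion order (Python 3.7+).

```python
def count_sublists(sublists):
    """Hypothetical helper: tally sublists with a plain dict."""
    tally = {}
    for sub in sublists:
        key = tuple(sub)  # lists are unhashable, tuples are not
        # get() returns 0 the first time a key is seen.
        tally[key] = tally.get(key, 0) + 1
    return [list(key) + [n] for key, n in tally.items()]

list_1 = [['a', 'b'], ['a', 'c'], ['a', 'c'], ['a', 'c'], ['b', 'e'], ['d', 'q'], ['d', 'q']]
print(count_sublists(list_1))
# On Python 3.7+: [['a', 'b', 1], ['a', 'c', 3], ['b', 'e', 1], ['d', 'q', 2]]
```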

Upvotes: 0
