BioGeek

Reputation: 22827

Counting most common list in list of (list of dictionaries)

I have the following data structure:

data = [[{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
        [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
        [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
        [{'Posit': '0', 'R': '255', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '0', 'B': '255', 'G': '0'}],
        [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
        [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
        [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
        [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}]]

I want to find the most common list of dictionaries in the above data structure.

My first idea was to use the most_common method of collections.Counter, but the following

from collections import Counter
c = Counter()
for point in data:
    c[point] += 1

fails with a TypeError because lists are unhashable.

My next idea was to convert each list to a tuple, because tuples are immutable:

from collections import Counter
c = Counter()
for point in data:
    c[tuple(point)] += 1

but then I got a TypeError saying that dictionaries are also unhashable.

So what is a Pythonic way to accomplish what I want?

Upvotes: 0

Views: 178

Answers (2)

Karl Knechtel

Reputation: 61498

from collections import namedtuple, Counter
# You can probably think of a better name than this
datum = namedtuple('datum', 'Posit R B G')
Counter(tuple(datum(**d) for d in a) for a in data).most_common()
# You might actually want to make the conversion permanent;
# the data is possibly easier to work with that way given the
# fixed key structure, and it should save memory too
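
A minimal, self-contained sketch of the namedtuple approach above, run on a shortened version of the question's data (three entries instead of eight). The names Datum, counts, most_common_group and frequency are placeholders chosen for illustration, not part of the original snippet.

```python
from collections import Counter, namedtuple

# Hypothetical record type; the field names mirror the dictionary keys.
Datum = namedtuple('Datum', 'Posit R B G')

data = [
    [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
    [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
    [{'Posit': '0', 'R': '255', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '0', 'B': '255', 'G': '0'}],
]

# Each inner list becomes a tuple of namedtuples, which is hashable
# as long as every field value is hashable (plain strings here).
counts = Counter(tuple(Datum(**d) for d in group) for group in data)

# most_common(1) returns a one-element list of (key, count) pairs.
most_common_group, frequency = counts.most_common(1)[0]
print(most_common_group[0].G, frequency)
```

This relies on every dictionary having exactly the keys Posit, R, B and G; a missing or extra key makes Datum(**d) raise a TypeError.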

Upvotes: 1

eumiro

Reputation: 212835

You can use Counter, but you will have to convert the lists to tuples and the dictionaries to sorted tuples of (key, value) tuples. Sorting the items ensures that two equal dictionaries produce the same tuple, regardless of the order of their keys.

>>> Counter(tuple(tuple(sorted(d.items())) for d in a) for a in data).most_common()
[(((('B', '0'), ('G', '255'), ('Posit', '0'), ('R', '0')),
   (('B', '0'), ('G', '0'), ('Posit', '1000'), ('R', '255'))),
  7),
 (((('B', '0'), ('G', '255'), ('Posit', '0'), ('R', '255')),
   (('B', '255'), ('G', '0'), ('Posit', '1000'), ('R', '0'))),
  1)]

As @Marcin correctly comments, tuple(sorted(d.items())) can be replaced by the more appropriate frozenset(d.items()), which compares equal regardless of item order and does not require the items to be sortable:

>>> Counter(tuple(frozenset(d.items()) for d in a) for a in data).most_common()
[((frozenset([('Posit', '0'), ('B', '0'), ('G', '255'), ('R', '0')]),
   frozenset([('R', '255'), ('G', '0'), ('B', '0'), ('Posit', '1000')])),
  7),
 ((frozenset([('Posit', '0'), ('R', '255'), ('B', '0'), ('G', '255')]),
   frozenset([('B', '255'), ('G', '0'), ('R', '0'), ('Posit', '1000')])),
  1)]
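
If the goal is to get back the original list of dictionaries rather than its hashable stand-in, the frozenset key can be wrapped in a small helper. The sketch below is one way to do that; hashable_key and most_common_group are hypothetical names, and it assumes flat dictionaries with hashable values and a non-empty input (an empty list would raise an IndexError). Note that the order in which items are displayed inside a frozenset is arbitrary, so printed keys may not match the listing above.

```python
from collections import Counter

def hashable_key(group):
    """Return a hashable key for a list of flat dicts, preserving list order."""
    return tuple(frozenset(d.items()) for d in group)

def most_common_group(data):
    """Return (original_list, count) for the most frequent list of dicts."""
    counts = Counter(hashable_key(group) for group in data)
    key, count = counts.most_common(1)[0]
    # Recover the first original list that produced the winning key.
    original = next(group for group in data if hashable_key(group) == key)
    return original, count

sample = [
    [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
    [{'Posit': '0', 'R': '255', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '0', 'B': '255', 'G': '0'}],
    [{'Posit': '0', 'R': '0', 'B': '0', 'G': '255'}, {'Posit': '1000', 'R': '255', 'B': '0', 'G': '0'}],
]

group, count = most_common_group(sample)
print(count, group)
```

The outer tuple keeps the position of each dictionary within its list significant, while the frozenset makes the key order inside each dictionary irrelevant.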

Upvotes: 1
