Shadi

Reputation: 41

Implementing a counter in nested dictionaries

I have a .csv file with 3 columns, let's say a, b, c, where c represents time and can take values from 00 to 24.

I want to go through this file, extract the unique a, b, c combinations, and count how many times each c occurs for a given a and b. For example, if the file looks like this:

a1 b1 c1

a1 b1 c1

a1 b1 c1

a1 b1 c2

a1 b1 c2

a1 b2 c1

a1 b2 c1

a2 b1 c1

a2 b1 c2

I want to create something like this:

{a1:{b1:{c1:3, c2:2},b2:{c1:2}},a2:{b1:{c1:1,c2:1}}}

But I'm not sure whether a nested dictionary is a good choice. If it is, I'm having difficulty implementing the "counter" part.

Upvotes: 0

Views: 152

Answers (1)

Jared

Reputation: 567

You can still use a Counter to do the counting:

rows = [
    ('a1', 'b1', 'c1'),
    ('a1', 'b1', 'c1'),
    ('a1', 'b1', 'c1'),
    ('a1', 'b1', 'c2'),
    ('a1', 'b1', 'c2'),
    ('a1', 'b2', 'c1'),
    ('a1', 'b2', 'c1'),
    ('a2', 'b1', 'c1'),
    ('a2', 'b1', 'c2'),
]

from collections import Counter

counts = Counter(rows)
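If the rows come from the .csv file in the question rather than a hard-coded list, a minimal sketch could feed csv.reader straight into Counter. This assumes a comma-delimited file with no header row; the function name count_rows is hypothetical and only for illustration:

```python
import csv
from collections import Counter

def count_rows(path):
    """Count identical (a, b, c) rows in a CSV file (assumed comma-delimited, no header)."""
    with open(path, newline='') as f:
        # Each row is a list, so convert it to a tuple to make it hashable.
        # Empty lines come back as empty lists and are skipped.
        return Counter(tuple(row) for row in csv.reader(f) if row)
```

Calling count_rows on your file should give the same kind of Counter as above, keyed by (a, b, c) tuples. If the file is space- or tab-separated, csv.reader accepts a delimiter argument.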

To turn the counts into a nested dictionary, you can use a plain dictionary with setdefault, or you can implement an autovivifying dictionary (one that creates missing inner dictionaries on first access) and use that:

class AutoViv(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

nested = AutoViv()
# items() works on both Python 2 and 3; iteritems() exists only on Python 2
for row, count in counts.items():
    nested[row[0]][row[1]][row[2]] = count

This matches your desired result:

>>> nested
{'a1': {'b1': {'c2': 2, 'c1': 3}, 'b2': {'c1': 2}}, 'a2': {'b1': {'c2': 1, 'c1': 1}}}
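For comparison, the setdefault route mentioned above could look roughly like this. It is only a sketch: the helper name nest_counts is made up for illustration, and the sample data is a shortened version of the rows in the question:

```python
from collections import Counter

def nest_counts(counts):
    """Turn a Counter keyed by (a, b, c) tuples into nested plain dicts."""
    nested = {}
    for (a, b, c), count in counts.items():
        # setdefault returns the existing inner dict for the key,
        # or stores a new empty dict there and returns it.
        nested.setdefault(a, {}).setdefault(b, {})[c] = count
    return nested

counts = Counter([
    ('a1', 'b1', 'c1'),
    ('a1', 'b1', 'c1'),
    ('a1', 'b1', 'c2'),
    ('a2', 'b1', 'c1'),
])
nested = nest_counts(counts)
```

One difference worth knowing: the result here is made of ordinary dicts, so looking up a missing key raises KeyError, whereas the AutoViv version quietly creates an empty entry whenever you read a key that isn't there.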

Upvotes: 1
