efficient way of grouping dictionary by fist item of key-tuple

Question

I have a dictionary where keys are tuple of indices and values are int. I am using it for sparse representation of a matrix.

In [347]: M
Out[347]:
{(1, 0, 0): 2,
 (1, 1, 2): 3,
 (1, 1, 3): 1,
 (1, 2, 4): 5,
 (2, 1, 3): 4,
 (2, 2, 2): 1}

Now I need to do some calculation on different sub matrices and for that I need to subtract each matrix. I can do it in two different ways:

Method #1: create a nested dictionary which group M based on first key item:

In [190]: for k,v in M.iteritems():
    newM[k[0]][(k[1],k[2])]=v

result:

1 {(1, 2): 3, (0, 0): 2, (2, 4): 5, (1, 3): 1}
2 {(1, 3): 4, (2, 2): 1}

Now I can easily access each sub matrix by using the key. However I am dealing with big matrices and it is not memory sufficient.

Method #2: extract each sub matrix only when I want to do the computation:

for i in xrange(number_of_submatrices):
    sub_M = {(idx[1],idx[2])=val for idx,val in M.iteritems() if idx[0]==i }  
    apply_computation(sub_M)

This method is slow since I need to iterate through M every time and check the index of sub matrix

I also tried to use the combination of itertools.groupby and operator.itemgetter but didn't succeed to make it work.

Any idea for a better solution in general? Or faster solutions for any of two methods?

falsetru · Accepted Answer

To use itertools.groupby, you need to sort the input first:

>>> M = {(1, 0, 0): 2,
...      (1, 1, 2): 3,
...      (1, 1, 3): 1,
...      (1, 2, 4): 5,
...      (2, 1, 3): 4,
...      (2, 2, 2): 1}
>>>
>>> import itertools
>>> newM = {
...     key: {k[1:]: M[k] for k in grp}
...     for key, grp in
...     itertools.groupby(sorted(M), key=lambda k: k[0])
... }  # nested dict comprehension
>>> newM
{1: {(1, 2): 3, (0, 0): 2, (2, 4): 5, (1, 3): 1},
 2: {(1, 3): 4, (2, 2): 1}}

efficient way of grouping dictionary by fist item of key-tuple

Answers (1)

Related Questions