getting list of indices of each value of list in a pythonic way

Question

I have a list data of values, and I want to return a dictionary mapping each value of data to a list of the indices where this value appears.

This can be done using this code:

data = np.array(data)
{val: list(np.where(data==val)[0]) for val in data}

but this runs in O(n^2), and this is too slow for long lists. Can an O(n) solution be coded using a "pythonic" syntax? (it can be done with creating an empty list and updating it in a loop, but I understand this is not recommended in python.)

mhawke · Accepted Answer

You can use a defaultdict of lists to achieve this in O(n):

from collections import defaultdict

d = defaultdict(list)
for idx, item in enumerate(data):
    d[item].append(idx)

For example, if data contains the string 'abcabccbazzzqa':

d = defaultdict(list)
for idx, item in enumerate('abcabccbazzzqa'):
    d[item].append(idx)

>>> d
defaultdict(, {'a': [0, 3, 8, 13], 'q': [12], 'c': [2, 5, 6], 'b': [1, 4, 7], 'z': [9, 10, 11]})
>>> d['a']
[0, 3, 8, 13]

getting list of indices of each value of list in a pythonic way

Answers (2)

Related Questions