pdubois

Reputation: 7800

Getting indices of sequential chunks of a list

I have these lists:

l1 = ["foo","bar","x","y","z","x","y","z","x","y","z"]
l2 = ["foo","bar","w","x","y","z","w","x","y","z","w","x","y","z"]
l3 = ["foo","bar","y","z","y","z","y","z"]

For each of the lists above, I'd like to get the indices of the sequential chunks from the 3rd entry onwards. The expected result:

l1_indices = [[2,3,4],[5,6,7],[8,9,10]]
l2_indices = [[2,3,4,5],[6,7,8,9],[10,11,12,13]]
l3_indices = [[2,3],[4,5],[6,7]]

To clarify further, I got l1_indices the following way:

  ["foo","bar",   "x","y","z",  "x","y","z",   "x","y","z"]
     0     1       2   3   4     5  6    7      8   9   10   <-- indices id
               ---> onwards
               ---> always in 3 chunks

What's the way to do it in Python?

I tried this, but to no avail:

In [8]: import itertools as IT
In [9]: import operator
In [11]: [list(zip(*g))[::-1]  for k, g in IT.groupby(enumerate(l1[2:]), operator.itemgetter(1))]
Out[11]:
[[('x',), (0,)],
 [('y',), (1,)],
 [('z',), (2,)],
 [('x',), (3,)],
 [('y',), (4,)],
 [('z',), (5,)],
 [('x',), (6,)],
 [('y',), (7,)],
 [('z',), (8,)]]

Upvotes: 1

Views: 132

Answers (2)

Kasravnd

Reputation: 107347

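As a more general answer: first build a sublist containing only the elements that occur more than twice in the list. The difference between the two lengths gives the index where the chunks start, and the number of distinct elements in the sublist gives the chunk size. From those two values you can generate the desired indices: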

>>> l =['foo', 'bar', 'w', 'x', 'y', 'z', 'w', 'x', 'y', 'z', 'w', 'x', 'y', 'z']

>>> s=[i for i in l if l.count(i)>2]
>>> len_part=len(l)-len(s)
>>> len_set=len(set(s))

>>> [list(range(i, i+len_set)) for i in range(len_part, len(l), len_set)]
[[2, 3, 4, 5], [6, 7, 8, 9], [10, 11, 12, 13]]
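
The same idea can be wrapped in a reusable function. This is only a sketch for Python 3: the name `chunk_indices_by_count`, the `min_count` parameter, and the use of `collections.Counter` (to avoid calling `list.count` once per element) are my own assumptions, not part of the answer above. It also assumes that the non-repeating elements all sit at the front of the list and that each chunk contains distinct values.

```python
from collections import Counter


def chunk_indices_by_count(items, min_count=3):
    """Return index chunks for the repeating tail of `items`.

    Assumes the elements that occur fewer than `min_count` times
    form a prefix of the list, and that every chunk holds the same
    set of distinct values.
    """
    counts = Counter(items)
    # Elements that belong to the repeating part of the list.
    repeated = [x for x in items if counts[x] >= min_count]
    start = len(items) - len(repeated)   # index where the chunks begin
    width = len(set(repeated))           # number of items per chunk
    if width == 0:
        return []                        # nothing repeats often enough
    return [list(range(i, i + width))
            for i in range(start, len(items), width)]


l3 = ["foo", "bar", "y", "z", "y", "z", "y", "z"]
print(chunk_indices_by_count(l3))
```

Because the chunk size comes from the number of distinct values, this variant does not need to know in advance how many chunks there are.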

Upvotes: 1

taskinoor

Reputation: 46037

If the sequential elements always come in three chunks and always start from the third item, then you can simply divide the number of remaining elements by three and generate the lists of indices.

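>>> def get_indices(l):
...     last = len(l) - 2     # number of elements after the first two
...     diff = last // 3      # chunk size; integer division for Python 3
...     # Step up to len(l), not last, so the final chunk is not dropped.
...     return [list(range(i, i + diff)) for i in range(2, len(l), diff)]
...
>>> get_indices(l1)
[[2, 3, 4], [5, 6, 7], [8, 9, 10]]
>>> get_indices(l2)
[[2, 3, 4, 5], [6, 7, 8, 9], [10, 11, 12, 13]]
>>> get_indices(l3)
[[2, 3], [4, 5], [6, 7]]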
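
If the offset or the number of chunks might vary, the same arithmetic can be parameterized. The following is a minimal sketch, not the definitive way to do it: the function name `get_chunk_indices` and the `start` and `n_chunks` parameters are hypothetical, and it assumes the elements after `start` divide evenly into `n_chunks` equal parts (it raises `ValueError` otherwise).

```python
def get_chunk_indices(items, start=2, n_chunks=3):
    """Split the indices from `start` onwards into `n_chunks` equal runs."""
    remaining = len(items) - start
    if remaining <= 0 or remaining % n_chunks != 0:
        raise ValueError(
            "items after `start` must divide evenly into `n_chunks` chunks"
        )
    size = remaining // n_chunks  # number of indices per chunk
    # Iterate up to len(items) so the last chunk is included.
    return [list(range(i, i + size))
            for i in range(start, len(items), size)]


l3 = ["foo", "bar", "y", "z", "y", "z", "y", "z"]
print(get_chunk_indices(l3))
```

Checking divisibility up front makes a malformed list fail loudly instead of silently returning a short or uneven final chunk.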

Upvotes: 2
