Segmenting a list of lists in Python

Question

I have a list of lists, all of the same length. I would like to segment the first list into contiguous runs of a given value, and then segment the remaining lists at the same positions as the segments found in the first list.

For example:

Given value: 2

Given list of lists: [[0,0,2,2,2,1,1,1,2,3], [1,2,3,4,5,6,7,8,9,10], [1,1,1,1,1,1,1,1,1,1]]

Return: [ [[2,2,2],[2]], [[3,4,5],[9]], [[1,1,1],[1]] ]

The closest I have gotten is to get the indices by:

>>> import itertools
>>> import operator
>>> x = 2
>>> L = [[0,0,2,2,2,1,1,1,2,3],[1,2,3,4,5,6,7,8,9,10],[1,1,1,1,1,1,1,1,1,1]]
>>> I = [[i for i,value in it] for key,it in itertools.groupby(enumerate(L[0]), key=operator.itemgetter(1)) if key == x]
>>> print(I)
[[2, 3, 4], [8]]

This code was modified from another question on this site.

I would like to find the most efficient way possible, since these lists may be very long.

EDIT:

Maybe if I place the lists one on top of each other it might be clearer:

[[0,0,[2,2,2],1,1,1,[2],3],  -> [2,2,2],[2]
 [1,2,[3,4,5],6,7,8,[9],10], -> [3,4,5],[9]
 [1,1,[1,1,1],1,1,1,[1],1]]  -> [1,1,1],[1]

blhsing · Accepted Answer

You can use groupby on the first sub-list to build a list of (starting index, length) tuples, one for each contiguous run of the given value, and then use that list to slice the matching segments out of every sub-list:

from itertools import groupby
from operator import itemgetter

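def match(L, x):
    # (start, length) of each contiguous run of x in the first list;
    # next(g) consumes the run's first item, so add 1 to the remaining count
    groups = [(next(g)[0], sum(1 for _ in g) + 1)
              for k, g in groupby(enumerate(L[0]), key=itemgetter(1)) if k == x]
    # apply the same slices to every sub-list
    return [[lst[i:i + length] for i, length in groups] for lst in L]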

so that:

match([[0,0,2,2,2,1,1,1,2,3], [1,2,3,4,5,6,7,8,9,10], [1,1,1,1,1,1,1,1,1,1]], 2)

returns:

[[[2, 2, 2], [2]], [[3, 4, 5], [9]], [[1, 1, 1], [1]]]
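
To see what the intermediate list of groups looks like, here is a small sketch that splits the same idea into two steps. The helper name find_runs is made up for this illustration and is not part of the answer above; the logic is the same groupby expression.

```python
from itertools import groupby
from operator import itemgetter

def find_runs(first, x):
    """Return a (start, length) tuple for each contiguous run of x in first.

    find_runs is a hypothetical helper name used only for this illustration.
    """
    return [(next(g)[0], sum(1 for _ in g) + 1)
            for k, g in groupby(enumerate(first), key=itemgetter(1)) if k == x]

L = [[0, 0, 2, 2, 2, 1, 1, 1, 2, 3],
     [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]

# Step 1: locate the runs of 2 in the first list only.
runs = find_runs(L[0], 2)
print(runs)      # [(2, 3), (8, 1)]

# Step 2: reuse the same (start, length) pairs to slice every sub-list.
segments = [[lst[i:i + n] for i, n in runs] for lst in L]
print(segments)  # [[[2, 2, 2], [2]], [[3, 4, 5], [9]], [[1, 1, 1], [1]]]
```

Storing one (start, length) pair per run, rather than every index as in the question's attempt, keeps the intermediate list small when the runs are long, and each segment is then taken with a single slice.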
