Reputation: 18754
I have a list containing various string values. I want to split the list whenever I see WORD. The result will be a list of lists (sublists of the original list), each containing exactly one instance of WORD.
I can do this using a loop, but is there a more Pythonic way to achieve this?
Example = ['A', 'WORD', 'B' , 'C' , 'WORD' , 'D']
result = [['A'], ['WORD','B','C'],['WORD','D']]
This is what I have tried, but it does not achieve what I want, since it puts WORD in a different list than the one it should be in:
def split_excel_cells(delimiter, cell_data):
    result = []
    temp = []
    for cell in cell_data:
        if cell == delimiter:
            temp.append(cell)
            result.append(temp)
            temp = []
        else:
            temp.append(cell)
    return result
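For reference, here is a small sketch of what this attempt returns for the example input. It only re-runs the function above; the variable name `attempt` is introduced here for illustration.

```python
def split_excel_cells(delimiter, cell_data):
    result = []
    temp = []
    for cell in cell_data:
        if cell == delimiter:
            # The delimiter is appended to the END of the current sublist...
            temp.append(cell)
            result.append(temp)
            temp = []
        else:
            temp.append(cell)
    # ...and whatever is left in temp after the loop is never appended.
    return result

attempt = split_excel_cells('WORD', ['A', 'WORD', 'B', 'C', 'WORD', 'D'])
print(attempt)  # [['A', 'WORD'], ['B', 'C', 'WORD']]
```

So WORD closes each sublist instead of opening it, and the trailing 'D' is dropped.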
Upvotes: 31
Views: 34031
Reputation: 20679
Note: itertools.izip is specific to Python 2. Replace izip with zip to work in Python 3.
from itertools import izip, chain
example = ['A', 'WORD', 'B' , 'C' , 'WORD' , 'D']
indices = [i for i,x in enumerate(example) if x=="WORD"]
pairs = izip(chain([0], indices), chain(indices, [None]))
result = [example[i:j] for i, j in pairs]
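As a sketch of the Python 3 variant, the same slicing idea can be written with the built-in zip. The function name `split_before_word` is hypothetical and only used here for illustration.

```python
from itertools import chain

def split_before_word(items, word):
    """Slice `items` so that every occurrence of `word` starts a new sublist."""
    # Positions of every delimiter in the list.
    indices = [i for i, x in enumerate(items) if x == word]
    # Pair each slice start with the next delimiter position (None = end of list).
    pairs = zip(chain([0], indices), chain(indices, [None]))
    return [items[i:j] for i, j in pairs]

example = ['A', 'WORD', 'B', 'C', 'WORD', 'D']
result = split_before_word(example, 'WORD')
print(result)  # [['A'], ['WORD', 'B', 'C'], ['WORD', 'D']]
```

One thing to be aware of: if the list starts with the delimiter, the first slice is `items[0:0]`, so the result begins with an empty list.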
Upvotes: 4
Reputation: 44525
Given
import more_itertools as mit
iterable = ["A", "WORD", "B" , "C" , "WORD" , "D"]
pred = lambda x: x == "WORD"
Code
list(mit.split_before(iterable, pred))
# [['A'], ['WORD', 'B', 'C'], ['WORD', 'D']]
more_itertools is a third-party library installable via pip install more_itertools.
See also split_at and split_after.
Upvotes: 3
Reputation: 214969
import itertools
lst = ['A', 'WORD', 'B' , 'C' , 'WORD' , 'D']
w = 'WORD'
spl = [list(y) for x, y in itertools.groupby(lst, lambda z: z == w) if not x]
This creates a split list without the delimiters, which looks more logical to me:
[['A'], ['B', 'C'], ['D']]
If you insist on including the delimiters, this should do the trick:
spl = [[]]
for x, y in itertools.groupby(lst, lambda z: z == w):
    if x: spl.append([])
    spl[-1].extend(y)
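A minimal sketch of the delimiter-keeping loop wrapped in a function, to show what it produces. The name `split_keep_delimiters` is made up for this example. Note that groupby merges consecutive equal keys, so adjacent delimiters end up in the same sublist.

```python
import itertools

def split_keep_delimiters(lst, w):
    spl = [[]]
    # groupby yields (True, run of delimiters) and (False, run of other items).
    for is_delim, run in itertools.groupby(lst, lambda z: z == w):
        if is_delim:
            spl.append([])  # a delimiter run starts a new sublist
        spl[-1].extend(run)
    return spl

single = split_keep_delimiters(['A', 'WORD', 'B', 'C', 'WORD', 'D'], 'WORD')
print(single)   # [['A'], ['WORD', 'B', 'C'], ['WORD', 'D']]

doubled = split_keep_delimiters(['A', 'WORD', 'WORD', 'B'], 'WORD')
print(doubled)  # [['A'], ['WORD', 'WORD', 'B']]
```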
Upvotes: 42
Reputation: 500475
I would use a generator:
def group(seq, sep):
    g = []
    for el in seq:
        if el == sep:
            yield g
            g = []
        g.append(el)
    yield g
ex = ['A', 'WORD', 'B' , 'C' , 'WORD' , 'D']
result = list(group(ex, 'WORD'))
print(result)
This prints
[['A'], ['WORD', 'B', 'C'], ['WORD', 'D']]
The code accepts any iterable and produces an iterable (which you don't have to collect into a list if you don't want to).
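To illustrate the lazy behaviour, here is a small sketch that feeds the generator from another generator and takes only the first two chunks. The helper `cells` and the use of itertools.islice are additions for this example.

```python
from itertools import islice

def group(seq, sep):
    g = []
    for el in seq:
        if el == sep:
            yield g
            g = []
        g.append(el)
    yield g

def cells():
    # Hypothetical data source; any iterable works, not just a list.
    yield from ['A', 'WORD', 'B', 'C', 'WORD', 'D']

# Only as much input is consumed as is needed to produce two chunks.
first_two = list(islice(group(cells(), 'WORD'), 2))
print(first_two)  # [['A'], ['WORD', 'B', 'C']]

# Chunks can also be processed one at a time without building the full result.
for chunk in group(cells(), 'WORD'):
    print(chunk)
```

As written, the generator yields an empty list first when the input starts with the separator.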
Upvotes: 23