PanDe

Reputation: 974

Python: get all items from a list between 2 known strings or indexes

I am trying to find all the items between two given indexes. For example, I have a list that looks like this:

mylist = ['ABC', 'COMMENT', 'YES', 'YES', 'NO', '123', 'COMMENT','GONOW','MAKE','COMMENT', 'YES','COMMENT']

I want the output to look like this. Note: the outputs below are the items between each pair of consecutive 'COMMENT' entries.

First output  :  'YES', 'YES', 'NO', '123'
second output :  'GONOW','MAKE'
third output  :  'YES'

I have two thoughts on how to handle this: 1) if I know the search string is 'COMMENT', then I should be able to find everything between two occurrences of that string, something like this:

string = 'COMMENT'
find_values = mylist[findfirst(comment)-findsecond(comment)]
find_values = mylist[findsecond(comment)-findthird(comment)]

2) if I know the indexes of all 'COMMENT' entries, then I should be able to find everything between two known indexes, something like this:

idx1_comment = 1
idx2_comment = 6
idx3_comment = 9

print mylist(range(2-5))
print mylist(range(6-8))

Any ideas?

Thanks...

Also, I have another request. For option #1, if I have a list with a lot of items between the strings 'comment' and 'border', what would be the way to handle that as well?

Btw, I tried following this article, but without success: Python: get items from list between 2 known items

Upvotes: 0

Views: 5570

Answers (3)

Patrick Artner

Reputation: 51653

You could use islice from itertools:

from itertools import islice

mylist = ['ABC', 'COMMENT', 'YES', 'YES', 'NO', '123', 'COMMENT','GONOW','MAKE','COMMENT', 'YES','COMMENT']

# get indexes of all 'COMMENT'
idx = [i for i, v in enumerate(mylist) if v == 'COMMENT']
print(idx)

# calculate tuples of data we are interested in: 
#   ( commentIdx+1 , next commentIdx ) for all found ones
idx2 = [ (idx[i]+1,idx[i+1]) for i in range(len(idx)-1)]
print(idx2)

# slice the data out of our list and create lists of the slices
result = [ list(islice(mylist,a,b)) for a,b in idx2]

print(result) 

Output:

[1, 6, 9, 11]  # indexes of 'COMMENT'

[(2, 6), (7, 9), (10, 11)]  # Commentindex +1 + the next one as tuple

[['YES', 'YES', 'NO', '123'], ['GONOW', 'MAKE'], ['YES']]

You could also skip idx2:

result = [ list(islice(mylist,a,b)) for a,b in ((idx[i]+1,idx[i+1]) for i in range(len(idx)-1))]

As for your 2nd question, which you should probably try to solve yourself or post as a separate question:

It depends on the data. Do the markers intermix (C = COMMENT, B = BORDER)?

[ C, 1, 2 , C , 3 , B , 4 , B, 5,  C ]

Best probably to try it yourself or post a new question with sample data and wanted output. The result for the above could be

[ [1,2],[3],[4] ] or [ [1,2],[3],[3,4,5],[4,5],[5] ] or something in between. The first uses strict C-C and B-B matches and no C-B or B-C matches. The latter allows C-B and B-C as well as C-......-C, ignoring/removing B's in between.
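
For the intermixed case, one possible reading (an assumption, not something the question confirms) is that a group opens at a start marker and closes at the next end marker, and that a group left open when another start marker appears is dropped. A minimal sketch under that assumption; the function name between_markers is hypothetical:

```python
def between_markers(items, start, end):
    """Collect the runs of items that follow `start` and precede the next `end`."""
    groups = []
    current = None  # None means we are currently outside a start..end region
    for item in items:
        if item == start:
            current = []  # open a new region; an unclosed earlier one is discarded
        elif item == end and current is not None:
            groups.append(current)  # close the open region
            current = None
        elif current is not None:
            current.append(item)  # ordinary item inside an open region
    return groups


data = ['C', 1, 2, 'C', 3, 'B', 4, 'B', 5, 'C']
print(between_markers(data, 'C', 'B'))  # [[3]]
```

Other readings (for example, keeping [1, 2] as well) would need different rules for what happens when a second start marker arrives before an end marker.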

Upvotes: 0

physicalattraction

Reputation: 6858

If all you want is to filter out the strings between two predefined strings, this code will do:

from typing import List


def filter_output_between_strings(words: List[str], separator: str) -> None:
    # Positions of every occurrence of the separator
    separator_indexes = [index for index, word in enumerate(words) if word == separator]
    # Pair each separator position with the next one and print what lies between them
    for example_index, (start, end) in enumerate(zip(separator_indexes[:-1], separator_indexes[1:]), start=1):
        print('Example: {}: {}'.format(example_index, words[start + 1:end]))


if __name__ == '__main__':
    # Named mylist rather than input so the built-in input() is not shadowed
    mylist = ['ABC', 'COMMENT', 'YES', 'YES', 'NO', '123', 'COMMENT', 'GONOW', 'MAKE', 'COMMENT', 'YES', 'COMMENT']
    filter_output_between_strings(mylist, 'COMMENT')

Output:

Example: 1: ['YES', 'YES', 'NO', '123']
Example: 2: ['GONOW', 'MAKE']
Example: 3: ['YES']
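
If the groups are needed as data rather than printed, the same idea of pairing consecutive separator positions can return a list of lists instead. A minimal sketch; the name split_between is hypothetical and not part of the answer above:

```python
def split_between(words, separator):
    """Return the slices of `words` that lie between consecutive separators."""
    # Index of every occurrence of the separator
    positions = [i for i, word in enumerate(words) if word == separator]
    # zip(positions, positions[1:]) pairs each position with the one after it
    return [words[start + 1:end] for start, end in zip(positions, positions[1:])]


mylist = ['ABC', 'COMMENT', 'YES', 'YES', 'NO', '123', 'COMMENT',
          'GONOW', 'MAKE', 'COMMENT', 'YES', 'COMMENT']
print(split_between(mylist, 'COMMENT'))
# [['YES', 'YES', 'NO', '123'], ['GONOW', 'MAKE'], ['YES']]
```

With fewer than two separators the result is an empty list, and two adjacent separators produce an empty sublist.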

Upvotes: 1

Maurice Meyer

Reputation: 18106

Why not simply iterate the list once:

result = []
sublist = []

separators = ('COMMENT', 'BORDER')
mylist = ['ABC', 'COMMENT', 'YES', 'YES', 'NO', '123', 'COMMENT','GONOW','MAKE','COMMENT', 'YES','COMMENT', 'BORDER', 'FOO', '123']

for x in mylist:
    if x in separators:
        if sublist:
            result.append(sublist)
        sublist = []
    else:
        sublist.append(x)
if sublist:  # keep the trailing group only if it is non-empty
    result.append(sublist)

print(result)

Returns:

[['ABC'], ['YES', 'YES', 'NO', '123'], ['GONOW', 'MAKE'], ['YES'], ['FOO', '123']]
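
Note that the single-pass loop also returns the items before the first separator (['ABC']) and after the last one (['FOO', '123']). If only groups enclosed by separators on both sides are wanted, as in the question, one option is itertools.groupby. This is a sketch, and the name enclosed_groups is hypothetical:

```python
from itertools import groupby


def enclosed_groups(items, separators):
    """Return the runs of non-separator items that have separators on both sides."""
    # groupby yields alternating runs: separator runs (True) and ordinary runs (False)
    runs = [(is_sep, list(run))
            for is_sep, run in groupby(items, key=lambda x: x in separators)]
    # Because runs alternate, an ordinary run that is neither first nor last
    # is necessarily surrounded by separator runs.
    return [run for i, (is_sep, run) in enumerate(runs)
            if not is_sep and 0 < i < len(runs) - 1]


mylist = ['ABC', 'COMMENT', 'YES', 'YES', 'NO', '123', 'COMMENT', 'GONOW', 'MAKE',
          'COMMENT', 'YES', 'COMMENT', 'BORDER', 'FOO', '123']
print(enclosed_groups(mylist, ('COMMENT', 'BORDER')))
# [['YES', 'YES', 'NO', '123'], ['GONOW', 'MAKE'], ['YES']]
```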

Upvotes: 2
