triplecute

Reputation: 35

Find Strings Located Between Specific Strings in List Python

I'm writing code that pulls data from a website and prints all the text between certain tags. Each time the code pulls data from a tag, I store the result in a list, so I end up with a list that looks something like this:

Warning
Not
News
Legends
Name1
Name2
Name3
Pickle
Stop
Hello

I want to search this list of strings for the keywords Legends and Pickle and print whatever strings are between them.

As a further step, I may create a list of all possible legend names and then print those that occur in the list I generate. Any insight into either of these questions?

Upvotes: 0

Views: 586

Answers (5)

thomas

Reputation: 449

NumPy's where function gives you all occurrences of a given item. So first make the list a NumPy array:

import numpy as np

my_array = np.array(["Warning","Not","News","Legends","Name1","Name2","Name3","Pickle","Stop","Hello","Legends","Name1","Name2","Name3","Pickle"])

From here on you can use methods of numpy:

legends = np.where(my_array == "Legends")
pickle = np.where(my_array == "Pickle")

Concatenate the two index arrays for easier looping:

stack = np.concatenate([legends, pickle], axis=0)

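Look for the values between each "Legends" and the matching "Pickle". Row 0 of stack holds the "Legends" indices and row 1 the "Pickle" indices, so the loop runs over the columns:

np.concatenate([my_array[stack[0, i] + 1:stack[1, i]] for i in range(stack.shape[1])])

With the array above, the result is:

array(['Name1', 'Name2', 'Name3', 'Name1', 'Name2', 'Name3'], dtype='<U7')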

Upvotes: 1

Epsi95

Reputation: 9047

You can use list.index() to get the index of the first occurrence of 'Legends' and 'Pickle'. Then you can use list slicing to get the elements in between:

l = ['Warning','Not','News','Legends','Name1','Name2','Name3','Pickle','Stop','Hello']
l[l.index('Legends')+1 : l.index('Pickle')]
['Name1', 'Name2', 'Name3']
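A minimal sketch of the same idea wrapped in a function, in case one of the keywords might be missing from the scraped list. The helper name between is made up for illustration, and returning an empty list when a keyword is absent is an assumption about what you would want:

```python
def between(items, start, end):
    """Return the items strictly between the first `start` and the next `end`."""
    try:
        i = items.index(start)
        # Search for `end` only after `start`, so an earlier `end` is ignored.
        j = items.index(end, i + 1)
    except ValueError:
        # list.index() raises ValueError when the value is not in the list.
        return []
    return items[i + 1:j]


l = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']
print(between(l, 'Legends', 'Pickle'))   # ['Name1', 'Name2', 'Name3']
print(between(l, 'Legends', 'Missing'))  # []
```

Passing i + 1 as the second argument to index() also avoids an empty or wrong slice when 'Pickle' happens to appear before 'Legends'.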

Upvotes: 1

Tim Biegeleisen

Reputation: 521239

For the second approach, you could create a regex alternation of expected matching names, then use a list comprehension to generate a list of matches:

import re

tags = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']
names = ['Name1', 'Name2', 'Name3']
regex = r'^(?:' + r'|'.join(names) + r')$'
matches = [x for x in tags if re.search(regex, x)]
print(matches)  # ['Name1', 'Name2', 'Name3']
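If the names only ever need to match exactly, a set lookup is a possible alternative to the regex (a different technique from the one above, shown only as a sketch). The contents of known_names here are hypothetical:

```python
tags = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']

# Hypothetical collection of all possible legend names; 'Name9' never appears in tags.
known_names = {'Name1', 'Name3', 'Name9'}

# Keep the tags that are known names, in the order they appear in tags.
matches = [x for x in tags if x in known_names]
print(matches)  # ['Name1', 'Name3']
```

Membership tests on a set do not depend on the number of names the way a long alternation might, and there is no need to escape names that contain regex metacharacters.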

Upvotes: 2

sarartur

Reputation: 1228

Try this:

words = [
    "Warning", "Not", "News", "Legends", "Name1",
    "Name2", "Name3", "Pickle", "Stop", "Hello"
]
words_in_between = words[words.index("Legends") + 1:words.index("Pickle")]
print(words_in_between)

Output:

['Name1', 'Name2', 'Name3']

This assumes that both "Legends" and "Pickle" are in the list exactly once.

Upvotes: 1

PeptideWitch

Reputation: 2349

You can use the list.index() method to find the numerical index of an item within a list, and then use list slicing to return the items in your list between those two points:

your_list = ['Warning','Not','News','Legends','Name1','Name2','Name3','Pickle','Stop','Hello']
your_list[your_list.index('Legends')+1:your_list.index('Pickle')]

The caveat is that .index() returns only the index of the first occurrence of the given item, so if your list has two 'Legends' items, you'll only get the index of the first one.
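If the markers can repeat, one way around that caveat is a single pass over the list that collects every segment. This is only a sketch: the function name all_between and the sample list with two 'Legends'/'Pickle' pairs are made up for illustration, and a 'Legends' with no closing 'Pickle' is assumed to be discarded:

```python
def all_between(items, start, end):
    """Collect every run of items enclosed by a `start` ... `end` pair."""
    segments = []
    current = None  # None means we are currently outside a start/end pair
    for item in items:
        if item == start:
            current = []              # a start marker opens a fresh segment
        elif item == end and current is not None:
            segments.append(current)  # an end marker closes the open segment
            current = None
        elif current is not None:
            current.append(item)
    return segments


your_list = ['Warning', 'Legends', 'Name1', 'Name2', 'Pickle', 'Stop',
             'Legends', 'Name3', 'Pickle', 'Hello']
print(all_between(your_list, 'Legends', 'Pickle'))  # [['Name1', 'Name2'], ['Name3']]
```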

Upvotes: 1
