Reputation: 35
I'm writing code that pulls data from a website and prints all the text between certain tags. Each time the code pulls text from a tag I store it in a list, so I end up with a list that looks something like this:
Warning
Not
News
Legends
Name1
Name2
Name3
Pickle
Stop
Hello
I want to search this list of strings for the keywords Legends and Pickle and print whatever strings are between them.
As a follow-up, I may create a list of all possible legend names and then, whenever any of them occur in the list I generate, print the ones that occur. Any insight into either of these questions?
Upvotes: 0
Views: 586
Reputation: 449
NumPy's where function gives you the indices of all occurrences of a given item. So first convert the list to a NumPy array:
import numpy as np
my_array = np.array(["Warning", "Not", "News", "Legends", "Name1", "Name2", "Name3", "Pickle", "Stop", "Hello", "Legends", "Name1", "Name2", "Name3", "Pickle"])
From here on you can use NumPy functions:
legends = np.where(my_array == "Legends")
pickle = np.where(my_array == "Pickle")
Concatenate the two index arrays for easier looping (row 0 holds the "Legends" positions, row 1 the "Pickle" positions; this requires the same number of each):
stack = np.concatenate([legends, pickle], axis=0)
Then collect the values between each "Legends" and its matching "Pickle":
np.concatenate([my_array[stack[0, i] + 1:stack[1, i]] for i in range(stack.shape[1])])
The result in my case is:
array(['Name1', 'Name2', 'Name3', 'Name1', 'Name2', 'Name3'], dtype='<U7')
Upvotes: 1
Reputation: 9047
You can use list.index() to get the index of the first occurrence of 'Legends' and 'Pickle'. Then you can use list slicing to get the elements in between:
l = ['Warning','Not','News','Legends','Name1','Name2','Name3','Pickle','Stop','Hello']
l[l.index('Legends')+1 : l.index('Pickle')]
['Name1', 'Name2', 'Name3']
Upvotes: 1
Reputation: 521239
For the second part of the question (matching against a list of known names), you could build a regex alternation of the expected names, then use a list comprehension to generate a list of matches:
import re
tags = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']
names = ['Name1', 'Name2', 'Name3']
# Escape each name so regex metacharacters in a name are matched literally
regex = r'^(?:' + r'|'.join(re.escape(n) for n in names) + r')$'
matches = [x for x in tags if re.search(regex, x)]
print(matches) # ['Name1', 'Name2', 'Name3']
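Because the pattern is anchored at both ends, it only accepts a whole string that equals one of the names, so the same filter can also be sketched without regex using a set lookup. This is an alternative technique, not the regex approach itself; known_names is a hypothetical name for your list of possible legend names, and 'Name4' is made up to show that names absent from the scrape are simply ignored.

```python
tags = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']

# Hypothetical collection of every possible legend name; a set gives fast lookups
known_names = {'Name1', 'Name2', 'Name3', 'Name4'}

# Keep the scraped strings that are known names, preserving their order in tags
matches = [x for x in tags if x in known_names]
print(matches)  # ['Name1', 'Name2', 'Name3']
```

The regex version is still the better fit if you later need patterns rather than exact names.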
Upvotes: 2
Reputation: 1228
Try this:
words = [
"Warning", "Not", "News", "Legends", "Name1",
"Name2", "Name3", "Pickle", "Stop", "Hello"
]
words_in_between = words[words.index("Legends") + 1:words.index("Pickle")]
print(words_in_between)
output:
['Name1', 'Name2', 'Name3']
This assumes that both "Legends" and "Pickle" are in the list exactly once, with "Legends" first.
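If either keyword might be missing, list.index() raises ValueError. Here is a minimal sketch of a guarded version, assuming you want an empty list back in that case. The helper name between is hypothetical, and it looks for the end keyword only after the start keyword by passing a start position to index().

```python
def between(items, start, end):
    """Return the items strictly between the first `start` and the next `end`."""
    try:
        i = items.index(start)
        j = items.index(end, i + 1)  # search for `end` only after `start`
    except ValueError:
        # One of the keywords is missing, or `end` never follows `start`
        return []
    return items[i + 1:j]


words = [
    "Warning", "Not", "News", "Legends", "Name1",
    "Name2", "Name3", "Pickle", "Stop", "Hello"
]
print(between(words, "Legends", "Pickle"))   # ['Name1', 'Name2', 'Name3']
print(between(words, "Legends", "Missing"))  # []
```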
Upvotes: 1
Reputation: 2349
You can use the list.index() method to find the numerical index of an item within a list, and then use list slicing to return the items in your list between those two points:
your_list = ['Warning','Not','News','Legends','Name1','Name2','Name3','Pickle','Stop','Hello']
your_list[your_list.index('Legends')+1:your_list.index('Pickle')]
The caveat is that .index() returns only the index of the first occurrence of the given item, so if your list has two 'Legends' items, you'll only get the index of the first one.
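If the markers can repeat, one way around that caveat is to walk the list once and collect every block. This is a minimal sketch, assuming each 'Legends' is followed by its own 'Pickle' and that blocks do not nest; blocks_between is a hypothetical helper name and the sample list is made up for illustration.

```python
def blocks_between(items, start, end):
    """Collect one list per start...end block, in order of appearance."""
    blocks = []
    current = None  # None means we are currently outside a block
    for item in items:
        if item == start:
            current = []                # open a new block
        elif item == end and current is not None:
            blocks.append(current)      # close the block and store it
            current = None
        elif current is not None:
            current.append(item)        # inside a block: keep the item
    return blocks


your_list = ['Warning', 'Legends', 'Name1', 'Name2', 'Pickle', 'Stop',
             'Legends', 'Name3', 'Pickle', 'Hello']
print(blocks_between(your_list, 'Legends', 'Pickle'))
# [['Name1', 'Name2'], ['Name3']]
```

A block that is opened but never closed is dropped in this sketch; whether that is the right behaviour depends on your data.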
Upvotes: 1