Reputation: 29
So I was recently working on this function:
# counts owls
def owl_count(text):
    # sets all text to lowercase
    text = text.lower()
    # sets text to list
    text = text.split()
    # saves indices of owl in list
    indices = [i for i, x in enumerate(text) if x == "owl"]
    # counts occurrences of owl in text
    owl_count = len(indices)
    # returns owl count and indices
    return owl_count, indices
My goal was to count how many times "owl" occurs in the string and save the indices of it. The issue I kept running into was that it would not count "owls" or "owl." I tried splitting it into a list of individual characters but I couldn't find a way to search for three consecutive elements in the list. Do you guys have any ideas on what I could do here?
PS. I'm definitely a beginner programmer, so this probably has a simple solution.
Thanks!
Upvotes: 2
Views: 134
Reputation: 31
Python has built-in functions for this. This kind of string matching falls under something called regular expressions, which you can study in more detail later.
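import re

a_string = "your string"
# note: substring is interpreted as a regular expression pattern
substring = "substring that you want to check"
# finditer() yields one match object per non-overlapping match
matches = re.finditer(substring, a_string)
matches_positions = [match.start() for match in matches]
print(matches_positions)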
finditer() returns an iterator of match objects, and start() returns the starting index of each match.
Simply put, it returns the indices of all occurrences of the substring in the string.
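Applied to the owl question, this could look roughly like the sketch below. The function name count_owl_substrings and the sample sentence are made up for illustration, and note that the positions are character offsets into the string, not word indices.

```python
import re

def count_owl_substrings(text):
    # Hypothetical helper: find every occurrence of "owl", ignoring case.
    matches = re.finditer(r"owl", text, flags=re.IGNORECASE)
    # start() gives the character offset where each match begins.
    positions = [match.start() for match in matches]
    return len(positions), positions

count, positions = count_owl_substrings("An owl. Two Owls! The owl's nest.")
print(count, positions)  # prints 3 [3, 12, 22]
```

Since this is a plain substring search, it also matches "owl" inside other words such as "bowl". If that matters, a pattern like r"\bowl" only matches where a word starts.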
Upvotes: 1
Reputation: 10020
If you don't want to use huge libraries like NLTK, you can filter for words that start with 'owl' instead of words that are equal to 'owl':
indices = [i for i, x in enumerate(text) if x.startswith("owl")]
In this case, words like 'owlowlowl' will pass too; to tokenize words properly, as real-world text requires, one should use NLTK.
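Put together with the original function, a minimal sketch might look like this. The name owl_word_indices is hypothetical, and stripping punctuation with string.punctuation is an extra step I'm assuming you want, so that tokens like "(owl)" also count.

```python
import string

def owl_word_indices(text):
    # Lowercase and split on whitespace, as in the original function.
    words = text.lower().split()
    # Assumption: drop punctuation at both ends of each word, so "(owl)" becomes "owl".
    cleaned = [word.strip(string.punctuation) for word in words]
    # Keep the word index of every token that starts with "owl".
    indices = [i for i, word in enumerate(cleaned) if word.startswith("owl")]
    return len(indices), indices

print(owl_word_indices("An owl. Two Owls! The owl's nest."))  # prints (3, [1, 3, 5])
```

The indices here are word positions in the split list, which is what the original function returned. Words such as "owlish" still pass the startswith check; comparing against a fixed set like {"owl", "owls"} would be stricter.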
Upvotes: 1