Chase Leibowitz

Reputation: 29

How to count occurrences of a word in a string in a way that still works with periods and word endings

I was recently working on this function:

# counts owls
def owl_count(text):
    # sets all text to lowercase
    text = text.lower()
    
    # sets text to list
    text = text.split()
    
    # saves indices of owl in list
    indices = [i for i, x in enumerate(text) if x == "owl"]
    
    # counts occurrences of owl in text
    owl_count = len(indices)
    
    # returns owl count and indices
    return owl_count, indices

My goal was to count how many times "owl" occurs in the string and save the indices of it. The issue I kept running into was that it would not count "owls" or "owl." I tried splitting it into a list of individual characters but I couldn't find a way to search for three consecutive elements in the list. Do you guys have any ideas on what I could do here?

PS. I'm definitely a beginner programmer so this is probably a simple solution.

Thanks!

Upvotes: 2

Views: 134

Answers (2)

Harsh

Reputation: 31

Python has built-in functions for this. This kind of string matching falls under regular expressions (the re module), which you can study in more detail later.

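import re

a_string = "your string"
substring = "substring that you want to check"

# finditer() yields a match object for every non-overlapping match
matches = re.finditer(substring, a_string)

# start() gives the character index where each match begins
matches_positions = [match.start() for match in matches]

print(matches_positions)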

finditer() returns an iterator of match objects, and start() returns the starting index of each match.

Simply put, it returns the character indices of every occurrence of the substring in the string.
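Applied to the owl question, this could look roughly like the sketch below. The pattern is my own assumption about what should count: \b requires "owl" to start a word, \w* allows endings such as "owls", and re.IGNORECASE replaces the call to lower(). Note that the positions are character offsets, not word indices.

```python
import re

def owl_count(text):
    # \b is a word boundary, so "owl" must begin a word ("bowl" is skipped);
    # \w* also accepts endings such as "owls".
    pattern = re.compile(r"\bowl\w*", re.IGNORECASE)

    # Character offsets at which each match starts.
    positions = [match.start() for match in pattern.finditer(text)]

    return len(positions), positions

print(owl_count("An owl. Two Owls! No bowl here."))  # (2, [3, 12])
```

A trailing period is not a word character, so "owl." is matched as "owl" without any extra handling.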

Upvotes: 1

vurmux

Reputation: 10020

If you don't want to use huge libraries like NLTK, you can filter for words that start with 'owl' instead of words that are equal to 'owl':

indices = [i for i, x in enumerate(text) if x.startswith("owl")]

In this case words like 'owlowlowl' will match too; to tokenize words properly for real-world text, one should use NLTK.
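If you want to stay without NLTK but avoid matching 'owlowlowl', one possible middle ground is to strip punctuation from each word and compare against an explicit set of accepted forms. This is only a sketch: the accepted forms ("owl", "owls") are an assumption about what should count, so adjust them as needed.

```python
import string

def owl_count(text):
    # Lowercase, split on whitespace, then strip leading and trailing
    # punctuation from every word, so "owl." and "Owls!" become "owl" and "owls".
    words = [word.strip(string.punctuation) for word in text.lower().split()]

    # Accepted forms are an assumption; extend the tuple for other endings.
    indices = [i for i, word in enumerate(words) if word in ("owl", "owls")]

    return len(indices), indices
```

For example, owl_count("An owl. Two Owls! owlowlowl and a bowl") returns (2, [1, 3]), where the indices are word positions in the split list, as in the original function.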

Upvotes: 1
