Zeke John

Reputation: 127

Find all the variations (or tenses) of a word in Python

I would like to know how to find all the variations of a word, or the words that are related or very similar to the original word, in Python.

An example of the sort of thing I am looking for is like this:

word = "summary" # any word

word_variations = find_variations_of_word(word) # a function that finds all the variations of a word; this is what I want to know how to write

print(word_variations)

# What it should print out: ["summaries", "summarize", "summarizing", "summarized"]

This is just an example of what the code should do. I have seen other similar questions on this topic, but none of the answers were accurate enough. I found some code and adapted it, which kind of works, but not the way I would like it to.

import nltk
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

def find_inflections(word):
    inflections = []
    for synset in wordnet.synsets(word):  # Find all synsets for the word
        for lemma in synset.lemmas():  # Find all lemmas for each synset
            inflected_form = lemma.name().replace("_", " ")  # Get the inflected form of the lemma
            if inflected_form != word:  # Only add the inflected form if it's different from the original word
                inflections.append(inflected_form)
    return inflections

word = "summary"
inflections = find_inflections(word)
print(inflections)
# Actual output: ['sum-up', 'drumhead', 'compendious', 'compact', 'succinct']
# Desired output: ["summaries", "summarize", "summarizing", "summarized"]

Upvotes: 2

Views: 332

Answers (1)

Paul Williams

Reputation: 51

This probably isn't of any use to you, but it may help someone else who finds this with a search:

If the aim is just to find the words, rather than specifically to use a machine-learning approach to the problem, you could try using a regular expression (regex).

W3Schools seems to cover enough to get the result you want, and there is a more technical overview on python.org.

To search case-insensitively for the specific words you listed, the following would work:

import re

string = ("A SUMMARY ON SUMMATION: "
          "We use summaries to summarize. This action is summarizing. "
          "Once the action is complete things have been summarized.")

# Match every word that starts with "summ"; \b keeps the match at the start of a word
occurrences = re.findall(r"\bsumm[a-z]*", string, re.IGNORECASE)

print(occurrences)

However, depending on your precise needs, you may need to modify the regular expression, as this would also find words like 'summer' and 'summon'.
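As a rough sketch of one such modification, lengthening the prefix from "summ" to "summar" should rule out 'summer' and 'summon' for this particular word. The helper name find_word_family and the sample sentence about summer are made up for illustration, and the right prefix has to be picked by hand for each word, so treat this as a starting point rather than a general solution:

```python
import re


def find_word_family(text, prefix):
    """Return every word in `text` that starts with `prefix`, ignoring case.

    Hypothetical helper for illustration only; it matches on spelling and
    knows nothing about grammar or meaning.
    """
    # \b anchors the match at the start of a word, and re.escape keeps any
    # regex metacharacters in the prefix from being interpreted.
    pattern = r"\b" + re.escape(prefix) + r"[a-z]*"
    return re.findall(pattern, text, re.IGNORECASE)


text = (
    "A SUMMARY ON SUMMATION: "
    "We use summaries to summarize. This action is summarizing. "
    "Once the action is complete things have been summarized. "
    "Last summer we tried to summon help."
)

broad = find_word_family(text, "summ")     # also picks up summer, summon, SUMMATION
narrow = find_word_family(text, "summar")  # only the summary/summarize family

print(broad)
print(narrow)
```

The longer prefix trades recall for precision: it would miss an irregular form that does not share the prefix, which is why a spelling-based search like this only approximates "all variations of a word".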

I'm not very good at regex, but regular expressions can be a powerful tool if you know precisely what you are looking for and spend a little time crafting the right expression.

Sorry this probably isn't relevant to your circumstance but good luck.

Upvotes: 2
