Python_Learner

Reputation: 1637

How to remove characters from the end of a string that match the end of another string

I have thousands of strings (not in English) that are in this format:

['MyWordMyWordSuffix', 'SameVocabularyItemMyWordSuffix']

I want to return the following:

['MyWordMyWordSuffix', 'SameVocabularyItem']

Because strings are immutable and I want to start the matching from the end, I keep confusing myself about how to approach it.

My best guess is some kind of loop that starts from the end of the strings and keeps checking for a match.

However, since I have so many of these to process, it seems like there should be a built-in way that is faster than looping through all the characters. As I'm still learning Python, I don't know of one (yet).

The nearest example I could find already on SO can be found here but it isn't really what I'm looking for.

Thank you for helping me!

Upvotes: 1

Views: 37

Answers (1)

glhr

Reputation: 4537

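You can use commonprefix from os.path to find the suffix the strings share. commonprefix compares strings character by character from the start, so reversing each string first turns the shared suffix into a shared prefix: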

from os.path import commonprefix

def getCommonSuffix(words):
    # get common suffix by reversing both words and finding the common prefix
    prefix = commonprefix([word[::-1] for word in words])
    return prefix[::-1]

which you can then use to slice out the suffix from the second string of the list:

word_list = ['MyWordMyWordSuffix', 'SameVocabularyItemMyWordSuffix']

suffix = getCommonSuffix(word_list)
if suffix:
    print("Found common suffix:", suffix)

    # filter out suffix from second word in the list
    word_list[1] = word_list[1][0:-len(suffix)]
    print("Filtered word list:", word_list)
else:
    print("No common suffix found")

Output:

Found common suffix: MyWordSuffix
Filtered word list: ['MyWordMyWordSuffix', 'SameVocabularyItem']
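
For comparison, the loop-from-the-end idea described in the question can be sketched without os.path. This is only an illustration: the name common_suffix is hypothetical and not part of the answer's code, and it handles exactly two strings.

```python
def common_suffix(a, b):
    """Return the longest string that both a and b end with."""
    n = 0
    # Walk both strings from the end, stopping at the first mismatch.
    for ch_a, ch_b in zip(reversed(a), reversed(b)):
        if ch_a != ch_b:
            break
        n += 1
    # Slice by explicit index: a[-n:] would return the whole string when n is 0.
    return a[len(a) - n:]

print(common_suffix('MyWordMyWordSuffix', 'SameVocabularyItemMyWordSuffix'))
# MyWordSuffix
```

zip stops at the shorter string, so the loop never runs past the start of either one.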

Demo: https://repl.it/@glhr/55705902-common-suffix
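
Since the question mentions thousands of strings, here is a minimal sketch of applying the same idea to many pairs at once. It assumes the data is a list of two-item lists shaped like the example in the question; the helper name strip_shared_suffix and the sample pairs list are made up for illustration.

```python
from os.path import commonprefix

def strip_shared_suffix(pair):
    """Return [first, second] with the shared suffix removed from second."""
    first, second = pair
    # Reverse both strings so the shared suffix becomes a shared prefix.
    suffix = commonprefix([first[::-1], second[::-1]])[::-1]
    if suffix:
        # Only slice when a suffix exists: second[:-0] would be empty.
        second = second[:-len(suffix)]
    return [first, second]

# Hypothetical input: a list of pairs in the question's format.
pairs = [
    ['MyWordMyWordSuffix', 'SameVocabularyItemMyWordSuffix'],
    ['abc', 'xyz'],
]
cleaned = [strip_shared_suffix(p) for p in pairs]
print(cleaned)
# [['MyWordMyWordSuffix', 'SameVocabularyItem'], ['abc', 'xyz']]
```

This returns new lists rather than modifying the input in place, which keeps the original data available if you need to check the results.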

Upvotes: 1
