Rubel Ahmed
Rubel Ahmed

Reputation: 81

How to find the longest common suffix prefix between two strings in python in a pythonic way possibly using library functions?

Let's say I have two strings, s1 = "1234" and s2 ="34567", so longest common suffix prefix between s1 and s2 is "34". I want to know if there exists any pythonic way to get this matching part ("34") real quick.

I can do it in a naive way like this one below, but I would love to know if there is an interesting library function or algorithm to get this done.

s1 = "1234"
s2 = "34567"
length1 = len(s1) 
length2 = len(s2)

length = (length1 if length1<= length2 else length2)

for i in reversed(range(0, length)):
    if s1[-i - 1:] == s2[:i + 1]:
        print(s1[-i - 1:])
        break
    elif i > 0:
        continue
    else:
        print("no common suffix prefix")

Output:

34

I want something compact and smart!

Upvotes: 3

Views: 5401

Answers (3)

RootTwo
RootTwo

Reputation: 4418

Here are a couple alternative implementations:

You know that the suffix of s1 must start with s2[0]. So use s1.find(s[0]) to find candidate starting points. Also, s2.startswith() can be used instead of iterating over s2. I don't know if it is faster, but the intent is clear.

def suffix_prefix_1(s1, s2):
    i = s1.find(s2[0])
    while i >= 0:
        if s2.startswith(s1[i:]):
            return s1[i:]

        i = s1.find(s2[0], i+1)

    return ''

If you're using Python 3.8, the walrus operator lets you write it like this:

def suffix_prefix_1A(s1, s2):
    while (i := s1.find(s2[0])) >= 0:
        if s2.startswith(s1[i:]):
            return s1[i:]

    return ''

The same thing can be done using s1.endswith():

def suffix_prefix_2(s1, s2):
    e= len(s2)
    while e > 0:
        if s1.endswith(s2[:e]):
            return s2[:e]
        e = s2.rfind(s1[-1], 0, e-1) + 1

    return ''

And just for fun, let's use a regex:

import re

def suffix_prefix_3(s1, s2):
    match = re.search(f"^{'?'.join(s1)}", s2)
    return match[0] if match else ''

Upvotes: 1

Mad Physicist
Mad Physicist

Reputation: 114310

The logic in your algorithm is about as straightforward as you can get, but you can definitely compactify the notation. For example, checking a prefix of size n against a suffix of size n is simply:

s1[-n:] == s2[:n]

The ternary operator you use to check string lengths is

min(len(s1), len(s2))

A range can go backwards by itself. The reverse of range(x) is

range(x - 1, -1, -1)

You can create an iterator that checks this for each decreasing value of n and return the first non-zero result. Luckily, next accepts a second argument that represents the default if the iterator is empty:

common = next((s2[:n] for n in range(min(len(s1), len(s2)) - 1, -1, -1) if s1[-n:] == s2[:n]), '')

That's the obligatory one-liner. A more legible solution might be:

def common_fix(s1, s2):
    steps = range(min(len(s1), len(s2)) - 1, -1, -1)
    return next((s2[:n] for n in steps if s1[-n:] == s2[:n]), '')

As a rule, keep your functionally and printing separate. Get a value, then process it (whether by printing or something else)

Upvotes: 4

Chris
Chris

Reputation: 363

This works:

s1="1234"
s2="34567"
for i in range(len(s1)):
    if s1[i] == s2[0]:
        if s1[i::] in s2[0:len(s1[i::])]:
            print(s1[i::])

The for loop finds the length of s1. It then iterates for that length. If s1[i] is then equal to the start of s2, it checks to see if s1[i::] is in s2. If this is then true, it prints out s1[i::]

Upvotes: 0

Related Questions