wayfare
wayfare

Reputation: 1850

Merge two strings to shorten them

I am working on a problem in which I need to merge two string such that one string can be inside other one. The merged string should be of shortest length.

Example:

str1 = "AABAK"
str2 = "HYUAABA"
merged string = "HYUAABAK"

So far I was able to make it work for string which are ending in similar substrings but if they are other way round, my solution isn't working.

Failing for:

str1 = 'ctaagt'
str2 = 'gcta'
expected answer: gctaagt

Solution:

def overlap(str1, str2):
    l = min(len(str1), len(str2))
    for i in range(l, -1, -1):
        if str1.endswith(str2[-i:]):
            print('overlap ' + str2[-i:])
            return str2[-i:]


str1 = "AABAK"
str2 = "HYUAABA"

for i in range(len(str1), -1, -1):
    res = overlap(str1[0:i], str2)
    if(res):
        print('merge ' + str2+str1[i:])
        break

I also want to know if there is any better or cleaner approach to solve this.

Note: str1 is on purpose shorter for my testing purpose.

Upvotes: 0

Views: 179

Answers (3)

Aurora Wang
Aurora Wang

Reputation: 1940

You can calculate your overlap and do your string merge in the same loops quite simply:

def merge(str1, str2):
    str2_len = len(str2)
    for i in range(str2_len):
        # edited to only match correctly
        if str1.startswith(str2[i:]):
            return str2 + str1[str2_len-i:]
    return str2 + str1

>>> str1 = "AABAK"

>>> str2 = "HYUAABA"

>>> merge(str1, str2)

'HYUAABAK'


>>> str1 = 'ctaagt'

>>> str2 = 'gcta'

>>> merge(str1, str2)

'gctaagt'

Upvotes: 0

slider
slider

Reputation: 12990

You can modify overlap to return the final overlapped string. Then, you can consider both possible arrangements of str1 and str2 and choose the final result with min for which key is the length of the string:

def overlap(str1, str2): # best possible overlap where str1 is 1st and str2 is 2nd
    for i in range(len(str1)):
        if str2.startswith(str1[i:]):
            return str1[:i] + str2
    return str1 + str2

str1 = 'ctaagt'
str2 = 'gcta'

result = min(overlap(str1, str2), overlap(str2, str1), key=len)
print(result) # gctaagt

Upvotes: 1

The Pineapple
The Pineapple

Reputation: 567

Using tobias_k's solution from a previous post as reference you can do something like the following. And then just compare the output from a,b and b,a to see which resulting string is shorter.

from functools import reduce

a = "AABAK"
b = "HYUAABA"


def overlap(a, b):
    return max(i for i in range(len(b)) if b[i - 1] == a[-1] and a.endswith(b[:i]))


reduce(lambda a, b: a + b[overlap(a, b):], [b, a])

Upvotes: 0

Related Questions