Longest Common Substring (find the substring)

I am looking for a way that my result is not the common number of the substring, what I need is the largest common substring. for example:

s1: abcee12345
s2: abcrd12345

The result I need is: value: 5, 12345 I need to make with memoization

My code is

    def memoization(s1, s2):
    mem = {}

    def getKey(l1, l2, count):
        key = str(l1) + "|" + str(l2) + "|" + str(count)
        return key

    def findLengthLCS(mem, s1, s2, l1, l2, count):
        key = getKey(l1, l2, count)
        if l1 == len(s1) or l2 == len(s2):
            return count
        if key not in mem:
            c1 = count
            if s1[l1] == s2[l2]:
                c1 = findLengthLCS(mem, s1, s2, l1+1, l2+1, count+1)
            c2 = findLengthLCS(mem, s1, s2, l1, l2+1, 0)
            c3 = findLengthLCS(mem, s1, s2, l1+1, l2, 0)
            mem[key] = max(c1, max(c2, c3))
        return mem[key]


       def getstring(s1, s2):
         resultado = ""
         i = len(s1)
         k = len(s2)
         while k > 0 and i >= 0:
            key_i = getKey(i, k, 0)
            key_i1 = getKey(i - 1, k, 0)
         assert key_i in mem
         assert key_i1 in mem

         if mem[key_i] != mem[keºi1]:
               resultado += s1[i]

         k = k - 1
         i = i - 1
         return resultado

     value = findLengthLCS(mem, s1, s2, 0, 0, 0)
     resultadofinal = getstring(s1, s2)
     return value, resultadofinal

Upvotes: 0

Views: 132

Answers (1)

huzy
huzy

Reputation: 38

try this

# Longest Common Substring – Memoization
mem = None

def LCStr(s1, s2, ancester):
 
    if s1 == "" or s2 == "":
        return ancester
    
    if mem[len(s1)][len(s2)] != 0 and ancester == "": 
        return mem[len(s1)][len(s2)]

    case1 = ""
    if s1[-1] == s2[-1]:
        ancester = s1[-1] + ancester
        case1 = LCStr(s1[:-1], s2[:-1], ancester)


    case2 = LCStr(s1, s2[:-1], "")
    case3 = LCStr(s1[:-1], s2, "")
    
    res = max_len(case1, case2, case3, ancester)
    mem[len(s1)][len(s2)] = res
    
    return res
    
def max_len(s1, s2, s3, s4):
    
    m = max(len(s1), len(s2), len(s3), len(s4))
    
    if len(s1) == m:
        return s1
    elif len(s2) == m:
        return s2
    elif len(s3) == m:
        return s3
    else:
        return s4


X = "dxidjfhuxxc"
Y = "uxxcfidipxc"

mem = [[0] * (len(Y)+1) for _ in range(len(X)+1)]
ans = LCStr(X, Y, "")

print(len(ans), ans)

output

4 uxxc

Upvotes: 1

Related Questions