Garvey

Reputation: 1309

Find overlapping spans between two strings

Given two strings s1 and s2, I want to extract every overlapping span that satisfies len(span) >= K.

For example:

s1 = "Today is Friday. Nice weather, isn't it?"
s2 = "It's Black Friday today. "
K = 1

The expected answer is:

spans = ["Friday"]   # matching is case-sensitive

Here is my implementation:

def norm(s):
    punctuation = [",", ".", "?", "!"]
    s = s.split()
    for i, x in enumerate(s):
        if any([x.endswith(p) for p in punctuation]):
            s[i] = x[:-1] + " " + x[-1]
    s = " ".join(s)
    s = s.split()
    return s


def func(s1,s2,K=1):
    punctuation = [",", ".", "?", "!"]
    s1 = norm(s1)
    s2 = norm(s2)
    spans = []
    for i in range(len(s1)):
        # j is the span length in tokens; stop at the end of s1 so slices are never truncated
        for j in range(K, len(s1) - i + 1):
            cur_span = " ".join(s1[i:i+j])
            if cur_span in " ".join(s2):
                spans.append(cur_span)
    spans = [x for x in spans if x not in punctuation]
    return spans


s1 = "Today is Friday. Nice weather, isn't it?"
s2 = "It's Black Friday today. "
func(s1, s2, 1)  # returns ['Friday']

I am looking for a better implementation of this function.

Upvotes: 0

Views: 64

Answers (1)

Sociopath

Reputation: 13401

You can use set.intersection together with str.split. Note that this compares individual words, so it only finds single-word overlaps:

# Strip leading/trailing punctuation from each word, then collect the words into a set
s1_set = {w.strip('.,?!') for w in s1.split()}
s2_set = {w.strip('.,?!') for w in s2.split()}

print(s1_set.intersection(s2_set))
# {'Friday'}
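
If you also need overlaps that span several consecutive words, a set of single words is not enough. Below is a minimal sketch of one way to do it, under a few assumptions that are mine rather than from the question: K is read as a minimum number of words, punctuation is dropped by a regex tokenizer, and only maximal runs are reported. The names tokenize and common_spans are hypothetical helpers, not library functions.

```python
import re


def tokenize(s):
    # Assumption: a token is a run of word characters or apostrophes,
    # so "Friday." becomes "Friday" and "isn't" stays intact.
    return re.findall(r"[\w']+", s)


def common_spans(s1, s2, k=1):
    """Return maximal runs of consecutive tokens shared by s1 and s2
    that are at least k tokens long (case-sensitive)."""
    a, b = tokenize(s1), tokenize(s2)
    spans = []
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                # Report the run only if it cannot be extended to the right
                extends = i < len(a) and j < len(b) and a[i] == b[j]
                if not extends and cur[j] >= k:
                    span = " ".join(a[i - cur[j]:i])
                    if span not in spans:
                        spans.append(span)
        prev = cur
    return spans


s1 = "Today is Friday. Nice weather, isn't it?"
s2 = "It's Black Friday today. "
print(common_spans(s1, s2, 1))
# ['Friday']
```

This is the usual longest-common-substring table applied to word tokens instead of characters, so it runs in O(len(a) * len(b)) time. Because whole tokens are compared, "is" will not match inside "this", which a plain substring test on the joined text would allow.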

Upvotes: 1
