Reputation: 853
My current implementation is this:
def find_longest_matching_option(option, options):
options = sorted(options, key=len)
longest_matching_option = None
for valid_option in options:
# Don't want to treat "oreo" as matching "o",
# match only if it's "o reo"
if re.match(ur"^{}\s+".format(valid_option), option.strip()):
longest_matching_option = valid_option
return longest_matching_option
Some examples of what I'm trying to do:
"foo bar baz something", ["foo", "foo bar", "foo bar baz"]
# -> "foo bar baz"
"foo bar bazsomething", (same as above)
# -> "foo bar"
"hello world", ["hello", "something_else"]
# -> "hello"
"a b", ["a", "a b"]
# -> "a b" # Doesn't work in current impl.
Mostly, I'm looking for efficiency here. The current implementation works, but I've been told it's O(m^2 * n)
, which is pretty bad.
Thanks in advance!
Upvotes: 1
Views: 1565
Reputation: 403278
Let's start with foo
.
def foo(x, y):
x, y = x.strip(), y.strip()
return x == y or x.startswith(y + " ")
foo
returns true either if two strings are equal, or one (plus a space) is a substring of the other.
Next, given a case string, and a list of options, you can use filter
to find all valid substrings for the given case string, and then apply max
to find the longest one (see tests below).
Here's a few test cases for foo
. For the purpose of demonstrating, I'll use partial
to curry foo
to a higher order function.
from functools import partial
cases = ["foo bar baz something", "foo bar bazsomething", "hello world", "a b", "a b"]
options = [
["foo", "foo bar", "foo bar baz"],
["foo", "foo bar", "foo bar baz"],
["hello", "something_else"],
["a", "a b"],
["a", "a b\t"]
]
p_list = [partial(foo, c) for c in cases]
for p, o in zip(p_list, options):
print(max(filter(p, o), key=len))
foo bar baz
foo bar
hello
a b
a b
Upvotes: 2
Reputation: 43356
Regex is overkill here; you can simply append a space to each string before comparing them to get the same result.
You also don't need to sort the data. It's more efficient to simply loop over every value.
def find_longest_matching_option(option, options):
# append a space so that find_longest_matching_option("a b", ["a b"])
# works as expected
option += ' '
longest = None
for valid_option in options:
# append a space to each option so that only complete
# words are matched
valid_option += ' '
if option.startswith(valid_option):
# remember the longest match
if longest is None or len(longest) < len(valid_option):
longest = valid_option
if longest is not None:
# remove the trailing space
longest = longest[:-1]
return longest
Upvotes: 1