Reputation: 59

If multiple substrings match string in specific order

I'm wondering how to detect if two substrings match a main string in a specific order. For example if we're looking for "hours" and then "minutes" anywhere at all in a string, and the string is "what is 5 hours in minutes", it would return true. If the string was "what is 5 minutes in hours", it would return false.

Upvotes: 2

Answers (5)

Padraic Cunningham

Reputation: 180391

s = "what is 5 hours in minutes"
a, b = s.find("hours"), s.find("minutes")
print(-1 < a < b)

You could also avoid checking for b if a does not exist in the string:

def inds(s, s1, s2):
    a = s.find(s1)
    return -1 < a < s.find(s2)

If you want to start at a + 1 it is trivial to change:

def inds(s, s1, s2):
    a = s.find(s1)
    return -1 < a < s.find(s2, a+1)

But if you only need to make sure that a comes before b, stick with the first solutions. You also did not say whether substrings inside longer words should count as matches, i.e.:

a = "foo"
b = "bar"

Would match:

"foobar"

But they are not actual words in the string. If you want to match actual words then you will either need to split and clean the text or use word boundaries with a regex.
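For the split-and-clean route, a minimal sketch might look like the following. The helper name words_in_order, the lowercasing, and the choice to strip string.punctuation from each token are assumptions made for illustration, not something the question specifies.

```python
import string


def words_in_order(s, *words):
    # Split on whitespace, strip surrounding punctuation, and lowercase each token.
    tokens = (t.strip(string.punctuation).lower() for t in s.split())
    # `w in tokens` consumes the generator up to the first match, so each
    # later word is only searched for after the previous one.
    return all(w.lower() in tokens for w in words)
```

Because the tokens are compared whole, "foo" and "bar" are not found inside "foobar".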

If you want to match exact words and not partial matches then use a regex using word boundaries:

import re


def consec(s, *args):
    if not args:
        raise ValueError("args cannot be empty")
    pos = 0
    for w in args:
        # A compiled pattern's search() takes a start position;
        # the third argument of re.search() is flags, not a position.
        m = re.compile(r"\b{}\b".format(re.escape(w))).search(s, pos)
        if not m:
            return False
        # Resume after the end of this match so the order is enforced.
        pos = m.end()
    return True

Which won't match "foo" and "bar" in foobar:

In [9]: consec("foobar","foo","bar")
Out[9]: False

In [10]: consec("foobar bar for bar","foo","bar")
Out[10]: False

In [11]: consec("foobar bar foo bar","foo","bar")
Out[11]: True

In [12]: consec("foobar","foo","bar")
Out[12]: False

In [13]: consec("foobar bar foo bar","foo","bar")
Out[13]: True

In [14]: consec("","foo","bar")
Out[14]: False

In [15]: consec("foobar bar foo bar","foobar","foo","bar")
Out[15]: True

Upvotes: 2

Garrett R

Reputation: 2662

A regex will work well here. The regex r"hours.*minutes" says: look for hours, followed by 0 or more of any character, followed by minutes. Also, make sure to use the search function in the regex library rather than match, as match only checks from the beginning of the string.

import re
true_state = "what is 5 hours in minutes"
false_state = "what is 5 minutes in hours"
pat = re.compile(r"hours.*minutes")
statements = [true_state, false_state]
for state in statements:
    ans = re.search(pat, state)
    # Only the string with "hours" before "minutes" produces a match
    if ans:
        print(state)
        print(ans.group())

Output

what is 5 hours in minutes
hours in minutes
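
If the words come from variables rather than being hard-coded, the same pattern can be built programmatically. This is only a sketch under a few assumptions: the helper name in_order_pattern is made up here, re.escape is used so that regex metacharacters in the words are matched literally, and re.DOTALL is added so that .* can cross newlines (by default . does not match a newline).

```python
import re


def in_order_pattern(*words):
    # Escape each word so characters like "." or "+" are matched literally,
    # then join with ".*" to allow anything in between.
    return re.compile(".*".join(re.escape(w) for w in words), re.DOTALL)


pat = in_order_pattern("hours", "minutes")
print(bool(pat.search("what is 5 hours in minutes")))   # True
print(bool(pat.search("what is 5 minutes in hours")))   # False
print(bool(pat.search("5 hours\nand 30 minutes")))      # True, thanks to re.DOTALL
```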

Upvotes: 0

Alyssa Haroldsen

Reputation: 3731

This will work with any sequence of words and any string:

def containsInOrder(s, *words):
    last = -1
    for word in words:
        last = s.find(word, last + 1)
        if last == -1:
            return False
    return True

Used like so:

>>> s = 'what is 5 hours in minutes'
>>> containsInOrder(s, 'hours', 'minutes')
True
>>> containsInOrder(s, 'minutes', 'hours')
False
>>> containsInOrder(s, '5', 'hours', 'minutes')
True
>>> containsInOrder('minutes hours minutes', 'hours', 'minutes')
True
>>> containsInOrder('minutes hours minutes', 'minutes', 'hours')
True
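
Note that searching from last + 1 lets consecutive matches overlap, for example 'ab' and 'bc' in 'abc'. If that is not wanted, one possible variation (the name contains_in_order_no_overlap is hypothetical) resumes the search after the end of the previous match:

```python
def contains_in_order_no_overlap(s, *words):
    pos = 0
    for word in words:
        found = s.find(word, pos)
        if found == -1:
            return False
        # Continue after the end of this match so the next word cannot overlap it.
        pos = found + len(word)
    return True
```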

Upvotes: 1

Jonathan

Reputation: 86

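s = "what is 5 hours in minutes"
a, b = "hours", "minutes"
# str.index raises ValueError if either substring is missing from s
if s.index(a) < s.index(b):
    print(True)
else:
    print(False)

Use the index method to determine which substring comes first. The if statement then decides what to do once you know which one comes first. Do you understand what I'm trying to say?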

Upvotes: 0

John Gordon

Reputation: 33285

You could use a regular expression such as "hours.*minutes", or you could use a simple string search that looks for "hours", notes the location where it is found, then does another search for "minutes" starting at that location.
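
The string-search variant could be sketched with str.partition, which splits the string at the first occurrence of the first word so that only the remainder is searched for the second. The function name comes_before is an assumption for illustration.

```python
def comes_before(s, first, second):
    # partition returns (before, separator, after); separator is empty
    # when `first` does not occur in `s`.
    _, sep, rest = s.partition(first)
    return bool(sep) and second in rest
```

For example, comes_before("what is 5 hours in minutes", "hours", "minutes") is true, while swapping the last two arguments gives false.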

Upvotes: 0
