Reputation: 3018
I have a string like the one below:
Hello there how are you?
I want to look for the substring 'there how' in the string, so I would do something like this:
import re
string = "Hello there how are you?"
term = "there how"
print(re.search(r"\s" + term + r"\s", string).group(0))  # \s is used to ensure the match is an independent phrase
The problem is that the match fails on variations of the string, for example:
If there is a large amount of space between the words:
Hello there     how are you?
If certain letters are capitalized:
Hello There How are you?
What I want is a match whenever the substring 'there how' is present in the string as a separate phrase (not as in Hellothere how are you? or Hello there howare you?, etc.).
How can I achieve this?
Upvotes: 2
Views: 151
Reputation: 626804
You may replace the spaces in the term with \s+ and enable case-insensitive matching by passing the re.I flag:
import re
ss = ["Hello there how are you?", "Hello there     how are you?", "Hello There How are you?"]
term = "there how"
# (?<!\S) and (?!\S) require whitespace or a string boundary on each side of the term
rx = re.compile(r"(?<!\S){}(?!\S)".format(term.replace(r" ", r"\s+")), re.I)
for s in ss:
    m = rx.search(s)
    if m:
        print(m.group())
Output:
there how
there     how
There How
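The pattern above uses the lookarounds (?<!\S) and (?!\S) instead of the question's \s delimiters. Here is a minimal sketch of why that matters, assuming the phrase may also sit at the very start or end of the string. The names consuming and lookaround and the sample strings are made up for illustration:

```python
import re

term = r"there\s+how"
# Needs an actual whitespace character on both sides, and includes it in the match
consuming = re.compile(r"\s" + term + r"\s", re.I)
# Only forbids a non-whitespace neighbour, so string boundaries are accepted too
lookaround = re.compile(r"(?<!\S)" + term + r"(?!\S)", re.I)

samples = ["there how are you?", "Hello there how", "Hello there how are you?"]
for s in samples:
    a = consuming.search(s)
    b = lookaround.search(s)
    print(repr(a.group() if a else None), repr(b.group() if b else None))
```

The \s version finds nothing in the first two samples and returns the surrounding spaces in the third, while the lookaround version returns just the phrase in all three.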
NOTE: If the term can contain special regex metacharacters, you need to re.escape the term, but do it before replacing spaces with \s+. Since spaces are escaped by re.escape, you need to use .replace(r'\ ', r'\s+'):
rx = re.compile(r"(?<!\S){}(?!\S)".format(re.escape(term).replace(r"\ ", r"\s+")), re.I)
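As a possible variant of the same idea, the escaping and the whitespace handling can be wrapped in one helper. This is only a sketch: phrase_pattern is a hypothetical name, and it splits the term on whitespace and escapes each word separately rather than replacing escaped spaces, so it does not depend on how re.escape treats a space.

```python
import re

def phrase_pattern(term, flags=re.I):
    # Hypothetical helper, not part of the answer above: escape each word,
    # then join the words with \s+ so any run of whitespace is accepted.
    words = term.split()
    body = r"\s+".join(re.escape(w) for w in words)
    # Whitespace boundaries keep the phrase from matching inside longer words
    return re.compile(r"(?<!\S)" + body + r"(?!\S)", flags)

rx = phrase_pattern("there how")
for s in ["Hello There   How are you?", "Hellothere how are you?", "Hello there howare you?"]:
    m = rx.search(s)
    print(m.group() if m else None)
```

Only the first string should produce a match. The two glued-together variants from the question are rejected by the lookarounds.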
JavaScript solution:
var ss = ["Hello there how are you?", "Hello there     how are you?", "Hello There How are you?"];
var term = "there how";
// The lookbehind (?<!\S) requires an ES2018-compatible regex engine
var rx = new RegExp("(?<!\\S)" + term.replace(/ /g, "\\s+") + "(?!\\S)", "i");
for (var i = 0; i < ss.length; i++) {
    var m = ss[i].match(rx) || "";
    console.log(m[0]);
}
Upvotes: 2