Reputation: 310
Objective: for a given term, I want to check whether that term exists at the start of a word. For example, if the term is 't', then in the sentence:
"This is the difficult one Thats it"
I want it to return "true" because of :
This, the, Thats
so consider:
public class HelloWorld{
public static void main(String []args){
String term = "t";
String regex = "/\\b"+term+"[^\\b]*?\\b/gi";
String str = "This is the difficult one Thats it";
System.out.println(str.matches(regex));
}
}
I am getting the following exception:
Exception in thread "main" java.util.regex.PatternSyntaxException:
Illegal/unsupported escape sequence near index 7
/\bt[^\b]*?\b/gi
^
at java.util.regex.Pattern.error(Pattern.java:1924)
at java.util.regex.Pattern.escape(Pattern.java:2416)
at java.util.regex.Pattern.range(Pattern.java:2577)
at java.util.regex.Pattern.clazz(Pattern.java:2507)
at java.util.regex.Pattern.sequence(Pattern.java:2030)
at java.util.regex.Pattern.expr(Pattern.java:1964)
at java.util.regex.Pattern.compile(Pattern.java:1665)
at java.util.regex.Pattern.<init>(Pattern.java:1337)
at java.util.regex.Pattern.compile(Pattern.java:1022)
at java.util.regex.Pattern.matches(Pattern.java:1128)
at java.lang.String.matches(String.java:2063)
at HelloWorld.main(HelloWorld.java:8)
Also the following does not work:
import java.util.regex.*;
public class HelloWorld{
public static void main(String []args){
String term = "t";
String regex = "\\b"+term+"gi";
//String regex = ".";
System.out.println(regex);
String str = "This is the difficult one Thats it";
System.out.println(str.matches(regex));
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
System.out.println(m.find());
}
}
Example: { This , one, Two, Those, Thanks } for words This Two Those Thanks; result should be true.
Thanks
Upvotes: 2
Views: 6997
Reputation: 124215
In Java we don't surround a regex with /, so instead of "/regex/flags" we just write regex. If you want to add flags, you can do it with the (?flags) syntax, placed in the regex at the position from which the flag should apply. For instance, a(?i)a will be able to find aa and aA but not Aa, because the flag was added after the first a.
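A minimal sketch of the inline-flag behaviour described above. The class name InlineFlagDemo and the helper found are made up for illustration; only Pattern comes from java.util.regex.

```java
import java.util.regex.Pattern;

public class InlineFlagDemo {
    // Hypothetical helper: true if the regex occurs anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // (?i) only affects the part of the regex that follows it,
        // so the first 'a' stays case-sensitive.
        String regex = "a(?i)a";
        System.out.println(found(regex, "aa")); // true
        System.out.println(found(regex, "aA")); // true
        System.out.println(found(regex, "Aa")); // false
    }
}
```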
You can also compile your regex into a Pattern like this:
Pattern pattern = Pattern.compile(regex, flags);
where regex is a String (again, not enclosed in /) and flags is an integer built from constants in Pattern, such as Pattern.DOTALL. When you need more than one flag you can combine them, for example Pattern.CASE_INSENSITIVE|Pattern.MULTILINE.
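As a rough illustration of combining flag constants, the sketch below searches a two-line string for a line that starts with "t". The class name CompileFlagsDemo, the helper found and the sample input are assumptions chosen for the example.

```java
import java.util.regex.Pattern;

public class CompileFlagsDemo {
    // Hypothetical helper: compile with the given flag bits and search the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "one\nTwo";
        // Flags are int bit masks, so several of them are combined with |
        int both = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
        // With MULTILINE, ^ also matches right after the line break.
        System.out.println(found("^t", both, input));                     // true
        // Without MULTILINE, ^ only matches at the very start of the input.
        System.out.println(found("^t", Pattern.CASE_INSENSITIVE, input)); // false
    }
}
```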
The next thing which may confuse you is the matches method. Most people are misled by its name, because they assume it checks whether the string contains an element which can be matched by the regex, but in reality it checks whether the entire string can be matched by the regex.
What you seem to want is a mechanism to test whether some regex can be found at least once in the string. In that case you have two options. You may add .* at the start and end of your regex, to let the regex engine match the other characters which are not part of the element you want to find, but this way matches must iterate over the entire string. Or you may use a Matcher object built from a Pattern (representing your regex) and its find() method, which will iterate until it finds a match for the regex or reaches the end of the string. I prefer the second approach because it does not need to iterate over the entire string, but stops as soon as a match is found. So your code could look like
String str = "This is the difficult one Thats it";
String term = "t";
Pattern pattern = Pattern.compile("\\b"+term, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(str);
System.out.println(matcher.find());
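To make the difference between matches and find concrete, here is a small sketch using the sentence from the question. The class name MatchesVsFind and its two helper methods are hypothetical names introduced for the example.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // String.matches succeeds only if the regex covers the whole input.
    static boolean wholeStringMatches(String regex, String input) {
        return input.matches(regex);
    }

    // Matcher.find succeeds if the regex matches some part of the input.
    static boolean foundSomewhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String str = "This is the difficult one Thats it";
        System.out.println(wholeStringMatches("(?i)\\bt", str));     // false
        System.out.println(wholeStringMatches("(?i).*\\bt.*", str)); // true
        System.out.println(foundSomewhere("(?i)\\bt", str));         // true
    }
}
```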
In case your term could contain regex special characters, but you want the regex engine to treat them as normal characters, you need to make sure that they are escaped. To do this you can use the Pattern.quote method, which adds all the necessary escapes for you, so instead of
Pattern pattern = Pattern.compile("\\b"+term, Pattern.CASE_INSENSITIVE);
for safety you should use
Pattern pattern = Pattern.compile("\\b"+Pattern.quote(term), Pattern.CASE_INSENSITIVE);
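The effect of Pattern.quote is easiest to see with a term that contains a metacharacter. The sketch below uses the made-up term "a+b"; the class name QuoteDemo, the helper startsSomeWord and the sample sentences are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: does some word in text start with term?
    // When quote is false the term is inserted into the regex as-is.
    static boolean startsSomeWord(String term, String text, boolean quote) {
        String body = quote ? Pattern.quote(term) : term;
        return Pattern.compile("\\b" + body, Pattern.CASE_INSENSITIVE)
                      .matcher(text)
                      .find();
    }

    public static void main(String[] args) {
        String term = "a+b";
        // Unquoted: + acts as a quantifier, so "aab" is accepted by mistake.
        System.out.println(startsSomeWord(term, "aab is here", false)); // true
        // Quoted: only the literal text a+b can match.
        System.out.println(startsSomeWord(term, "aab is here", true));  // false
        System.out.println(startsSomeWord(term, "see a+b now", true));  // true
    }
}
```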
Upvotes: 1
Reputation: 807
String str = "This is the difficult one Thats it";
String term = "t";
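// Match words that begin with the literal term; Pattern.quote guards against regex metacharacters
Pattern pattern = Pattern.compile("^"+Pattern.quote(term)+".*",Pattern.CASE_INSENSITIVE);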
String[] strings = str.split(" ");
for (String s : strings) {
if (pattern.matcher(s).matches()) {
System.out.println(s+"-->"+true);
} else {
System.out.println(s+"-->"+false);
}
}
Upvotes: 0
Reputation: 88707
Since you're using the Java regex engine, you need to write the expressions in a way Java understands. That means removing the leading and trailing slashes and adding flags as (?flags), for example (?i), at the beginning of the expression.
Thus you'd need this instead:
String regex = "(?i)\\b"+term+".*?\\b";
Have a look at regular-expressions.info/java.html for more information. A comparison of supported features can be found here (just as an entry point): regular-expressions.info/refbasic.html
Upvotes: 2
Reputation: 31290
String regex = "(?i)\\b"+term;
In Java, the modifiers must be inserted between "(?" and ")" and there is a variant for turning them off again: "(?-" and ")".
For finding all words beginning with "T" or "t", you may want to use Matcher's find method repeatedly. If you just need the offset, Matcher's start method returns the offset.
If you need to match the full word, use
String regex = "(?i)\\b"+term + "\\w*";
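Calling find repeatedly, as suggested above, might look like the following sketch. The class name FindAllWords and the method wordsStartingWith are hypothetical, and Pattern.quote is added here as an extra precaution that this answer itself does not mention.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllWords {
    // Hypothetical helper: collect "offset:word" for every word starting with term.
    static List<String> wordsStartingWith(String term, String text) {
        Pattern p = Pattern.compile("(?i)\\b" + Pattern.quote(term) + "\\w*");
        Matcher m = p.matcher(text);
        List<String> hits = new ArrayList<>();
        // Each call to find() resumes searching after the previous match.
        while (m.find()) {
            hits.add(m.start() + ":" + m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String str = "This is the difficult one Thats it";
        System.out.println(wordsStartingWith("t", str)); // [0:This, 8:the, 26:Thats]
    }
}
```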
Upvotes: 0