Reputation: 93968
If I wanted to make a 100% clone of String#contains(CharSequence s): boolean in Java regex using Pattern, would the following calls be identical?
input.contains(s);
and
Pattern.compile(".*" + Pattern.quote(s) + ".*").matcher(input).matches();
Similarly, would the following code have the same functionality?
Pattern.compile(Pattern.quote(s)).matcher(input).find();
I presume that the regex search is less performant, but only by a constant factor. Is this correct? Is there any way to optimize the regular expressions to mimic contains?
The reason I'm asking is that I have a piece of code that is written around Pattern, and it seems wasteful to create a separate piece of code that uses contains. On the other hand, I don't want different test results - even minor ones - between the two code paths. Are there any Unicode-related differences, for instance?
Upvotes: 1
Views: 434
Reputation: 93968
This is just to share how I decided to solve this little conundrum. I've redesigned my library to take a predicate instead of a Pattern, like this:
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// The class name is illustrative; it only wraps the methods so they compile.
public class FindDemo {
    public static Set<String> findAll() {
        return find(input -> true);
    }

    public static Set<String> findSubstring(String s) {
        return find(input -> input.contains(s));
    }

    public static Set<String> findPattern(Pattern p) {
        // asPredicate() uses find() semantics: a match anywhere in the input counts.
        return find(p.asPredicate());
    }

    public static Set<String> findCaseInsensitiveSubstring(String s) {
        return find(Pattern.compile(Pattern.quote(s), Pattern.CASE_INSENSITIVE).asPredicate());
    }

    private static Set<String> find(Predicate<String> matcher) {
        var testInput = Set.of("some", "text", "to", "test");
        return testInput.stream().filter(matcher).collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(findAll());
        System.out.println(findSubstring("t"));
        System.out.println(findPattern(Pattern.compile("^[^s]")));
        System.out.println(findCaseInsensitiveSubstring("T"));
    }
}
where I've used all the comments and answers given up to now.
Note that there is also Pattern#asMatchPredicate() in case matching is required instead, e.g. for a function matchPattern.
Of course, the above is just a demonstration, not the actual functions in my solution.
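To illustrate the difference between Pattern#asPredicate() and Pattern#asMatchPredicate() mentioned above, here is a minimal sketch. The class and helper names (PredicateKindsDemo, occursIn, isExactly) are made up for this example, and asMatchPredicate() assumes Java 11 or later.

```java
import java.util.regex.Pattern;

class PredicateKindsDemo {
    // Hypothetical helper: true if the literal s occurs anywhere in input.
    // asPredicate() has the same semantics as Matcher#find().
    static boolean occursIn(String input, String s) {
        return Pattern.compile(Pattern.quote(s)).asPredicate().test(input);
    }

    // Hypothetical helper: true only if the whole input is the literal s.
    // asMatchPredicate() (Java 11+) has the same semantics as Matcher#matches().
    static boolean isExactly(String input, String s) {
        return Pattern.compile(Pattern.quote(s)).asMatchPredicate().test(input);
    }

    public static void main(String[] args) {
        System.out.println(occursIn("text", "ex"));   // true: "ex" is inside "text"
        System.out.println(isExactly("text", "ex"));  // false: "text" is not "ex"
        System.out.println(isExactly("ex", "ex"));    // true
    }
}
```

So a contains-style check wants asPredicate(), while asMatchPredicate() only fits when the whole input has to match.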
Upvotes: 1
Reputation: 102
There are two ways to check whether a String contains a match for a Pattern:
return Pattern.compile(Pattern.quote(s)).asPredicate().test(input);
or
return Pattern.compile(Pattern.quote(s)).matcher(input).find();
There is no need to match on .* as well; it only matches whatever surrounds the actual result and adds overhead.
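Beyond the overhead, the .* wrapped variant can also give a different answer than contains when the input has line breaks, because . does not match line terminators unless Pattern.DOTALL is set. A minimal sketch of that difference, where the class name DotStarDemo and the helper wrappedMatches are made up for this example:

```java
import java.util.regex.Pattern;

class DotStarDemo {
    // Hypothetical helper: the ".*"-wrapped variant from the question.
    // flags is 0 for the defaults, or Pattern.DOTALL.
    static boolean wrappedMatches(String input, String s, int flags) {
        return Pattern.compile(".*" + Pattern.quote(s) + ".*", flags)
                .matcher(input)
                .matches();
    }

    public static void main(String[] args) {
        String input = "first line\nsecond line";

        System.out.println(input.contains("second"));                         // true
        // "." cannot cross the "\n", so the whole input cannot be matched.
        System.out.println(wrappedMatches(input, "second", 0));               // false
        // With DOTALL, "." also matches line terminators.
        System.out.println(wrappedMatches(input, "second", Pattern.DOTALL));  // true
    }
}
```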
Upvotes: 1
Reputation: 626758
If you need to write a .contains-like method based on Pattern, you should choose the Matcher#find() version:
Pattern.compile(Pattern.quote(s)).matcher(input).find()
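As a rough check of the find() version, the sketch below compares it with String#contains on a few inputs that tend to trip up regex-based code (metacharacters, a line break, an empty needle, a needle containing \E). The class and helper names are made up for this example, and agreeing on these cases does not prove the two are equivalent for every input.

```java
import java.util.regex.Pattern;

class ContainsViaFind {
    // Hypothetical helper mirroring input.contains(s) with a quoted literal and find().
    static boolean containsViaFind(String input, CharSequence s) {
        return Pattern.compile(Pattern.quote(s.toString())).matcher(input).find();
    }

    public static void main(String[] args) {
        // Each pair is {input, needle}.
        String[][] cases = {
            {"a.b", "."},               // metacharacter present literally
            {"axb", "."},               // "." must not act as a wildcard
            {"line1\nline2", "1\nl"},   // needle spanning a line break
            {"abc", ""},                // empty needle
            {"price: \\Q5\\E", "\\E"},  // needle containing the quote terminator \E
        };
        for (String[] c : cases) {
            boolean viaContains = c[0].contains(c[1]);
            boolean viaFind = containsViaFind(c[0], c[1]);
            // Print both results side by side; they are expected to agree.
            System.out.println(viaContains + " " + viaFind);
        }
    }
}
```

If the same needle is checked against many inputs, compiling the Pattern once and reusing it avoids recompiling on every call.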
If you want to use .matches(), you should bear in mind that:
First, .* will not match line breaks by default, so you need the (?s) inline modifier at the start of the pattern or the Pattern.DOTALL option.
Second, .* at the pattern start will cause too much backtracking and you may get a stack overflow exception, or the code execution might just freeze.
Upvotes: 3