Reputation: 866
I would like to enforce that a string contains at least 4 different characters.
Valid examples:
"1q2w3e4r5t"
"abcd"
Invalid examples:
"good"
"1ab1"
Ideas for a pattern?
Upvotes: 0
Views: 182
Reputation: 627302
You can validate this with a regular expression that uses negative look-aheads to check that each newly captured character differs from the characters captured before it, until 4 distinct characters have been found.
It is very ugly, but it works:
String rx = "^(.).*?((?!\\1).).*?((?!\\1|\\2).).*?((?!\\1|\\2|\\3).).*?$";
See demo
String re = "^(.).*?((?!\\1).).*?((?!\\1|\\2).).*?((?!\\1|\\2|\\3).).*?$";
// Good
System.out.println("1q2w3e4r5t".matches(re));
System.out.println("goody".matches(re));
System.out.println("gggoooggoofr".matches(re));
// Bad
System.out.println("good".matches(re));
System.out.println("1ab1".matches(re));
Output:
true
true
true
false
false
Upvotes: 1
Reputation: 56829
You should consider using a non-regex solution. I am writing this answer only to show that the regex can be simplified.
Here is a simpler regex solution, which asserts that there are at least 4 distinct characters in the string:
(.).*?((?!\1).).*?((?!\1|\2).).*?((?!\1|\2|\3).).*
Demo on regex101 (PCRE and Java have the same behavior for this regex)
.*?((?!\1).), .*?((?!\1|\2).), ... each search for the next character which has not appeared before. This is implemented by checking that the character is not the same as whatever was captured in the previous capturing groups.
Logically, the laziness/greediness of the quantifier doesn't matter here. The lazy quantifier .*? is used to make the search start from the closest character which has not appeared before, rather than from the furthest one. It should slightly improve performance in the matching case, since less backtracking is needed.
Used with String.matches(), which asserts that the whole string matches the regex:
input.matches("(.).*?((?!\\1).).*?((?!\\1|\\2).).*?((?!\\1|\\2|\\3).).*")
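To illustrate the point above that laziness/greediness does not change the result, here is a minimal sketch comparing the lazy regex with a greedy variant in which every .*? is replaced by .* (the class and method names are made up for this example, and the greedy pattern is my own variation, not part of the original answer):

```java
import java.util.regex.Pattern;

public class LazyVsGreedy {
    // Lazy form, as given in the answer.
    static final Pattern LAZY = Pattern.compile(
        "(.).*?((?!\\1).).*?((?!\\1|\\2).).*?((?!\\1|\\2|\\3).).*");

    // Greedy form (assumption for illustration): same regex with .*? replaced by .*
    static final Pattern GREEDY = Pattern.compile(
        "(.).*((?!\\1).).*((?!\\1|\\2).).*((?!\\1|\\2|\\3).).*");

    // Matcher.matches() requires the whole input to match, like String.matches().
    static boolean lazyMatches(String s) {
        return LAZY.matcher(s).matches();
    }

    static boolean greedyMatches(String s) {
        return GREEDY.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"1q2w3e4r5t", "abcd", "goody", "good", "1ab1"};
        for (String s : inputs) {
            // Both columns should agree for every input.
            System.out.println(s + ": lazy=" + lazyMatches(s)
                + ", greedy=" + greedyMatches(s));
        }
    }
}
```

Both patterns accept exactly the same strings; only the order in which the engine tries candidate positions differs.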
If you are concerned about performance:
(.)(?>.*?((?!\1).))(?>.*?((?!\1|\2).))(?>.*?((?!\1|\2|\3).)).*
With String.matches():
input.matches("(.)(?>.*?((?!\\1).))(?>.*?((?!\\1|\\2).))(?>.*?((?!\\1|\\2|\\3).)).*")
The (?>pattern) construct prevents backtracking into the group once the engine has exited the pattern inside it. This is used to "lock" the capturing groups to the first appearance of each distinct character, since the result is the same even if you pick a different character later in the string.
This regex behaves the same as a normal program which loops from left-to-right, checks the current character against a set of distinct characters and adds it to the set if the current character is not in the set.
For this reason, the lazy quantifier .*? becomes significant here, since it searches for the closest character which has not appeared so far.
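For comparison, the "normal program" described above could look roughly like the sketch below. The class and method names are hypothetical, and the main method only spot-checks the loop against the atomic-group regex on a few inputs. One difference to be aware of: without Pattern.DOTALL, the . in the regex does not match line terminators, whereas this loop counts them like any other character.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

public class DistinctScan {
    // Atomic-group regex from the answer, compiled once for reuse.
    static final Pattern ATOMIC = Pattern.compile(
        "(.)(?>.*?((?!\\1).))(?>.*?((?!\\1|\\2).))(?>.*?((?!\\1|\\2|\\3).)).*");

    // Scans left to right and stops as soon as a 4th distinct char is seen.
    static boolean hasAtLeastFourDistinct(String input) {
        Set<Character> seen = new HashSet<>();
        for (int i = 0; i < input.length(); i++) {
            // Set.add returns true only if the character was not in the set yet.
            if (seen.add(input.charAt(i)) && seen.size() == 4) {
                return true;
            }
        }
        return false;
    }

    static boolean regexMatches(String input) {
        return ATOMIC.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"1q2w3e4r5t", "gggoooggoofr", "good", "1ab1"};
        for (String s : inputs) {
            System.out.println(s + ": loop=" + hasAtLeastFourDistinct(s)
                + ", regex=" + regexMatches(s));
        }
    }
}
```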
Upvotes: 3
Reputation: 13696
You can count the number of distinct chars like this:
String s = "abcdefaa";
long numDistinctChars = s.chars().distinct().count();
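Or, if you are not on Java 8 (I couldn't come up with something better):
import java.util.HashSet;
import java.util.Set;

// Collect every char of the string into a set; duplicates are dropped.
Set<Character> set = new HashSet<>();
char[] charArray = s.toCharArray();
for (char c : charArray) {
    set.add(Character.valueOf(c));
}
int numDistinctChars = set.size();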
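To tie the counting approach back to the question's "at least 4 different characters" requirement, a small helper might look like this. It is only a sketch: the class and method names are made up, and using codePoints() instead of chars() is my own choice so that a character outside the Basic Multilingual Plane counts once rather than as two UTF-16 code units. The limit() call lets the stream stop early once enough distinct characters have been found.

```java
public class DistinctCount {
    // Returns true if s contains at least minDistinct different code points.
    // minDistinct is assumed to be non-negative (limit() rejects negative values).
    static boolean hasAtLeastDistinct(String s, int minDistinct) {
        return s.codePoints()
                .distinct()
                .limit(minDistinct)   // short-circuit after enough distinct values
                .count() >= minDistinct;
    }

    public static void main(String[] args) {
        String[] inputs = {"1q2w3e4r5t", "abcd", "good", "1ab1"};
        for (String s : inputs) {
            System.out.println(s + ": " + hasAtLeastDistinct(s, 4));
        }
    }
}
```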
Upvotes: 1