Igor

Reputation: 866

Regex to validate 4 different characters are in a string

I would like to enforce that a string contains at least 4 different characters.

Valid examples:

"1q2w3e4r5t"
"abcd"

Invalid examples:

"good"
"1ab1"

Ideas for a pattern?

Upvotes: 0

Views: 182

Answers (3)

Wiktor Stribiżew

Reputation: 627302

You can validate this with a regular expression: capture a character, then use negative look-aheads to require three more characters, each different from every character captured before it.

I'd say it is very ugly, but it works:

String rx = "^(.).*?((?!\\1).).*?((?!\\1|\\2).).*?((?!\\1|\\2|\\3).).*?$";

See the IDEONE demo:

String re = "^(.).*?((?!\\1).).*?((?!\\1|\\2).).*?((?!\\1|\\2|\\3).).*?$";
// Good: each string contains at least 4 distinct characters
System.out.println("1q2w3e4r5t".matches(re));
System.out.println("goody".matches(re));
System.out.println("gggoooggoofr".matches(re));
// Bad: each string contains only 3 distinct characters
System.out.println("good".matches(re));
System.out.println("1ab1".matches(re));

Output:

true
true
true
false
false

Upvotes: 1

nhahtdh

Reputation: 56829

You should consider using a non-regex solution. I am only writing this answer to show a simpler regex solution for this problem.

Initial solution

Here is a simpler regex solution, which asserts that there are at least 4 distinct characters in the string:

(.).*?((?!\1).).*?((?!\1|\2).).*?((?!\1|\2|\3).).*

Demo on regex101 (PCRE and Java have the same behavior for this regex)

Each of .*?((?!\1).), .*?((?!\1|\2).), ... searches for the next character that has not appeared before. This is implemented by checking that the character differs from everything captured by the previous capturing groups.

Logically, the laziness/greediness of the quantifier doesn't matter here. The lazy quantifier .*? is used to make the search start from the closest character which has not appeared before, rather than from the furthest character. It should slightly improve performance when the string matches, since less backtracking is needed.

Used with String.matches(), which asserts that the whole string matches the regex:

input.matches("(.).*?((?!\\1).).*?((?!\\1|\\2).).*?((?!\\1|\\2|\\3).).*")
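
To see which characters the capturing groups end up holding, here is a small sketch that runs the same pattern through Pattern and Matcher. The class name GroupDemo and the helper distinctGroups are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Same pattern as above: each group captures a character that differs
    // from everything captured by the groups before it.
    static final Pattern FOUR_DISTINCT = Pattern.compile(
            "(.).*?((?!\\1).).*?((?!\\1|\\2).).*?((?!\\1|\\2|\\3).).*");

    // Hypothetical helper: returns the four captured characters joined
    // together, or null when the whole input does not match.
    static String distinctGroups(String input) {
        Matcher m = FOUR_DISTINCT.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return m.group(1) + m.group(2) + m.group(3) + m.group(4);
    }

    public static void main(String[] args) {
        System.out.println(distinctGroups("gggoooggoofr")); // gofr
        System.out.println(distinctGroups("good"));         // null
    }
}
```

Because the quantifiers are lazy, each group lands on the first occurrence of a new character, which is why "gggoooggoofr" yields g, o, f, r.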

Improved solution

If you are concerned about performance:

(.)(?>.*?((?!\1).))(?>.*?((?!\1|\2).))(?>.*?((?!\1|\2|\3).)).*

Demo on regex101

With String.matches():

input.matches("(.)(?>.*?((?!\\1).))(?>.*?((?!\\1|\\2).))(?>.*?((?!\\1|\\2|\\3).)).*")

The (?>pattern) construct prevents backtracking into the group once you exit from the pattern inside. This is used to "lock" each capturing group to the first appearance of a distinct character, since the result is the same even if you pick a different character later in the string.

This regex behaves the same as a normal program which loops from left to right, checks the current character against a set of distinct characters, and adds it to the set if it is not already there.

For this reason, the lazy quantifier .*? becomes significant, since it searches for the closest character which has not appeared so far.
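
The "normal program" described above could look roughly like the following sketch. The names DistinctScan and hasAtLeastDistinct are assumptions chosen for illustration. One difference from the regex: without the DOTALL flag, . does not match line terminators, whereas this loop counts every char.

```java
import java.util.HashSet;
import java.util.Set;

public class DistinctScan {
    // Scans left to right, collecting characters that have not been seen yet,
    // and stops as soon as `required` distinct characters were found.
    static boolean hasAtLeastDistinct(String input, int required) {
        Set<Character> seen = new HashSet<>();
        for (int i = 0; i < input.length(); i++) {
            seen.add(input.charAt(i)); // no effect if the char is already in the set
            if (seen.size() >= required) {
                return true;
            }
        }
        // Reached only when the string ran out before the threshold was met
        // (or when required is 0 and the string is empty).
        return seen.size() >= required;
    }

    public static void main(String[] args) {
        System.out.println(hasAtLeastDistinct("1q2w3e4r5t", 4)); // true
        System.out.println(hasAtLeastDistinct("good", 4));       // false
    }
}
```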

Upvotes: 3

K Erlandsson

Reputation: 13696

You can count the number of distinct chars like this:

String s = "abcdefaa";
long numDistinctChars = s.chars().distinct().count();

Or if not on Java 8 (I couldn't come up with something better):

// Requires: import java.util.HashSet; import java.util.Set;
Set<Character> set = new HashSet<>();
for (char c : s.toCharArray()) {
    set.add(c); // autoboxed to Character; duplicates are ignored
}
int numDistinctChars = set.size();
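
Either count can then be compared against 4 to get the validation the question asks for. A minimal sketch of the Java 8 variant follows; the class name DistinctCount and the method hasFourDistinct are made up for illustration, and the limit(4) call is an optional addition that lets the stream stop early. Note that chars() works on UTF-16 code units, so a character outside the Basic Multilingual Plane is seen as two values.

```java
public class DistinctCount {
    // True when the string contains at least 4 distinct char values.
    // limit(4) lets the stream stop once a fourth distinct value is found.
    static boolean hasFourDistinct(String s) {
        return s.chars().distinct().limit(4).count() == 4;
    }

    public static void main(String[] args) {
        System.out.println(hasFourDistinct("1q2w3e4r5t")); // true
        System.out.println(hasFourDistinct("abcd"));       // true
        System.out.println(hasFourDistinct("good"));       // false
        System.out.println(hasFourDistinct("1ab1"));       // false
    }
}
```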

Upvotes: 1
