Reputation: 319
I've tried to follow the solution described here: https://stackoverflow.com/a/17973873/2149915 to match a string with the following requirement:
- A character repeated more than 3 times in a row in the string should be matched and returned.
Examples:
and so on and so forth, the idea is to detect text that is nonsensical.
So far my solution was to modify the regex in the link as such.
ORIGINAL: ^(?!.*([A-Za-z0-9])\1{2})(?=.*[a-z])(?=.*\d)[A-Za-z0-9]+$
ADAPTED: ^(?!.*([A-Za-z0-9\.\,\/\|\\])\1{3})$
Essentially, I removed the lookaheads requiring a lowercase letter and a digit, along with the alphanumeric-only character class, seen here: (?=.*[a-z])(?=.*\d)[A-Za-z0-9]+
and tried to add extra detection of characters such as ./,\
etc., but it doesn't seem to match at all with any characters...
Any ideas on how I can achieve this?
Thanks in advance :)
EDIT:
I found this regex: ^.*(\S)(?: ?\1){9,}.*$
in this answer https://stackoverflow.com/a/44659071/2149915 and have adapted it to match only 3 repetitions, like so: ^.*(\S)(?: ?\1){3}.*$
Now it detects things like:
However, it does not take into account whitespace such as this:
. . . . .
Is there a modification that can be made to achieve this?
Upvotes: 6
Views: 2976
Reputation: 627327
To disallow four or more consecutive identical chars in the string, you need
^(?!.*(.)\1{3,}).*
See the regex demo. If you do not allow an empty string, replace the last .* with .+. Details:
- ^ - start of string
- (?!.*(.)\1{3,}) - a negative lookahead that fails the match if there are zero or more chars other than line break chars, as many as possible, and then a char captured into Group 1 that is followed by three or more occurrences of the same char
- .* - any zero or more chars other than line break chars, as many as possible (not necessary if you use a method that does not require a full string match, like Matcher#find or Regex.containsMatchIn).
Here is a Kotlin demo (for a change):
fun main(args: Array<String>) {
    val texts = arrayOf("hello how are you...", "hiii", "hello how are you.............", "hiiiiii")
    // Matches only when no char is repeated four or more times in a row
    val re = """^(?!.*(.)\1{3,}).*""".toRegex()
    for (text in texts) {
        val isValid = re.containsMatchIn(text)
        println("${text}: ${isValid}")
    }
}
Output:
hello how are you...: true
hiii: true
hello how are you.............: false
hiiiiii: false
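For anyone working in Java rather than Kotlin, the same check could be sketched roughly as below. The class and method names (LookaheadRepeatCheck, isValidFull, isValidPartial) are made up for illustration; only the pattern comes from the answer. It also shows the point from the details above that the trailing .* can be dropped when Matcher#find is used instead of a full-string match.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LookaheadRepeatCheck {
    // Full-string form, suitable for Matcher#matches.
    private static final Pattern FULL = Pattern.compile("^(?!.*(.)\\1{3,}).*");
    // Same lookahead without the trailing .*, enough for Matcher#find.
    private static final Pattern PARTIAL = Pattern.compile("^(?!.*(.)\\1{3,})");

    // True when no char occurs four or more times in a row (single-line input).
    static boolean isValidFull(String text) {
        return FULL.matcher(text).matches();
    }

    // Same result, using a partial match anchored at the start of the string.
    static boolean isValidPartial(String text) {
        return PARTIAL.matcher(text).find();
    }

    public static void main(String[] args) {
        String[] texts = {"hello how are you...", "hiii", "hello how are you.............", "hiiiiii"};
        for (String text : texts) {
            System.out.println(text + ": " + isValidFull(text) + " / " + isValidPartial(text));
        }
    }
}
```

Both methods should agree on single-line input; the difference only matters for APIs that insist on consuming the whole string.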
NOTE:
If you do not want to limit the check to consecutive repetitions, modify the pattern above as follows:
^(?!.*(.)(?:.*?\1){3,}).*
See this regex demo. The (?:.*?\1){3,} regex matches three or more occurrences of any zero or more chars other than line break chars, as few as possible, followed by the Group 1 value.
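As a rough illustration of the non-consecutive variant, the sketch below rejects a string as soon as any single char appears four or more times anywhere in it. The class name AnyPositionRepeatCheck and the sample strings are assumptions chosen for the example, not taken from the answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class illustrating the non-consecutive pattern.
class AnyPositionRepeatCheck {
    // Group 1 captures a char; (?:.*?\1){3,} then looks for three more
    // occurrences of it, each possibly separated by other chars.
    private static final Pattern P = Pattern.compile("^(?!.*(.)(?:.*?\\1){3,}).*");

    // True when no char occurs four or more times in total (single-line input).
    static boolean isValid(String text) {
        return P.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] texts = {"abcabcabc", "abcabcabca", "Mississippi"};
        for (String text : texts) {
            System.out.println(text + ": " + isValid(text));
        }
    }
}
```

Note that this variant is much stricter than the consecutive one: ordinary sentences often contain four spaces or four copies of a common letter, so it may reject more input than intended.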
To match across line breaks, replace . with [\s\S] or add (?s) at the start of the pattern.
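To see why the (?s) flag matters, here is a small sketch comparing the two forms on multi-line input. The names MultiLineRepeatCheck, firstLineOnly and wholeText are invented for the example; Matcher#find is used so that the plain pattern can still produce a match on text containing line breaks.

```java
import java.util.regex.Pattern;

// Hypothetical helper class comparing the pattern with and without (?s).
class MultiLineRepeatCheck {
    // Without (?s), . stops at line breaks, so only the first line is inspected.
    private static final Pattern SINGLE_LINE = Pattern.compile("^(?!.*(.)\\1{3,}).*");
    // With (?s), . also matches line breaks, so the whole text is inspected.
    private static final Pattern DOT_ALL = Pattern.compile("(?s)^(?!.*(.)\\1{3,}).*");

    static boolean firstLineOnly(String text) {
        return SINGLE_LINE.matcher(text).find();
    }

    static boolean wholeText(String text) {
        return DOT_ALL.matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "ok\naaaa";
        System.out.println("without (?s): " + firstLineOnly(text));
        System.out.println("with (?s): " + wholeText(text));
    }
}
```

With (?s), a run of four or more line breaks also counts as a repeated char, since the line break itself can be captured into Group 1.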
To limit the repetitions to letters, replace (.) with ([a-zA-Z]) or (\p{L}), and if you only need to check repeated digits, replace (.) with (\d) or ([0-9]).
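The letter-only and digit-only substitutions might be tried out along these lines. This is only a sketch: the class name RestrictedRepeatCheck, the method names and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class showing the restricted capture groups.
class RestrictedRepeatCheck {
    // Only repeated letters (any script, via \p{L}) make the input invalid.
    private static final Pattern LETTERS = Pattern.compile("^(?!.*(\\p{L})\\1{3,}).*");
    // Only repeated digits make the input invalid.
    private static final Pattern DIGITS = Pattern.compile("^(?!.*(\\d)\\1{3,}).*");

    static boolean lettersOk(String text) {
        return LETTERS.matcher(text).matches();
    }

    static boolean digitsOk(String text) {
        return DIGITS.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] texts = {"wait......", "hiiiiii", "id 10000"};
        for (String text : texts) {
            System.out.println(text + ": letters=" + lettersOk(text) + ", digits=" + digitsOk(text));
        }
    }
}
```

With these variants, runs of punctuation such as "......" are no longer flagged, which may or may not be what the question is after.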
Upvotes: 1
Reputation: 48434
I think there's a much simpler solution if you're looking for any character repeated more than 3 times:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String[] inputs = {
            "hello how are you...",           // -> VALID
            "hello how are you.............", // -> INVALID
            "hiii",                           // -> VALID
            "hiiiiii"                         // -> INVALID
        };
        //                           | group 1 - any character
        //                           |  | back-reference
        //                           |  |  | 4+ quantifier including previous instance
        //                           |  |  |      | dot represents any character,
        //                           |  |  |      | including whitespace and line feeds
        //                           |  |  |      |
        Pattern p = Pattern.compile("(.)\\1{3,}", Pattern.DOTALL);
        // iterating test inputs
        for (String s : inputs) {
            // matching
            Matcher m = p.matcher(s);
            // 4+ repeated character found
            if (m.find()) {
                System.out.printf(
                    "Input '%s' not valid, character '%s' repeated more than 3 times%n",
                    s,
                    m.group(1)
                );
            }
        }
    }
}
Output:
Input 'hello how are you.............' not valid, character '.' repeated more than 3 times
Input 'hiiiiii' not valid, character 'i' repeated more than 3 times
Input 'hello how are you' not valid, character ' ' repeated more than 3 times
Upvotes: 3