Reputation: 2743
How do I determine whether a string contains a non-ASCII character AND is longer than 5 characters using regex?
I tried this pattern: (?=\P{ASCII})(?=^.{6,}$)
I thought (?=) means (?=must be this)(?=and this too).
Given this input: 1巻345
I'm expecting matcher find() to return false.
Given this input: 1巻34567
I'm expecting matcher find() to return true.
But it always returns false on both inputs.
Please also explain why my given pattern doesn't work.
UPDATE:
I figured out the right pattern: (\P{ASCII})(.{6,})
Now I only need to know why (?=) doesn't work.
Upvotes: 1
Views: 1156
Reputation: 14921
What you're looking for is:
^(?=.*\P{ASCII}).{6,}$
So let's explain it:
^ # Beginning of string
(?= # Look ahead and make sure there is
.* # Anything zero or more times (greedy)
\P{ASCII} # A non-ASCII character
) # End of lookahead
.{6,} # Match any character 6 or more times
$ # End of string
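To check the breakdown above against the two inputs from the question, here is a minimal Java sketch. The class and method names (NonAsciiLengthCheck, matches) are made up for illustration, and \u5DFB is simply the Unicode escape for 巻 so the source stays ASCII-only.

```java
import java.util.regex.Pattern;

final class NonAsciiLengthCheck {
    // Pattern from the answer: at least one non-ASCII character
    // somewhere in the string, and six or more characters overall.
    private static final Pattern PATTERN =
            Pattern.compile("^(?=.*\\P{ASCII}).{6,}$");

    // Hypothetical helper: true when the whole input satisfies both conditions.
    static boolean matches(String input) {
        return PATTERN.matcher(input).find();
    }

    public static void main(String[] args) {
        // First input from the question (5 characters): prints false
        System.out.println(matches("1\u5DFB345"));
        // Second input from the question (7 characters): prints true
        System.out.println(matches("1\u5DFB34567"));
    }
}
```

Because the lookahead starts with .*, the non-ASCII character may sit anywhere in the string, not only at the first position.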
Let's analyse why your pattern (?=\P{ASCII})(?=^.{6,}$) fails:
(?=\P{ASCII}): you're first telling the regex engine to check that the next character is a non-ASCII character.
(?=^.{6,}$): then you're telling the regex engine to check, at that same position, that it is at the beginning of the string (the ^ inside the lookahead) and that 6 or more characters follow up to the end.
Lookaheads don't consume any characters, so both conditions are tested at the same position, and because of the ^ that position can only be the start of the string.
Now look at your input, 1巻34567. You're effectively asking the regex engine whether the first character is non-ASCII, which is false since the first character is 1. Try 巻345671 as input and it should return true.
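If you want to see this for yourself, a small Java sketch along these lines should do it. The class and method names (AdjacentLookaheads, found) are hypothetical, and \u5DFB is the Unicode escape for 巻.

```java
import java.util.regex.Pattern;

final class AdjacentLookaheads {
    // The pattern from the question: two lookaheads placed side by side,
    // so both are evaluated at the same position in the input.
    private static final Pattern ORIGINAL =
            Pattern.compile("(?=\\P{ASCII})(?=^.{6,}$)");

    // Hypothetical helper wrapping Matcher.find().
    static boolean found(String input) {
        return ORIGINAL.matcher(input).find();
    }

    public static void main(String[] args) {
        // Non-ASCII character is second: ^ cannot hold at that position, prints false
        System.out.println(found("1\u5DFB34567"));
        // Non-ASCII character is first: both lookaheads hold at position 0, prints true
        System.out.println(found("\u5DFB345671"));
    }
}
```

The only difference between the two inputs is where the non-ASCII character sits, which is exactly what the ^ in the second lookahead ends up testing.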
Note that . doesn't match newline. So you might want to set the s modifier by using (?s):
(?s)^(?=.*\P{ASCII}).{6,}$
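As a rough illustration of the newline point, the sketch below runs both variants on an input that contains a line break. The input string is made up for this example, and the names (DotAllCheck, matches) are hypothetical.

```java
import java.util.regex.Pattern;

final class DotAllCheck {
    // Hypothetical helper: compile the given regex and run find() on the input.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Made-up input: 8 characters, one of which is a newline.
        String input = "1\u5DFB3\n4567";

        // Without (?s), . stops at the newline, so .{6,} cannot span it: prints false
        System.out.println(matches("^(?=.*\\P{ASCII}).{6,}$", input));
        // With (?s), . also matches the newline: prints true
        System.out.println(matches("(?s)^(?=.*\\P{ASCII}).{6,}$", input));
    }
}
```

Passing Pattern.DOTALL as a flag to Pattern.compile should have the same effect as the inline (?s).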
Upvotes: 5