Reputation: 46851
I have a question that came to mind when I answered this post about matching ASCII characters except alphanumerics.
This is what I have tried but it's not correct.
(?=[\x00-\x7F])[^a-zA-Z0-9]
I am not looking for a solution; I just want to know where I am wrong. What is the meaning of this regex pattern?
Thanks
As per my understanding, `(?=[\x00-\x7F])` is used to check for an ASCII character and `[^a-zA-Z0-9]` is used to exclude alphanumeric characters. So finally it will match any ASCII character except alphanumerics. Am I right?
Upvotes: 7
Views: 886
Reputation: 1503
The regex engine tries the pattern at each position in the string, one character at a time.
The first part, `(?=...)`, is called a 'lookahead'. It asserts that the next character matches whatever is specified inside it (here, `[\x00-\x7F]`), but it doesn't move the character pointer.
The next part, `[^a-zA-Z0-9]`, requires that the same next character is not alphanumeric, and it does move the character pointer.
So it does precisely what you told it to; that is, match any non-alphanumeric ASCII character.
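To make the "doesn't move the character pointer" point concrete, here is a minimal sketch in Python's `re` module (the question doesn't name a regex flavor, so Python is my assumption, and the test strings `"-x"` and `"x-"` are made up for illustration). It compares where the match ends with the lookahead alone and with the full pattern.

```python
import re

# The lookahead on its own: it succeeds on an ASCII character
# but consumes nothing, so the match is zero-width.
ascii_lookahead = re.compile(r'(?=[\x00-\x7F])')
lookahead_span = ascii_lookahead.match("-x").span()
print(lookahead_span)  # (0, 0)

# The full pattern: the lookahead checks '-', then the character
# class consumes that same '-', so the match is one character long.
full_pattern = re.compile(r'(?=[\x00-\x7F])[^a-zA-Z0-9]')
full_span = full_pattern.match("-x").span()
print(full_span)  # (0, 1)

# At an alphanumeric character the lookahead passes (it is ASCII)
# but the character class rejects it, so there is no match there.
no_match = full_pattern.match("x-")
print(no_match)  # None
```

Both conditions are applied to the same character, which is why the two parts combine into "ASCII and not alphanumeric".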
It does not match `£` in `££££A$££0#$%` because `£` is not ASCII. If you want to match ANY character that is non-alphanumeric, you're probably looking for this regex:
`[^a-zA-Z0-9]`
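As a quick check of the difference, here is a small sketch using Python's `re` module (Python is an assumption on my part, since the question doesn't say which flavor is in use; the variable names are just for illustration). It runs both patterns over the sample string from above.

```python
import re

sample = "££££A$££0#$%"

# Original pattern: non-alphanumeric AND within the ASCII range.
ascii_non_alnum = re.findall(r'(?=[\x00-\x7F])[^a-zA-Z0-9]', sample)
print(ascii_non_alnum)  # ['$', '#', '$', '%']

# Without the lookahead: any non-alphanumeric character, ASCII or not.
any_non_alnum = re.findall(r'[^a-zA-Z0-9]', sample)
print(any_non_alnum)  # ['£', '£', '£', '£', '$', '£', '£', '#', '$', '%']
```

The only difference between the two results is the `£` characters, which the lookahead filters out.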
See http://www.regular-expressions.info/lookaround.html and other pages on the site for more info.
Upvotes: 1