Braj

Reputation: 46851

Match ASCII characters except alphanumeric

A question came to mind while I was answering this post about matching ASCII characters except alphanumerics.

This is what I have tried, but it's not correct.

(?=[\x00-\x7F])[^a-zA-Z0-9]

regex101 demo

I am not looking for a solution; I just want to know where I am wrong. What does this regex pattern mean?

Thanks


As per my understanding, (?=[\x00-\x7F]) checks for an ASCII character and [^a-zA-Z0-9] excludes alphanumeric characters, so together they should match any ASCII character except alphanumerics. Am I right?

Upvotes: 7

Views: 886

Answers (1)

oink

Reputation: 1503

The regex engine tries the pattern at each position in the string in turn.

The first part, (?=...), is called a 'lookahead'. It asserts that the next character matches the given pattern (here [\x00-\x7F], that is, any ASCII character) without consuming it, so the match position does not advance.

The next part, [^a-zA-Z0-9], requires that same character to be non-alphanumeric, and it does consume the character, advancing the match position.

So it does precisely what you told it to; that is, match any non-alphanumeric ASCII character.
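To make the "doesn't move the pointer" point concrete, here is a minimal sketch. It assumes Python's `re` module, which is not necessarily the flavor used in the regex101 demo, but lookahead behaves the same way here. The variable names are only illustrative.

```python
import re

# Lookahead alone: succeeds on an ASCII character but consumes nothing.
lookahead_only = re.match(r"(?=[\x00-\x7F])", "$")
lookahead_span = lookahead_only.span()  # zero-width match at position 0

# Full pattern: the negated class consumes the character the lookahead checked.
full = re.match(r"(?=[\x00-\x7F])[^a-zA-Z0-9]", "$")
full_span = full.span()

# A non-ASCII character fails the lookahead, so there is no match at all.
non_ascii = re.match(r"(?=[\x00-\x7F])[^a-zA-Z0-9]", "£")

print(lookahead_span, full_span, non_ascii)  # (0, 0) (0, 1) None
```

The lookahead and the negated class both test the same single character, so the pattern effectively matches characters that are ASCII and also non-alphanumeric.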

It does not match £ in ££££A$££0#$% because £ is not ASCII. If you want to match ANY character that is non-alphanumeric, you're probably looking for this regex:

`[^a-zA-Z0-9]`
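As a quick check of the difference, the sketch below runs both patterns over the sample string from the question. It again assumes Python's `re` module rather than whichever flavor the demo used.

```python
import re

sample = "££££A$££0#$%"

# Pattern from the question: ASCII only, excluding alphanumerics.
ascii_non_alnum = re.findall(r"(?=[\x00-\x7F])[^a-zA-Z0-9]", sample)

# Pattern without the lookahead: any non-alphanumeric character, ASCII or not.
any_non_alnum = re.findall(r"[^a-zA-Z0-9]", sample)

print(ascii_non_alnum)  # ['$', '#', '$', '%']
print(any_non_alnum)    # ['£', '£', '£', '£', '$', '£', '£', '#', '$', '%']
```

Only the second pattern picks up £, because nothing restricts it to the \x00-\x7F range.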

See http://www.regular-expressions.info/lookaround.html and other pages on the site for more info.

Upvotes: 1
