ewomack

Reputation: 667

RegEx that excludes characters doesn't begin matching until 2nd character

I'm trying to create a regular expression that matches any ASCII character but excludes certain characters such as "+" or "%". I'm currently using this:

^[\x00-\x7F][^%=+]+$

But I noticed (using various RegEx validators) that this pattern only begins matching at 2 characters. It won't match "a", but it will match "ab". If I remove the "[^%=+]" section, leaving ^[\x00-\x7F]+$, the pattern matches a single character. I've searched for other options but so far have come up with nothing. I'd like the pattern to match starting at 1 character while still excluding those characters. Any suggestions would be great!

Upvotes: 1

Views: 121

Answers (3)

Mariano

Reputation: 6511

You could simply exclude those chars from the \x00-\x7f range (using the hex value of each char).

+------+-----+-----+-----+
| Char | Dec | Oct | Hex |
+------+-----+-----+-----+
| %    | 37  | 45  | 25  |
| +    | 43  | 53  | 2B  |
| =    | 61  | 75  | 3D  |
+------+-----+-----+-----+

Regex:

^[\x00-\x24\x26-\x2A\x2C-\x3C\x3E-\x7F]+$

For the regex engine, a single character class like this is typically cheaper than running a lookahead assertion before every character.
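
A quick way to try this character class is a small Python sketch. The use of Python's re module and the names RANGE_PATTERN and is_allowed are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Pattern from the answer above: ASCII with 0x25 (%), 0x2B (+) and 0x3D (=) cut out.
RANGE_PATTERN = re.compile(r'^[\x00-\x24\x26-\x2A\x2C-\x3C\x3E-\x7F]+$')


def is_allowed(text):
    """True if text is one or more ASCII characters, none of them %, + or =."""
    return RANGE_PATTERN.match(text) is not None


if __name__ == "__main__":
    # Hypothetical sample inputs, chosen only to exercise the pattern.
    for sample in ["a", "ab", "a+b", "100%", "x=y"]:
        print(repr(sample), is_allowed(sample))
```

Because the whole rule lives in one character class with a + quantifier, a single allowed character such as "a" is already enough for a match.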

Upvotes: 0

Sam

Reputation: 20486

Try this:

^(?:(?![%=+])[\x00-\x7F])+$

At each position, the negative lookahead first checks that the next character is not one of the "bad" characters, then [\x00-\x7F] consumes one "good" ASCII character. The group repeats this until the end of the string.
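
To see how this differs from the pattern in the question, here is a minimal Python sketch that runs both on the same inputs. Python's re module, the constant names and the compare helper are assumptions for illustration only.

```python
import re

# Pattern from the question: one ASCII character, then one or more characters
# that are not %, = or +.
QUESTION_PATTERN = re.compile(r'^[\x00-\x7F][^%=+]+$')
# Pattern from this answer: every character must pass the lookahead and be ASCII.
LOOKAHEAD_PATTERN = re.compile(r'^(?:(?![%=+])[\x00-\x7F])+$')


def compare(sample):
    """Return (question_matches, lookahead_matches) for one input string."""
    return (
        QUESTION_PATTERN.match(sample) is not None,
        LOOKAHEAD_PATTERN.match(sample) is not None,
    )


if __name__ == "__main__":
    # Hypothetical sample inputs, chosen to show where the two patterns differ.
    for sample in ["a", "ab", "+a", "a+b"]:
        print(repr(sample), compare(sample))
```

The question's pattern needs at least two characters because its two character classes each consume at least one. Its first class also excludes nothing, so an input like "+a" slips through, while the lookahead version rejects it.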

Upvotes: 3

anubhava

Reputation: 785286

You can use a negative lookahead here to exclude certain characters:

^((?![%=+])[\x00-\x7F])+$

(?![%=+]) is a negative lookahead that asserts the next character is not one of %, = or +.
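
As a sanity check, the patterns suggested in this thread can be compared against each other in Python. This is only a sketch: the dictionary keys, the ASCII list and the disagreements helper are hypothetical names, and the check covers only ASCII strings up to a small length.

```python
import itertools
import re

# The three patterns proposed in the answers, under illustrative labels.
PATTERNS = {
    "ranges": re.compile(r'^[\x00-\x24\x26-\x2A\x2C-\x3C\x3E-\x7F]+$'),
    "non_capturing": re.compile(r'^(?:(?![%=+])[\x00-\x7F])+$'),
    "capturing": re.compile(r'^((?![%=+])[\x00-\x7F])+$'),
}

ASCII = [chr(code) for code in range(128)]


def disagreements(max_length=2):
    """List the ASCII strings up to max_length on which the patterns differ."""
    found = []
    for length in range(1, max_length + 1):
        for chars in itertools.product(ASCII, repeat=length):
            text = "".join(chars)
            results = {bool(p.match(text)) for p in PATTERNS.values()}
            if len(results) > 1:
                found.append(text)
    return found


if __name__ == "__main__":
    print(disagreements())
```

An empty list means the three patterns accept and reject the same inputs over the strings tested. The only difference between this answer's pattern and the non-capturing one is that the parentheses here also record a capture group.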

Upvotes: 3
