Reputation: 6089
I know this is a rather odd goal, but as a quick and dirty fix for one of our systems we need to not filter any input and let the corruption go into the system.
My current regex for this is "\^.*"
The problem is that it does not match characters as planned ... although for one match it does work. The string that breaks it is ^@jj (basically anything that contains ^ ...).
What would be the best way to not match any characters? I was thinking of removing the \ but doing only that would turn the "not" into a "starts with" ...
Upvotes: 57
Views: 116373
Reputation:
Interesting ... the most obvious and simple variant, ~^ (https://regex101.com/r/KhTM1i/1), which usually requires only one computation step (it fails directly at the start and is computationally expensive only if the tested string begins with a long series of ~), is not mentioned in any of the other answers ... for 12 years.
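A quick way to convince yourself, sketched in Python (an assumption, since the question does not name an engine). The sample strings are made up for illustration:

```python
import re

# Pattern from the answer: a literal "~" that must be followed by start-of-line.
NEVER = r"~^"

samples = ["", "~", "~~~~~", "abc", "~\nabc", "abc\n~", "\n"]

for flags in (0, re.MULTILINE):
    compiled = re.compile(NEVER, flags)
    for text in samples:
        # "^" can only hold at position 0 or (with MULTILINE) right after "\n",
        # but the character just before it has been consumed as "~".
        assert compiled.search(text) is None
```

Other engines may treat a ^ in the middle of a pattern differently, so it is worth re-checking on the engine you actually use.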
Upvotes: 0
Reputation: 93
tl;dr: The most portable and efficient regex to never match anything is $- (end of line followed by a character).
The most reliable solution is to create an impossible regex. There are many impossible regexes but not all are as good.
First, you want to avoid "lookahead" solutions because some regex engines don't support them.
Then you want to make sure your "impossible regex" is efficient and won't take too many computation steps to match... nothing.
I found that $- has a constant computation time ( O(1) ) and only takes two steps to compute regardless of the size of your text (https://regex101.com/r/yjcs1Z/3).
$^ and $. both take 36 steps to compute -> O(1)
\b\B takes 1507 steps on my sample, and that increases with the number of characters in your string -> O(n)
If your regex engine accepts it, the best and simplest regex to never match anything might be: an empty regex.
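The step counts above come from regex101 and may not carry over to other engines. A rough way to measure on your own engine, sketched here for Python's re module (my assumption; the names CANDIDATES, time_search and compare are made up for this example):

```python
import re
import timeit

# Candidate "never match" patterns taken from the answer above.
# "$^" is left out: in Python it does match the empty string.
assert re.search(r"$^", "") is not None

CANDIDATES = {
    "end-then-dash": r"$-",
    "boundary-contradiction": r"\b\B",
}


def time_search(pattern, text, repeats):
    """Return the total seconds spent running search() `repeats` times."""
    compiled = re.compile(pattern)
    return timeit.timeit(lambda: compiled.search(text), number=repeats)


def compare(sizes=(1_000, 10_000, 100_000), repeats=20):
    """Collect timings per pattern and input size; no outcome is assumed."""
    results = {}
    for name, pattern in CANDIDATES.items():
        for size in sizes:
            text = "word " * (size // 5)
            # Sanity check: the pattern really never matches this input.
            assert re.search(pattern, text) is None
            results[(name, size)] = time_search(pattern, text, repeats)
    return results


if __name__ == "__main__":
    for (name, size), seconds in compare().items():
        print(f"{name:25s} n={size:>7d}  {seconds:.6f}s")
```

No timings are quoted here on purpose: whether a pattern fails in constant time depends on the optimizations of the engine, so the numbers are only meaningful for the setup you run them on.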
Upvotes: 9
Reputation: 4911
Another very well supported and fast pattern that is guaranteed to fail to match anything in constant time:
$unmatchable pattern
$anything goes here
etc.
$ of course indicates the end-of-line. No characters could possibly go after $, so no further state transitions could possibly be made. An additional advantage is that your pattern is intuitive, self-descriptive and readable as well!
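To tie this back to the question (letting every input through), here is a minimal sketch in Python, which is an assumption on my part. The is_rejected helper is hypothetical and only stands in for whatever filter the system uses:

```python
import re

# "$" followed by literal text: nothing can come after end-of-line.
NEVER_MATCH = re.compile(r"$unmatchable pattern")


def is_rejected(text, reject_pattern=NEVER_MATCH):
    """Hypothetical input filter: True when the pattern flags the input."""
    return reject_pattern.search(text) is not None


inputs = [
    "",
    "^@jj",
    "unmatchable pattern",
    "line one\nunmatchable pattern",
    "$unmatchable pattern",
]

# With the never-matching pattern plugged in, every input passes through.
assert not any(is_rejected(text) for text in inputs)
```

Some regex flavors treat a $ that is not at the end of the pattern as a literal character, so check this on your engine before relying on it.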
Upvotes: 15
Reputation: 383746
A simple and cheap regex that will never match anything is to match against something that is simply unmatchable, for example: \b\B.
It's simply impossible for this regex to match, since it's a contradiction.
\B is the negated version of \b. \B matches at every position where \b does not.
Upvotes: 64
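The contradiction behind \b\B can be checked position by position. This is a sketch in Python (my assumption); the positions helper is made up for the example, and the empty string is left out because Python versions differ on whether \B matches there:

```python
import re


def positions(pattern, text):
    """Return every index at which the zero-width pattern matches."""
    return {m.start() for m in re.finditer(pattern, text)}


for text in ["hello world", "a", "x-y", "   "]:
    boundary = positions(r"\b", text)
    non_boundary = positions(r"\B", text)
    # No position is both a word boundary and a non-boundary ...
    assert boundary.isdisjoint(non_boundary)
    # ... and together they cover every position in the string.
    assert boundary | non_boundary == set(range(len(text) + 1))
    # So requiring both at the same position can never succeed.
    assert re.search(r"\b\B", text) is None
```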
Reputation: 159905
Instead of trying to not match any characters, why not just match all characters? ^.*$ should do the trick. If you have to not match any characters then try ^\j$ (assuming, of course, that your regular expression engine will not throw an error when you provide it an invalid character class). If it does, try ^()$. A quick test with RegexBuddy suggests that this might work.
Upvotes: 1
Reputation: 1150
^ only means "not" when it's inside a character class (such as [^a-z], meaning anything but a-z). You've turned it into a literal ^ with the backslash.
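The three roles of ^ side by side, as a small Python sketch (Python is an assumption; the constant names are made up):

```python
import re

LITERAL_CARET = r"\^.*"   # the pattern from the question
ANCHORED = r"^jj"         # unescaped, outside a class
NEGATED_CLASS = r"[^a-z]"  # inside a character class

# Escaped: "\^" is a literal caret, so the question's pattern matches
# from the first "^" to the end of the line.
assert re.search(LITERAL_CARET, "^@jj").group() == "^@jj"
assert re.search(LITERAL_CARET, "no caret here") is None

# Unescaped, outside a class: "^" anchors at the start of the string.
assert re.search(ANCHORED, "jj^@") is not None
assert re.search(ANCHORED, "^@jj") is None

# Inside a class: "^" negates the set.
assert re.findall(NEGATED_CLASS, "ab1c!") == ["1", "!"]
```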
What you're trying to do is [^]*, but that's not legal. You could try something like " {10000}", which would match exactly 10,000 spaces. If that's longer than your maximum input, it should never be matched.
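A sketch of that idea in Python (my assumption). SPACES_REQUIRED and can_match are hypothetical names, and the input length limit itself has to be enforced somewhere else:

```python
import re

SPACES_REQUIRED = 10_000  # must exceed the longest input the system accepts
TOO_LONG_TO_MATCH = re.compile(" {%d}" % SPACES_REQUIRED)


def can_match(text):
    """True if the text contains the required run of spaces."""
    return TOO_LONG_TO_MATCH.search(text) is not None


# One space short of the run: no match.
assert not can_match(" " * (SPACES_REQUIRED - 1))
# The trick only holds while inputs stay shorter than the run.
assert can_match(" " * SPACES_REQUIRED)
```

Unlike a true contradiction, this pattern can match, so it is only as safe as the length limit behind it.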
Upvotes: 0
Reputation: 75804
You want to match nothing at all? Negative lookarounds seem obvious, but can be slow. Perhaps ^$ (matches the empty string only) as an alternative?
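For comparison, this is how ^$ behaves in Python's re module (an assumption; other engines may differ on the newline case):

```python
import re

EMPTY_ONLY = re.compile(r"^$")

# The empty string matches, so this is not a "never match" pattern.
assert EMPTY_ONLY.search("") is not None

# Any visible content prevents a match.
assert EMPTY_ONLY.search("abc") is None
assert EMPTY_ONLY.search(" ") is None

# In Python, "$" also holds just before a trailing newline.
assert EMPTY_ONLY.search("\n") is not None
```

So it only works as a fix if empty input (or a lone newline) can never reach the filter.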
Upvotes: 0
Reputation: 41378
The ^ character doesn't mean "not" except inside a character class ([]). If you want to not match anything, you could use a negative lookahead that matches anything: (?!.*).
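A minimal check of the lookahead, sketched in Python (my assumption about the engine). Because .* can always match the empty string, the negative lookahead fails at every position:

```python
import re

NEVER = re.compile(r"(?!.*)")

for text in ["", "abc", "^@jj", "two\nlines", "\n"]:
    # No position satisfies the negative lookahead ...
    assert NEVER.search(text) is None
    assert NEVER.findall(text) == []
    # ... so a substitution leaves the input untouched.
    assert NEVER.sub("X", text) == text
```

This relies on lookahead support, which not every regex engine provides.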
Upvotes: 85