Reputation: 237
I currently have the regex in a JavaScript file as:
function (value) {
    // Declared with const so the pattern does not leak into the global scope.
    const regex = new RegExp("[\<|\>|\"|\'|\%|\;|\(|\)|\&|\_|\.]", "i");
    return !regex.test(value);
}
Rather than specifying which characters are not allowed, how can I state which characters are allowed? The characters I want are a-z, A-Z, 0-9, and the literal character "-", but the dash must not appear at the start or end, only in between. Thanks in advance.
Upvotes: 0
Views: 300
Reputation: 92986
Try this
regex = new RegExp("^(?!-)[a-z0-9-]*[a-z0-9]$", "i");
^
anchors the match to the start of the string
(?!-)
a negative lookahead ensuring the string does not start with a dash
[a-z0-9-]*
zero or more letters, digits, or dashes
[a-z0-9]$
the string must end with a letter or digit (this final class excludes the dash)
See this regex here on Regexr
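As a quick check of the pattern above, here is a minimal sketch that runs it against a few inputs. The helper name isValidSlug and the sample strings are my own assumptions for illustration; only the pattern comes from this answer.

```javascript
// Hypothetical helper name; the pattern itself is the one given above.
const slugPattern = /^(?!-)[a-z0-9-]*[a-z0-9]$/i;

function isValidSlug(value) {
  return slugPattern.test(value);
}

// Sample inputs chosen for illustration only.
const slugSamples = ["Foo-Bar", "-Foo", "Foo-", "a", "", "Foo_Bar", "Foo--Bar"];
for (const sample of slugSamples) {
  console.log(JSON.stringify(sample), isValidSlug(sample));
}
```

Note that this pattern accepts consecutive dashes such as "Foo--Bar", because the middle class allows any run of letters, digits, and dashes.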
Update
This is a variant using two lookaheads. Note that it also accepts the empty string:
^(?!-)(?!.*-$)[a-z0-9-]*$
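To illustrate the empty-string behaviour of the two-lookahead variant, here is a small sketch. The function names are hypothetical, and the extra length check is only one possible way to reject empty input if that is not wanted.

```javascript
// The two-lookahead variant from above; function names are hypothetical.
const twoLookaheads = /^(?!-)(?!.*-$)[a-z0-9-]*$/i;

function matchesTwoLookaheads(value) {
  return twoLookaheads.test(value);
}

// Assumption: if empty input should be rejected, add an explicit length check.
function matchesNonEmpty(value) {
  return value.length > 0 && twoLookaheads.test(value);
}

console.log(matchesTwoLookaheads(""));        // the empty string is accepted
console.log(matchesNonEmpty(""));             // rejected by the length check
console.log(matchesTwoLookaheads("Foo-Bar"));
console.log(matchesTwoLookaheads("Foo-"));
```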
Update 2: Minimum 6 characters
^(?!-)[a-z0-9-]{5,}[a-z0-9]$
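The minimum length works because {5,} requires at least five characters from the middle class and the final class adds one more. A minimal sketch, with a hypothetical function name and sample inputs of my own:

```javascript
// Pattern from "Update 2": at least 5 characters plus one final letter or digit.
const minSixPattern = /^(?!-)[a-z0-9-]{5,}[a-z0-9]$/i;

// Hypothetical helper name, for illustration only.
function hasMinSixChars(value) {
  return minSixPattern.test(value);
}

for (const sample of ["abcde", "abcdef", "ab-cde", "-abcdef", "abcde-"]) {
  console.log(JSON.stringify(sample), sample.length, hasMinSixChars(sample));
}
```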
In response to this comment:
"It is actually too complicated a regex -- and definitely slower than my regex"
I ran a small benchmark (in Perl, but the language shouldn't make a big difference):
sub WithLookahead($) {
    my $string = shift;
    return $string =~ /^(?!-)[a-z0-9-]*[a-z0-9]$/i;
}
sub WithOutLookahead($) {
    my $string = shift;
    return ( $string =~ /^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$/
          && length($string) >= 6 );
}
sub BenchmarkLookahead($) {
    use Benchmark;
    my $testString = shift;
    my $t0 = Benchmark->new;
    for ( 0 .. 10000000 ) {
        my $result = WithLookahead($testString);
    }
    my $t1 = Benchmark->new;
    my $t2 = Benchmark->new;
    for ( 0 .. 10000000 ) {
        my $result = WithOutLookahead($testString);
    }
    my $t3 = Benchmark->new;
    my $tdWith = timediff( $t1, $t0 );
    my $tdWithOut = timediff( $t3, $t2 );
    print "the code with Lookahead and test string \"$testString\" took:", timestr($tdWith), "\n";
    print "the code without Lookahead and test string \"$testString\" took:", timestr($tdWithOut), "\n";
}
Result
the code with Lookahead and test string "Foo-Bar" took:16 wallclock secs (14.94 usr + 0.00 sys = 14.94 CPU)
the code without Lookahead and test string "Foo-Bar" took:18 wallclock secs (17.50 usr + 0.02 sys = 17.52 CPU)
the code with Lookahead and test string "-Foo-Bar" took:13 wallclock secs (12.03 usr + 0.00 sys = 12.03 CPU)
the code without Lookahead and test string "-Foo-Bar" took:14 wallclock secs (13.44 usr + 0.00 sys = 13.44 CPU)
the code with Lookahead and test string "Foo-Bar-" took:17 wallclock secs (15.28 usr + 0.00 sys = 15.28 CPU)
the code without Lookahead and test string "Foo-Bar-" took:23 wallclock secs (21.61 usr + 0.02 sys = 21.63 CPU)
the code with Lookahead and test string "Foo" took:14 wallclock secs (13.70 usr + 0.00 sys = 13.70 CPU)
the code without Lookahead and test string "Foo" took:19 wallclock secs (17.09 usr + 0.02 sys = 17.11 CPU)
So overall, my regex with the negative lookahead is slightly faster than the simpler regex combined with an external length check. But I had to call each function 10000000 times to get a measurable difference, so I don't think performance should decide which one to use.
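For anyone who wants to repeat the comparison in JavaScript, here is a rough sketch of the same idea. The function names, the timeIt helper, and the iteration count are assumptions of mine, and timings will vary by engine, so treat the numbers as indicative at best. As in the Perl code, the two checks are not strictly equivalent: only the second enforces a minimum length, so they disagree on short input such as "Foo".

```javascript
const lookaheadPattern = /^(?!-)[a-z0-9-]*[a-z0-9]$/i;
const plainPattern = /^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$/;

function withLookahead(value) {
  return lookaheadPattern.test(value);
}

function withoutLookahead(value) {
  return plainPattern.test(value) && value.length >= 6;
}

// Hypothetical timing helper: calls fn repeatedly and reports elapsed milliseconds.
function timeIt(fn, input, iterations) {
  const start = Date.now();
  let last = false;
  for (let i = 0; i < iterations; i++) {
    last = fn(input);
  }
  return { elapsedMs: Date.now() - start, result: last };
}

// Iteration count kept small here; raise it to get more stable timings.
for (const input of ["Foo-Bar", "-Foo-Bar", "Foo-Bar-", "Foo"]) {
  const a = timeIt(withLookahead, input, 100000);
  const b = timeIt(withoutLookahead, input, 100000);
  console.log(JSON.stringify(input), "with lookahead:", a.elapsedMs, "ms,",
    "without:", b.elapsedMs, "ms");
}
```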
Upvotes: 1
Reputation: 121710
regex = new RegExp("^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$");
Again the classical "normal* (special normal*)*" pattern ;)
The function body becomes:
function (value) {
    // Declared with const so the pattern does not leak into the global scope.
    const regex = new RegExp("^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$");
    return regex.test(value) && value.length >= 6;
}
Edit: made the group non-capturing, since nothing is captured here.
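To make the "normal* (special normal*)*" shape visible, here is a sketch that assembles the same pattern from its two parts. The variable and function names are hypothetical; the resulting regex is the one from this answer.

```javascript
// "normal" is a letter or digit, "special" is the dash.
const normal = "[a-zA-Z0-9]";
const special = "-";

// normal+ (special normal+)* : every dash must be followed by at least one normal character.
const unrolled = new RegExp("^" + normal + "+(?:" + special + normal + "+)*$");

// Hypothetical helper combining the pattern with the length check.
function isValidName(value) {
  return unrolled.test(value) && value.length >= 6;
}

for (const sample of ["Foo-Bar", "Foo--Bar", "-FooBar", "FooBar-", "Foo"]) {
  console.log(JSON.stringify(sample), isValidName(sample));
}
```

Because each dash has to be followed by a letter or digit, this shape also rejects consecutive dashes such as "Foo--Bar", which the character-class approach with a dash in the middle class would accept.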
Upvotes: 3
Reputation: 36999
The restriction that the dash must not appear at the start or end makes the regex a little more complicated. The regex below first matches a single letter or digit, then optionally matches zero or more characters (dashes included) that must end on a letter or digit.
/^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?$/
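A short sketch of how this pattern behaves, with a hypothetical function name and sample inputs of my own. The optional group is what lets a single character pass while still forbidding a trailing dash.

```javascript
// Pattern from this answer: one letter or digit, then optionally more
// characters (dashes allowed) ending on a letter or digit.
const optionalTail = /^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?$/;

// Hypothetical helper name, for illustration only.
function isDashSafe(value) {
  return optionalTail.test(value);
}

for (const sample of ["a", "a-b", "a--b", "-a", "a-", ""]) {
  console.log(JSON.stringify(sample), isDashSafe(sample));
}
```

If the six-character minimum is also needed, one option is to combine this with a separate value.length >= 6 check.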
Upvotes: 1