Reputation: 237

Change Regex to only accept specified characters?

I currently have this regex in a JavaScript file:

function (value) {
    regex = new RegExp("[\<|\>|\"|\'|\%|\;|\(|\)|\&|\_|\.]", "i");
    return !regex.test(value);
}

Rather than specifying which characters are not allowed, how can I state which characters are allowed? The characters I want are a-z, A-Z, 0-9, and also the literal character "-", but not at the start or end, only in between. Thanks in advance.

Upvotes: 0

Answers (3)

stema

Reputation: 93086

Try this

regex = new RegExp("^(?!-)[a-z0-9-]*[a-z0-9]$", "i");

^ anchors to the start of the string

(?!-) a negative lookahead ensuring the string does not start with a dash

[a-z0-9-]* 0 or more of the characters inside the class (letters, digits and the dash)

[a-z0-9]$ ends with a letter or digit, so the last character cannot be a dash

See this regex here on Regexr
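
A quick way to check the pattern is to run it against a few inputs. This is only a sketch: the helper name isValidName is made up for illustration and is not part of the original question.

```javascript
// Hypothetical helper (name is an assumption) wrapping the lookahead pattern above.
function isValidName(value) {
    // The "i" flag lets [a-z] match upper-case letters as well.
    const regex = new RegExp("^(?!-)[a-z0-9-]*[a-z0-9]$", "i");
    return regex.test(value);
}

console.log(isValidName("Foo-Bar"));  // true
console.log(isValidName("-Foo"));     // false (leading dash)
console.log(isValidName("Foo-"));     // false (trailing dash)
console.log(isValidName("Foo_Bar"));  // false (underscore not allowed)
console.log(isValidName(""));         // false (at least one character required)
```

Note that the pattern still accepts consecutive dashes such as "Foo--Bar", since the class inside the string allows any number of them.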

Update

This variant uses two lookaheads and would also accept the empty string:

^(?!-)(?!.*-$)[a-z0-9-]*$

See it on Regexr
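
To see how the two-lookahead variant differs, here is a small sketch. The function name isValidOrEmpty is an assumption for illustration only.

```javascript
// Hypothetical helper (name is an assumption) for the two-lookahead variant.
function isValidOrEmpty(value) {
    // (?!-)     : must not start with a dash
    // (?!.*-$)  : must not end with a dash
    const regex = /^(?!-)(?!.*-$)[a-z0-9-]*$/i;
    return regex.test(value);
}

console.log(isValidOrEmpty(""));         // true (the empty string is accepted)
console.log(isValidOrEmpty("Foo-Bar"));  // true
console.log(isValidOrEmpty("-Foo"));     // false
console.log(isValidOrEmpty("Foo-"));     // false
```

If the empty string should be rejected, changing the * quantifier to + would do it, since the class then has to match at least one character.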

Update 2: Minimum 6 characters

^(?!-)[a-z0-9-]{5,}[a-z0-9]$

See it on Regexr
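
The {5,} quantifier plus the final character class gives the minimum of 6. If the minimum needs to be configurable, one possible sketch is to build the pattern from a parameter. The helper makeValidator and its minLength argument are assumptions, not something from the question.

```javascript
// Hypothetical factory (name and parameter are assumptions):
// builds a validator that requires at least minLength characters.
function makeValidator(minLength) {
    if (!Number.isInteger(minLength) || minLength < 1) {
        throw new RangeError("minLength must be an integer >= 1");
    }
    // The last character is matched separately, so the quantifier is minLength - 1.
    const regex = new RegExp(
        "^(?!-)[a-z0-9-]{" + (minLength - 1) + ",}[a-z0-9]$", "i");
    return function (value) {
        return regex.test(value);
    };
}

const atLeastSix = makeValidator(6);
console.log(atLeastSix("Foo-Ba"));   // true  (6 characters)
console.log(atLeastSix("Foo-B"));    // false (only 5 characters)
console.log(atLeastSix("Foo-Bar-")); // false (trailing dash)
```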

In response to the comment:

"It is actually too complicated a regex -- and definitely slower than my regex"

I ran a small benchmark (in Perl, but that shouldn't make a big difference):

sub WithLookahead($) {
    my $string = shift;

    return $string =~ /^(?!-)[a-z0-9-]*[a-z0-9]$/i;
}

sub WithOutLookahead($) {
    my $string = shift;

    return ( $string =~ /^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$/
          && length($string) >= 6 );
}

sub BenchmarkLookahead($) {
    use Benchmark;
    my $testString = shift;

    my $t0 = Benchmark->new;
    for ( 0 .. 10000000 ) {
        my $result = WithLookahead($testString);
    }
    my $t1 = Benchmark->new;

    my $t2 = Benchmark->new;
    for ( 0 .. 10000000 ) {
        my $result = WithOutLookahead($testString);
    }
    my $t3 = Benchmark->new;

    my $tdWith    = timediff( $t1, $t0 );
    my $tdWithOut = timediff( $t3, $t2 );
    print "the code with Lookahead and test string \"$testString\" took:",    timestr($tdWith),    "\n";
    print "the code without Lookahead and test string \"$testString\" took:", timestr($tdWithOut), "\n";
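}

# Run the benchmark for each test string shown in the results
BenchmarkLookahead($_) for ( "Foo-Bar", "-Foo-Bar", "Foo-Bar-", "Foo" );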

Result

the code with Lookahead and test string "Foo-Bar" took:16 wallclock secs (14.94 usr + 0.00 sys = 14.94 CPU)
the code without Lookahead and test string "Foo-Bar" took:18 wallclock secs (17.50 usr + 0.02 sys = 17.52 CPU)
the code with Lookahead and test string "-Foo-Bar" took:13 wallclock secs (12.03 usr + 0.00 sys = 12.03 CPU)
the code without Lookahead and test string "-Foo-Bar" took:14 wallclock secs (13.44 usr + 0.00 sys = 13.44 CPU)
the code with Lookahead and test string "Foo-Bar-" took:17 wallclock secs (15.28 usr + 0.00 sys = 15.28 CPU)
the code without Lookahead and test string "Foo-Bar-" took:23 wallclock secs (21.61 usr + 0.02 sys = 21.63 CPU)
the code with Lookahead and test string "Foo" took:14 wallclock secs (13.70 usr + 0.00 sys = 13.70 CPU)
the code without Lookahead and test string "Foo" took:19 wallclock secs (17.09 usr + 0.02 sys = 17.11 CPU)

So overall, my regex with the negative lookahead is a bit quicker than the simpler regex combined with an external length check. But I needed to call each function 10000000 times to get significant results, so I don't think performance should decide which one to use.
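
For anyone who wants to repeat the comparison in JavaScript, here is a rough sketch. It is not a port of the Perl script: the function names, the iteration count and the use of Date.now() are my own assumptions, and it uses the minimum-6 variant of the lookahead regex so that both sides enforce the same length. Timings will vary by engine and machine, so no expected numbers are given.

```javascript
// Lookahead pattern with the minimum length built in.
const withLookahead = /^(?!-)[a-z0-9-]{5,}[a-z0-9]$/i;
// Pattern without lookahead, combined with an external length check.
const withoutLookahead = /^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$/;

// Hypothetical timing helper: runs fn(input) repeatedly and reports
// elapsed milliseconds plus how many calls returned true.
function timeIt(fn, input, iterations) {
    const start = Date.now();
    let matches = 0;
    for (let i = 0; i < iterations; i++) {
        if (fn(input)) matches++;
    }
    return { ms: Date.now() - start, matches: matches };
}

function compareValidators(input, iterations) {
    const a = timeIt((s) => withLookahead.test(s), input, iterations);
    const b = timeIt(
        (s) => withoutLookahead.test(s) && s.length >= 6, input, iterations);
    return {
        lookaheadMs: a.ms, lookaheadMatches: a.matches,
        plainMs: b.ms, plainMatches: b.matches
    };
}

for (const input of ["Foo-Bar", "-Foo-Bar", "Foo-Bar-", "Foo"]) {
    console.log(input, compareValidators(input, 1000000));
}
```

Keep in mind the two patterns are not strictly equivalent: the lookahead version accepts consecutive dashes, the other one does not.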

Upvotes: 1

fge

Reputation: 121860

regex = new RegExp("^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$");

Again the classical "normal* (special normal*)*" pattern ;)

The function body becomes:

function (value) {
    regex = new RegExp("^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$");
    return regex.test(value) && value.length >= 6;
}

edit: made the group non-capturing, since nothing is captured here
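
Here is a short sketch of how this behaves on a few inputs. The function is given a name (isValid, my choice) and a local regex variable so it can be run on its own.

```javascript
// Named version of the function above; the name isValid is an assumption.
function isValid(value) {
    // One or more alphanumerics, then any number of "-alphanumerics" groups.
    const regex = /^[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$/;
    return regex.test(value) && value.length >= 6;
}

console.log(isValid("Foo-Bar"));   // true
console.log(isValid("Foo--Bar"));  // false (consecutive dashes are rejected)
console.log(isValid("-FooBar"));   // false (leading dash)
console.log(isValid("FooBar-"));   // false (trailing dash)
console.log(isValid("Foo"));       // false (shorter than 6 characters)
```

Because every dash must be followed by at least one letter or digit, this pattern rules out double dashes as a side effect, which may or may not be what you want.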

Upvotes: 3

a'r

Reputation: 37029

The restriction on not having the dash at the start or end makes the regex a little more complicated. The regex below first matches a single letter or digit, then optionally matches zero or more characters (dashes included) followed by a final letter or digit.

/^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?$/
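
A few sample inputs, as a sketch (the function name matchesPattern is only for illustration):

```javascript
// Hypothetical wrapper (name is an assumption) around the regex above.
function matchesPattern(value) {
    const regex = /^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?$/;
    return regex.test(value);
}

console.log(matchesPattern("a"));         // true  (a single character is enough)
console.log(matchesPattern("Foo-Bar"));   // true
console.log(matchesPattern("Foo--Bar"));  // true  (consecutive dashes are allowed)
console.log(matchesPattern("-Foo"));      // false
console.log(matchesPattern("Foo-"));      // false
console.log(matchesPattern(""));          // false
```

This version has no minimum length, so a length check would have to be added separately if one is needed.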

Upvotes: 1
