Reputation: 25
I want a regex that matches a 2-to-15-character string containing only uppercase/lowercase letters and digits, with hyphens allowed only in between (never at the start or end). My current pattern:
/^[a-z0-9][a-z0-9\-]{0,13}[a-z0-9]$/i
It should match strings such as:
a-b-c
ab
a-b
a-bcdef-g-h-i
How do I make sure that two or more hyphens never appear in a row?
Upvotes: 1
Views: 137
Reputation: 336138
You could use a negative lookahead assertion:
/^(?!.*--)[a-z0-9][a-z0-9-]{0,13}[a-z0-9]$/i
(?!.*--) ensures that it's impossible to match -- anywhere in the string (without actually consuming any characters in the match).
Also, no need to escape the dash if it's the first or last character in a character class.
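To see the lookahead pattern in action, here is a minimal Python sketch that runs it against the sample strings from the question plus a few that should be rejected. The is_valid helper and the rejected samples are illustrative assumptions, not part of the original answer; re.IGNORECASE stands in for the /i flag.

```python
import re

# Lookahead pattern from the answer; re.IGNORECASE plays the role of /i.
PATTERN = re.compile(r"^(?!.*--)[a-z0-9][a-z0-9-]{0,13}[a-z0-9]$", re.IGNORECASE)


def is_valid(name):
    """Return True if `name` matches the pattern (hypothetical helper)."""
    return PATTERN.match(name) is not None


if __name__ == "__main__":
    accepted = ["a-b-c", "ab", "a-b", "a-bcdef-g-h-i", "A-B9"]
    rejected = ["a--b", "-ab", "ab-", "a", "abcdefghijklmnop"]
    for s in accepted + rejected:
        print(repr(s), is_valid(s))
```

One caveat: in Python, $ also matches just before a trailing newline, so if the input may end in "\n" it is probably safer to strip it first or use \Z instead.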
If, like Donal Fellows, you're not keen on lookaheads with indefinite quantifiers, another way would be
/^[a-z0-9](?:[a-z0-9]|-(?!-)){0,13}[a-z0-9]$/i
(?:[a-z0-9]|-(?!-)){0,13} matches either an alphanumeric character or a dash that is not followed by another dash, repeated up to 13 times.
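Since the two patterns are meant to accept exactly the same strings, a brute-force comparison is a cheap sanity check. The sketch below is an assumption-laden illustration: the names LOOKAHEAD, PER_HYPHEN and disagreements are made up here, and the small alphabet "a1-" only exercises short strings, so it does not probe the 15-character limit.

```python
import itertools
import re

# Both patterns from the answer, compiled case-insensitively (the /i flag).
LOOKAHEAD = re.compile(r"^(?!.*--)[a-z0-9][a-z0-9-]{0,13}[a-z0-9]$", re.IGNORECASE)
PER_HYPHEN = re.compile(r"^[a-z0-9](?:[a-z0-9]|-(?!-)){0,13}[a-z0-9]$", re.IGNORECASE)


def disagreements(alphabet="a1-", max_len=6):
    """Return every string over `alphabet` (1..max_len chars) on which the patterns differ."""
    found = []
    for n in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = "".join(chars)
            if bool(LOOKAHEAD.match(s)) != bool(PER_HYPHEN.match(s)):
                found.append(s)
    return found


if __name__ == "__main__":
    # An empty list means the patterns agreed on every string tried.
    print(disagreements())
```

The length bounds can be spot-checked separately with a 15-character and a 16-character input.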
As for performance (checked in Python 3.2.2):
>>> import timeit
>>> timeit.timeit(stmt='r.match("a--bcdefghijklmop-qrstuvwxyz")',
... setup='import re; r=re.compile(r"^(?!.*--)[a-z0-9][a-z0-9-]{0,13}[a-z0-9]$")')
0.699529247317531
>>> timeit.timeit(stmt='r.match("a--bcdefghijklmop-qrstuvwxyz")',
... setup='import re; r=re.compile(r"^[a-z0-9](?:[a-z0-9]|-(?!-)){0,13}[a-z0-9]$")')
0.6518945164968741
>>> timeit.timeit(stmt='r.match("a-bcdefghijklmop-qrstuvwxy--z")',
... setup='import re; r=re.compile(r"^(?!.*--)[a-z0-9][a-z0-9-]{0,13}[a-z0-9]$")')
0.5857406334929749
>>> timeit.timeit(stmt='r.match("a-bcdefghijklmop-qrstuvwxy--z")',
... setup='import re; r=re.compile(r"^[a-z0-9](?:[a-z0-9]|-(?!-)){0,13}[a-z0-9]$")')
2.2273210211646415
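So (?!.*--) is a tiny bit slower in its worst-case scenario (-- early in the string, so the .* has to backtrack a long way), but nearly four times faster in its best-case scenario (-- late in the string, so almost no backtracking). Note that both test strings are longer than 15 characters, so neither pattern can match them; the timings measure how quickly each pattern rejects its input.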
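To repeat the comparison on other inputs, a small helper along these lines may be handy. It is only a sketch: time_patterns, the PATTERNS labels and the extra short subjects are assumptions added here, the default of 100,000 iterations is smaller than timeit's default of 1,000,000 used above, and absolute numbers will differ by machine and Python version.

```python
import re
import timeit

# The two patterns from the answer, compiled without flags as in the timings above.
PATTERNS = {
    "lookahead": re.compile(r"^(?!.*--)[a-z0-9][a-z0-9-]{0,13}[a-z0-9]$"),
    "per-hyphen": re.compile(r"^[a-z0-9](?:[a-z0-9]|-(?!-)){0,13}[a-z0-9]$"),
}


def time_patterns(subject, number=100_000):
    """Return seconds taken for `number` match attempts of each pattern on `subject`."""
    return {
        label: timeit.timeit(lambda p=pattern: p.match(subject), number=number)
        for label, pattern in PATTERNS.items()
    }


if __name__ == "__main__":
    subjects = (
        "a--bcdefghijklmop-qrstuvwxyz",   # -- early, too long to match
        "a-bcdefghijklmop-qrstuvwxy--z",  # -- late, too long to match
        "a--b",                           # short, rejected for the double hyphen
        "a-b-c",                          # short, accepted
    )
    for subject in subjects:
        print(subject, time_patterns(subject))
```

Including inputs within the 2-to-15-character range shows how the patterns behave on the strings they are actually meant to validate.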
Upvotes: 5