Reputation: 12772
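I would like to be able to use a single regex (if possible) to require that a string fits [A-Za-z0-9_] but doesn't allow:
Strings containing only numbers and/or symbols (underscores).
Strings starting or ending with a symbol.
Multiple symbols next to each other.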
Valid
test_0123
t0e1s2t3
0123_test
te0_s1t23
t_t
Invalid
t__t
____
01230123
_0123
_test
_test123
test_
test123_
The purpose of this is to filter usernames for a website I'm working on. I've arrived at the rules for specific reasons.
Usernames with only numbers and/or symbols could cause problems with routing and database lookups. The route for /users/#{id} allows id to be either the user's id or the user's name, so names and ids shouldn't be able to collide.
_test looks weird, and I don't believe it's a valid subdomain, i.e. _test.example.com.
I don't like the look of t__t as a subdomain, i.e. t__t.example.com.
Upvotes: 3
Views: 5329
Reputation: 526573
This matches exactly what you want:
/\A(?!_)(?:[a-z0-9]_?)*[a-z](?:_?[a-z0-9])*(?<!_)\z/i
It requires at least one letter (the [a-z] in the middle) and rejects a leading or trailing underscore (the (?!_) and (?<!_) at the beginning and end).
Edit: In fact, you probably don't even need the lookahead/lookbehind due to how the rest of the regex works: the first ?: parenthetical won't allow an underscore until after an alphanumeric, and the second ?: parenthetical won't allow an underscore unless it's before an alphanumeric:
/\A(?:[a-z0-9]_?)*[a-z](?:_?[a-z0-9])*\z/i
Should work fine.
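As a quick check, the simplified pattern can be run against the sample names from the question. This is only a sketch; the constant names are made up for illustration and are not part of the original answer.

```ruby
# Simplified pattern from the answer (no lookahead or lookbehind).
USERNAME_RE = /\A(?:[a-z0-9]_?)*[a-z](?:_?[a-z0-9])*\z/i

# Sample names copied from the question (hypothetical constant names).
VALID_SAMPLES   = %w[test_0123 t0e1s2t3 0123_test te0_s1t23 t_t].freeze
INVALID_SAMPLES = %w[t__t ____ 01230123 _0123 _test _test123 test_ test123_].freeze

# Print the verdict for every sample so mismatches are easy to spot.
(VALID_SAMPLES + INVALID_SAMPLES).each do |name|
  verdict = name.match?(USERNAME_RE) ? "accepted" : "rejected"
  puts "#{name}: #{verdict}"
end
```

Every name from the Valid list should be reported as accepted and every name from the Invalid list as rejected.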
Upvotes: 8
Reputation: 15488
The question asks for a single regexp, and implies that it should be a regexp that matches, which is fine, and answered by others. For interest, though, I note that these rules are rather easier to state directly as a regexp that should not match. I.e.:
x !~ /[^A-Za-z0-9_]|^_|_$|__|^[\d_]+$/
You can't use it this way in a Rails validates_format_of, but you could put it in a validate method for the class, and I think you'd have much better chance of still being able to make sense of what you meant, a month or a year from now.
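A minimal sketch of that idea in plain Ruby follows. The constant and method names are hypothetical, and the pattern is a variant of the one above rather than a copy: it uses \A and \z as anchors, and its last alternative is [\d_]* so that names made up only of digits and underscores (such as 1_2) and the empty string are rejected as well.

```ruby
# Matches any name that breaks one of the rules (hypothetical constant name).
INVALID_USERNAME = /
  [^A-Za-z0-9_]   # a character outside the allowed set
  | \A_           # leading underscore
  | _\z           # trailing underscore
  | __            # two underscores in a row
  | \A[\d_]*\z    # no letter at all (also catches the empty string)
/x

# A name is valid when none of the rejection alternatives match.
def valid_username?(name)
  name !~ INVALID_USERNAME
end

puts valid_username?("te0_s1t23")
puts valid_username?("t__t")
```

In a Rails model, a method like this could be called from a custom validation method that adds an error when it returns false.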
Upvotes: 1
Reputation: 75222
/^(?![\d_]+$)[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*$/
Your question is essentially the same as this one, with the added requirement that at least one of the characters has to be a letter. The negative lookahead, (?![\d_]+$), takes care of that part, and is much easier (both to read and write) than incorporating it into the basic regex as some others have tried to do.
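To make the role of each piece easier to see, the same pattern can be written in Ruby's extended (/x) mode with comments. This is a sketch with one deliberate change from the line above: it anchors with \A and \z instead of ^ and $, so the whole string must match rather than a single line. The constant name is made up for illustration.

```ruby
# Commented variant of the lookahead pattern (hypothetical constant name).
COMMENTED_USERNAME_RE = /
  \A
  (?![\d_]+\z)         # reject names made up only of digits and underscores
  [A-Za-z0-9]+         # first run of letters and digits
  (?:_[A-Za-z0-9]+)*   # further runs, each introduced by a single underscore
  \z
/x

%w[0123_test t_t 01230123 t__t test_].each do |name|
  puts "#{name}: #{name.match?(COMMENTED_USERNAME_RE)}"
end
```

Because every underscore must be followed by at least one letter or digit, leading, trailing, and doubled underscores are excluded by the main body, and the lookahead only has to handle the "at least one letter" rule.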
Upvotes: 0
Reputation: 9172
This doesn't block "__", but it does get the rest:
([A-Za-z]|[0-9][0-9_]*)([A-Za-z0-9]|_[A-Za-z0-9])*
And here's the longer form that gets all your rules:
([A-Za-z]|([0-9]+(_[0-9]+)*([A-Za-z]|_[A-Za-z])))([A-Za-z0-9]|_[A-Za-z0-9])*
Dang, that's ugly. I'll agree with Telemachus that you probably shouldn't do this with one regex, even though it's technically possible. Regex is often a pain for maintenance.
Upvotes: 1
Reputation: 10795
What about:
/^(?=[^_])([A-Za-z0-9]+_?)*[A-Za-z](_?[A-Za-z0-9]+)*$/
It doesn't use a back reference.
Edit: It succeeds for all your test cases and is Ruby compatible.
Upvotes: 2
Reputation: 60398
(?=.*[a-zA-Z].*)^[A-Za-z0-9](_?[A-Za-z0-9]+)*$
This one works.
Look ahead to make sure there's at least one letter in the string, then start consuming input. Every time there is an underscore, there must be a number or a letter before the next underscore.
Upvotes: 0
Reputation: 8963
Here you go:
^(([a-zA-Z]([^a-zA-Z0-9]?[a-zA-Z0-9])*)|([0-9]([^a-zA-Z0-9]?[a-zA-Z0-9])*[a-zA-Z]+([^a-zA-Z0-9]?[a-zA-Z0-9])*))$
If you want to restrict the symbols you accept, simply replace every [^a-zA-Z0-9] with a character class containing only the allowed symbols, e.g. [_].
Upvotes: 0
Reputation: 1632
[A-Za-z][A-Za-z0-9_]*[A-Za-z]
That would work for your first two rules (since it requires a letter at the beginning and end for the second rule, it automatically requires letters).
I'm not sure the third rule is possible using regexes.
Upvotes: -2
Reputation: 19705
I'm sure that you could put all this into one regular expression, but it won't be simple, and I'm not sure why you insist on it being one regex. Why not use multiple passes during validation? If the validation checks are done when users create a new account, there really isn't any reason to try to cram it into one regex. (That is, you will only be dealing with one item at a time, not hundreds or thousands or more. A few passes over a normal-sized username should take very little time, I would think.)
First reject if the name contains any character outside [A-Za-z0-9_]; then reject if the name doesn't contain at least one letter; then check that the start and end are correct; etc. Each of those passes could be a simple-to-read and easy-to-maintain regular expression.
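One way the multi-pass approach might look in Ruby is sketched below. The checks are chosen to follow the rules in the question, and the constant name, method name, and messages are all hypothetical.

```ruby
# Each entry pairs one simple rejection pattern with a message (names are hypothetical).
USERNAME_CHECKS = [
  [/[^A-Za-z0-9_]/,  "may contain only letters, digits and underscores"],
  [/\A[^A-Za-z]*\z/, "must contain at least one letter"],
  [/\A_|_\z/,        "must not start or end with an underscore"],
  [/__/,             "must not contain consecutive underscores"]
].freeze

# Run every pass and return the messages of the rules the name breaks.
def username_errors(name)
  USERNAME_CHECKS
    .select { |pattern, _message| name.match?(pattern) }
    .map { |_pattern, message| message }
end

puts username_errors("t_t").inspect
puts username_errors("_0123").inspect
```

An empty result means the name passed every check. Keeping one small pattern per rule also makes it possible to tell the user which rule was broken.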
Upvotes: 2