Reputation: 1576
I'm trying to build a complex regex with the following constraints:
1. My string can only be composed of:
"Regular" alphanumeric characters : a-zA-Z0-9
4 special characters : space . _ -
2. Length has to be between 3 and 25
So far it's quite easy but then it gets complicated :
3. There cannot be 2 consecutive special characters, unless the 1st one is a space and the 2nd one isn't a space. Logical consequence : there cannot be 3 consecutive special characters
4. The string cannot start or end with a space
I'm especially struggling with 3. Any help/hint would be much appreciated.
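For reference, the four rules above can also be spelled out as a plain check without any regex, which may help when comparing candidate patterns. This is only a minimal sketch: the names ALNUM, SPECIAL and is_valid are made up for illustration, and it reads rule 3 literally (a pair of special characters is allowed only when the first is a space and the second is not).

```python
import string

# Hypothetical helper names, chosen only for this illustration.
ALNUM = set(string.ascii_letters + string.digits)
SPECIAL = set(" ._-")


def is_valid(s: str) -> bool:
    # Rule 2: length between 3 and 25.
    if not 3 <= len(s) <= 25:
        return False
    # Rule 1: only alphanumerics and the four special characters.
    if any(ch not in ALNUM and ch not in SPECIAL for ch in s):
        return False
    # Rule 4: no space at the start or at the end.
    if s[0] == " " or s[-1] == " ":
        return False
    # Rule 3: two consecutive specials are allowed only as "space, non-space".
    for first, second in zip(s, s[1:]):
        if first in SPECIAL and second in SPECIAL:
            if not (first == " " and second != " "):
                return False
    return True
```

A function like this is slower to write than a regex but easy to read rule by rule, so it can serve as a cross-check for whatever pattern ends up being used.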
Examples:
" lkjsdi1SD" => FALSE (starts with a space)
"-lkjsdi1SD" => TRUE
"lkjsd -i1SD " => FALSE (ends with a space)
".Dg5 -lkjsdi1SD" => TRUE
"jhv5675gjjvghHJHvg655775vfFVHFJFf445576JHFFfhd12" => FALSE (too long)
"jhv  12" => FALSE (two consecutive spaces)
"as" => FALSE (too short)
"a r" => TRUE
Upvotes: 0
Views: 153
Reputation: 627044
I suggest using:
^ # Start of string
(?=.{3,25}$) # The total string length is from 3 to 25
[._-]? # An optional . _ or - (? means "match 1 or 0 times")
[a-zA-Z0-9]+ # one or more alphanumeric symbols
(?: # Zero or more sequences of:
(?:[._-]|[ ][._-]?) # one . _ or - OR a space followed with an optional . _ or -
[a-zA-Z0-9]+ # one or more alphanumerics
)* # (here * defines zero or more times)
[._-]? # one optional . _ or -
$ # End of string
See the inline description for each part. I used the /x VERBOSE (or free-spacing) modifier to enable comments, which helps keep long patterns readable.
See the regex demo
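To try the pattern outside the demo, here is a minimal Python sketch. It assumes Python's re module, where the re.VERBOSE flag plays the role of the /x modifier; the names NAME_RE and is_valid_name are made up for illustration.

```python
import re

# The same pattern, written with re.VERBOSE so the comments can stay inline.
NAME_RE = re.compile(
    r"""
    ^                        # start of string
    (?=.{3,25}$)             # total length is 3 to 25
    [._-]?                   # an optional . _ or -
    [a-zA-Z0-9]+             # one or more alphanumerics
    (?:                      # zero or more sequences of:
      (?:[._-]|[ ][._-]?)    #   . _ or -, OR a space plus an optional . _ or -
      [a-zA-Z0-9]+           #   one or more alphanumerics
    )*
    [._-]?                   # an optional trailing . _ or -
    $                        # end of string
    """,
    re.VERBOSE,
)


def is_valid_name(s: str) -> bool:
    # Hypothetical wrapper: True when the whole string matches the pattern.
    return NAME_RE.search(s) is not None


if __name__ == "__main__":
    for sample in ["-lkjsdi1SD", "jhv  12", "a r", "as"]:
        print(repr(sample), is_valid_name(sample))
```

The space is kept inside [ ] because free-spacing mode ignores bare whitespace in the pattern, but not whitespace inside a character class.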
More pattern details
^ - the start-of-string anchor: the regex engine only tries to match the whole pattern at the start of the string. Thus, if there is a space at the start, no match is returned, because [a-zA-Z0-9]+, the first obligatory subpattern, requires an alphanumeric, and [._-]? (a character class matching ., _ or -, where the ? quantifier matches one or zero occurrences of the quantified subpattern) only allows 1 of these 3 characters before the first alphanumeric.
(?=.{3,25}$) - a positive lookahead anchored at the start that requires at least 3 and at most 25 characters other than a newline (. matches any char other than an LF if the /s modifier is not set) from the start to the end. $ is the end-of-string anchor, which matches at the end of the string or before a final newline character; replace it with \z if you want to disallow a string with a newline at the end. {3,25} is a limiting quantifier that matches from min to max occurrences of the quantified subpattern. Note that a lookahead does not consume text: the regex engine returns to the position where it started matching the lookahead pattern with a true or false result and, if true, goes on matching the rest of the pattern.
[._-]? - an optional single char, one of those defined in the character class (see the explanation above).
[a-zA-Z0-9]+ - one or more characters (the + quantifier matches 1 or more occurrences) in the ranges defined in the character class.
(?:(?:[._-]|[ ][._-]?)[a-zA-Z0-9]+)* - a non-capturing group, used only to group subpatterns so that they match consecutively, that matches zero or more (because of the * after it) sequences of (?:[._-]|[ ][._-]?)[a-zA-Z0-9]+:
  (?:[._-]|[ ][._-]?) - either a ., _ or -, OR (due to the | alternation operator) a space followed with an optional ., _ or -. I put the space into a character class, [ ], because the /x VERBOSE modifier that I used for newline formatting and comments ignores bare whitespace in the pattern; you may use a regular space if you do not use the /x modifier.
  [a-zA-Z0-9]+ - 1 or more (due to +) alphanumerics.
Upvotes: 2
Reputation: 40681
Try using this:
^(?:[a-zA-Z0-9]|[._-](?![ ._-]))(?:[a-zA-Z0-9]|[ ](?![ ])|[._-](?![ ._-])){1,23}[a-zA-Z0-9._-]$
The part [._-](?![ ._-]) means "match [._-] only if it is not followed by [ ._-]".
In general, you can look into lookarounds.
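To see the negative lookahead on its own, here is a small Python sketch. It uses only that one fragment, not the full pattern, and the names LONE_SPECIAL and lone_specials are made up for illustration.

```python
import re

# One of . _ - that is NOT immediately followed by a space or another . _ -
LONE_SPECIAL = re.compile(r"[._-](?![ ._-])")


def lone_specials(text: str) -> list:
    # Hypothetical helper: collect every special character that passes the lookahead.
    return [m.group() for m in LONE_SPECIAL.finditer(text)]


if __name__ == "__main__":
    for sample in ["a-b", "a-.b", "a- b", "ab-"]:
        print(repr(sample), lone_specials(sample))
```

The lookahead only inspects the next character and never becomes part of the match, so each match is exactly one character long; at the end of the string there is no next character, so the check passes.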
Upvotes: 1