Reputation: 57
How can I write a regex for alphanumeric characters that allows one or two stars and restricts the total string length to 3?
Ex: each of the strings below has length 3:
*12 or *2* or *a* or *B* or **2
So the * symbol can occur at the end, in the middle, or at the start, as in *12. Similarly, in the last example, **2, there is more than one * symbol, and the stars can occur at any position in the string.
Upvotes: 0
Views: 495
Reputation:
You could always use a lookahead assertion in JavaScript. It's a little tricky, but it's better suited to fine-tuning any specific permutations.
/^(?=(?:[^*]*\*){1,2}[^*]*$)[a-zA-Z0-9*]{3}$/
Expanded:
^                        # beginning of line
(?=                      # start lookahead
  (?:                    # non-capture group
    [^*]*                # optional non-'*' characters
    \*                   # '*' character
  ){1,2}                 # end group, do 1 or 2 times
  [^*]*                  # optional non-'*' characters
  $                      # end of line
)                        # end lookahead
[a-zA-Z0-9*]{3}          # back at the beginning of the line. At this point there will
                         # be only 1 or 2 '*' characters in the line.
                         # Match exactly 3 alphanumeric characters or '*'
$                        # end of line
Substitute any requirements you need.
Below is a Perl test case; JavaScript is not my strong point.
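use strict;
use warnings;

# 3-character samples first, then 4-character and 2-character ones
my @samps = qw(
    *12 1*2 12* **1 *1* 1** ***
    a*12 a1*2 a12* **a1 *a1* a1** ****
    *2 *2 2* *1 1* **
);
for my $teststr (@samps) {
    # The lookahead limits the string to 1 or 2 stars; the body checks length and alphabet
    if ($teststr =~ /^(?=(?:[^*]*\*){1,2}[^*]*$)[a-zA-Z0-9*]{3}$/) {
        print "$teststr passed\n";
    }
    else {
        print "$teststr failed\n";
    }
}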
Output:
*12 passed
1*2 passed
12* passed
**1 passed
*1* passed
1** passed
*** failed
a*12 failed
a1*2 failed
a12* failed
**a1 failed
*a1* failed
a1** failed
**** failed
*2 failed
*2 failed
2* failed
*1 failed
1* failed
** failed
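Since the question is about JavaScript, a rough port of the Perl test above might look like the following. This is only a sketch: the names starPattern, samples, and passed are illustrative, and the sample list is copied from the Perl code with the duplicate *2 dropped.

```javascript
// Pattern copied from the answer: the lookahead limits the string to 1 or 2 stars,
// the body then requires exactly 3 characters from [a-zA-Z0-9*].
const starPattern = /^(?=(?:[^*]*\*){1,2}[^*]*$)[a-zA-Z0-9*]{3}$/;

// Same samples as the Perl test (3-, 4-, and 2-character strings).
const samples = [
  "*12", "1*2", "12*", "**1", "*1*", "1**", "***",
  "a*12", "a1*2", "a12*", "**a1", "*a1*", "a1**", "****",
  "*2", "2*", "*1", "1*", "**",
];

for (const s of samples) {
  console.log(`${s} ${starPattern.test(s) ? "passed" : "failed"}`);
}

// Collect the accepted strings for a quick summary.
const passed = samples.filter((s) => starPattern.test(s));
console.log(passed.join(" ")); // *12 1*2 12* **1 *1* 1**
```

The regex literal has no g flag, so calling test() repeatedly is safe here (no lastIndex state is carried between calls).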
Edit For @bozdoz
I didn't realize a string might be scraped for multiple instances of this. If so, the regex can be generalized to be used with or without delimiters.
The important thing is that this scales up very well if the requirements change to, for example, 8 total characters and only 2-4 asterisks.
Examples:
No delimiters other than begin/end of string:
/
^
(?= [a-z0-9*]{3} $ )
(?:[a-z0-9]*\*){1,2} [a-z0-9]*
$
/xi
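To illustrate how the length and star counts could be changed, here is a sketch of a small builder for the pattern above. The function name buildStarRegex and its parameters are hypothetical, not part of the original answer.

```javascript
// Hypothetical helper: builds a regex that accepts exactly `length` characters
// from [a-z0-9*] (case-insensitive) containing minStars..maxStars asterisks.
function buildStarRegex(length, minStars, maxStars) {
  // Lookahead: the whole string is `length` allowed characters.
  const lengthCheck = `(?=[a-z0-9*]{${length}}$)`;
  // Body: consume the string as "alphanumerics, star" repeated min..max times.
  const starCount = `(?:[a-z0-9]*\\*){${minStars},${maxStars}}[a-z0-9]*`;
  return new RegExp(`^${lengthCheck}${starCount}$`, "i");
}

const three = buildStarRegex(3, 1, 2); // the original requirement
const eight = buildStarRegex(8, 2, 4); // 8 characters, 2-4 asterisks

console.log(three.test("*12"));      // true
console.log(three.test("***"));      // false
console.log(eight.test("ab*cd*ef")); // true
console.log(eight.test("a*bcdefg")); // false
```

Building the pattern from a string means the backslash before the star has to be doubled, which is easy to miss.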
The delimiter is \s; the context is single-line and global. Data is captured in group 1.
/
(?:^|\s)
(?= [a-z0-9*]{3} (?:$|\s) )
( (?:[a-z0-9]*\*){1,2} [a-z0-9]* )
(?=$|\s)
/xig
The delimiter is [^a-z0-9*]; the context is single-line and global. Data is captured in group 1.
/
(?:^|[^a-z0-9*])
(?= [a-z0-9*]{3} (?:$|[^a-z0-9*]) )
( (?:[a-z0-9]*\*){1,2} [a-z0-9]* )
(?=$|[^a-z0-9*])
/xig
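JavaScript regex literals have no /x flag, so the whitespace has to be removed before these patterns can be used there. A minimal sketch of the \s-delimited version with String.prototype.matchAll follows; the helper name findTokens and the sample text are made up for illustration.

```javascript
// The \s-delimited pattern, written without the /x whitespace.
const tokenPattern =
  /(?:^|\s)(?=[a-z0-9*]{3}(?:$|\s))((?:[a-z0-9]*\*){1,2}[a-z0-9]*)(?=$|\s)/gi;

// Hypothetical helper: returns every group-1 capture found in the text.
function findTokens(text) {
  return Array.from(text.matchAll(tokenPattern), (m) => m[1]);
}

console.log(findTokens("*12 abc 1*2 a*12 **x ***")); // [ '*12', '1*2', '**x' ]
```

The trailing delimiter is a lookahead rather than a consumed character, which is what lets two tokens separated by a single space both match.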
Upvotes: 2
Reputation: 32893
What about this:
[a-zA-Z0-9*]{2}[^*]|[a-zA-Z0-9*][^*][a-zA-Z0-9*]|[^*][a-zA-Z0-9*]{2}
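A quick way to check a candidate like this is to run it against a few inputs. In the sketch below, the ^(?:...)$ anchors and the sample strings are additions for testing and are not part of the pattern as posted. As far as I can tell, [^*] matches any character other than a star, so the pattern guarantees at least one non-star character but not at least one star, and it also lets non-alphanumerics through.

```javascript
// The posted alternation, wrapped in anchors (an assumption) so that
// longer strings cannot match on a 3-character substring.
const candidate =
  /^(?:[a-zA-Z0-9*]{2}[^*]|[a-zA-Z0-9*][^*][a-zA-Z0-9*]|[^*][a-zA-Z0-9*]{2})$/;

const checks = ["*12", "*1*", "1**", "***", "abc", "*1-"];
const verdicts = checks.map((s) => `${s}: ${candidate.test(s)}`);
console.log(verdicts.join(", "));
// *12: true, *1*: true, 1**: true, ***: false, abc: true, *1-: true
```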
Upvotes: 0
Reputation: 3881
There are three cases:
\w[*\w]{2} # case 1, string begins with word character, last 2 can be stars
\*\w[*\w] # case 2, string begins with 1 star, last can be a star
\*{2}\w # case 3, string begins with 2 stars, last cannot be a star
Taken together, and adding the necessary start and end of string assertions, we get:
^(\w[*\w]{2}|\*\w[*\w]|\*{2}\w)$
But this solution is not quite correct because the \w character class allows not only alphanumerics but also the _ character. Therefore, we substitute a bracketed character class [a-zA-Z0-9] for \w and get:
^([a-zA-Z0-9][*a-zA-Z0-9]{2}|\*[a-zA-Z0-9][*a-zA-Z0-9]|\*{2}[a-zA-Z0-9])$
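A short sketch for trying the final pattern out. One thing it shows is that case 1 also accepts strings with no star at all, such as abc, since its last two characters may be stars but do not have to be. The second regex, withStarCheck, is not from the answer; it is one possible tweak that assumes at least one star is required and adds a lookahead for it.

```javascript
// The three-case pattern exactly as given.
const threeCases =
  /^([a-zA-Z0-9][*a-zA-Z0-9]{2}|\*[a-zA-Z0-9][*a-zA-Z0-9]|\*{2}[a-zA-Z0-9])$/;

// Assumed variant: the extra lookahead (?=[^*]*\*) demands at least one star.
const withStarCheck =
  /^(?=[^*]*\*)([a-zA-Z0-9][*a-zA-Z0-9]{2}|\*[a-zA-Z0-9][*a-zA-Z0-9]|\*{2}[a-zA-Z0-9])$/;

for (const s of ["1*2", "*12", "**2", "a**", "***", "a_*", "abc"]) {
  console.log(s, threeCases.test(s), withStarCheck.test(s));
}
// Only "abc" differs between the two: true for threeCases, false for withStarCheck.
```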
Upvotes: 0
Reputation: 12860
This regex works with a lookbehind. I have tested it with PHP in codepad here.
(?<![\w*])(\w(?!\w\w)|\*(?!\*\*)){3}(?![\w*])
It basically looks for a three character word that doesn't have three word characters or three star characters. (?<![\w*]) removes words that follow a word character or a * and (?![\w*]) removes words that precede them (therefore returning ONLY three character word-segments).
JavaScript doesn't exactly have lookbehinds (engines that predate ES2018 lack them), so I tried to adapt a technique used here. I then came up with the following regex, tested in jsfiddle here.
/(?![\w*])(.?)(\w(?!\w\w)|\*(?!\*\*)){3}(?![\w*])/g
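Engines that implement ES2018 lookbehind assertions should be able to run the first (PHP-tested) pattern directly. A small sketch under that assumption; the sample text and variable names are illustrative, and the capture group is made non-capturing because only the whole match is used.

```javascript
// Requires lookbehind support (ES2018). Same logic as the PHP pattern:
// three characters, each a word character not followed by two word characters
// or a star not followed by two stars, with no word/star neighbours.
const segment = /(?<![\w*])(?:\w(?!\w\w)|\*(?!\*\*)){3}(?![\w*])/g;

const text = "*12 abc a*12 1** *** x*y";
const found = text.match(segment) || [];
console.log(found); // [ '*12', '1**', 'x*y' ]
```

Note that \w also matches the underscore, so this version treats _ as an allowed non-star character.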
Hope this helps!!!!!!!! <- regexes drive me a little crazy
Upvotes: 1
Reputation: 150040
EDIT: for your updated question without the commas and optional spaces:
/^(\*[A-Z0-9]{2}|\*[A-Z0-9]\*|\*\*[A-Z0-9])$/i
Your examples don't include the alphanumeric character first, e.g., A**, but if you want that I'm sure you can figure it out from what I've already given you.
(see below for comment on mixed case)
My original answer:
/^(\*[A-Z0-9]{2}|\*[A-Z0-9]\*|\*\*[A-Z0-9])(, *(\*[A-Z0-9]{2}|\*[A-Z0-9]\*|\*\*[A-Z0-9]))*$/i
That is the JavaScript syntax with the "i" option to make it case-insensitive. I can't be bothered looking up the Java equivalent for case-insensitive matching, but if necessary you could always change each [A-Z0-9] part to [A-Za-z0-9].
Also you can use \w instead of [A-Za-z0-9] if you extend your definition of "alphanumeric" to include underscores.
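Because the comma-separated version repeats the single-item pattern, one option is to build both from a shared piece. The following is a sketch only; the names item, single, and list are illustrative, and the item pattern is the star-first alternation from this answer, so strings like A** are still rejected.

```javascript
// One item: star-first forms only (*XX, *X*, **X), as in the answer.
const item = String.raw`(?:\*[A-Z0-9]{2}|\*[A-Z0-9]\*|\*\*[A-Z0-9])`;

// A single item, and a list of items separated by a comma plus optional spaces.
const single = new RegExp(`^${item}$`, "i");
const list = new RegExp(`^${item}(?:, *${item})*$`, "i");

console.log(single.test("*a*"));        // true
console.log(single.test("A**"));        // false
console.log(list.test("*12, *a*,**B")); // true
console.log(list.test("*12,,*a*"));     // false
```

String.raw keeps the backslashes as written, so the pattern reads the same as it would inside a regex literal.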
Upvotes: 0