user1385257

Reputation: 29

JavaScript - Regex for letters, numbers, underscore comma-separated tags

I'm new to regex and JavaScript, and I was wondering if anyone knew what the regex would be for detecting whether an input field contains the following format:

At least one tag made of alphanumeric characters and underscores ( _ ), which can't contain spaces (e.g. "test" and "test_" but not "test test")

Each tag separated by a single comma (e.g. "word1,word2,word_3, _word_4" but not "word1,,word2,word_3, _word_4"), with any other symbols being invalid (like ;!"'@#%^&*()-+=.>)

An example of what I mean is this, these would be valid tags:

something1,something_2,something_something,something

And these would be invalid tags:

something1%,something%2^,!something%_&something,(*)something@+

It should accept a single tag as well as multiple tags.

Thanks.

Upvotes: 1

Views: 4192

Answers (2)

soimon

Reputation: 2580

Presuming you want to accept both uppercase and lowercase characters:

^[a-zA-Z0-9_]+(,[a-zA-Z0-9_]+)*$

The mentioned site has great information about regular expressions, I recommend reading through it. For now a short explanation:

^ means beginning of the string, so that no other (possibly invalid) characters can precede it. Between [ and ] is a character class: specifying what characters may follow. [ABC] for example means an A, a B or a C. You can also specify ranges like [A-E], which means an A, B, C, D or E.

In the regular expression above I specify the ranges a to z, A to Z (uppercase), 0 to 9 and the single character _. The + means that the character, group, or character class preceding it must appear one or more times.

The ( and ) group a part of the regular expression. In this case they group the , (for the comma-separated list you wanted) and a repetition of the tag expression before it. The * means (like the +) that the group preceding it may appear many times, with the difference that the * also allows it to appear zero times, which makes the group optional.

So, in short: this expression allows a tag consisting of one or more characters in the ranges a-z, A-Z, 0-9 or the character _, optionally followed by more tags that each begin with a comma, which gives you the requirements for a comma-separated list :)
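As a rough illustration, the expression can be tried out from JavaScript with `RegExp.prototype.test`. The names `TAG_LIST` and `isValidTagList` are made up for this sketch; only the pattern comes from the answer, and the sample strings are taken from the question:

```javascript
// Hypothetical names for illustration; the pattern is the one given above.
const TAG_LIST = /^[a-zA-Z0-9_]+(,[a-zA-Z0-9_]+)*$/;

// Returns true when the whole string is a comma-separated list of tags.
function isValidTagList(input) {
  return TAG_LIST.test(input);
}

console.log(isValidTagList("something1,something_2,something_something,something")); // true
console.log(isValidTagList("test"));                      // true  (a single tag)
console.log(isValidTagList("test test"));                 // false (contains a space)
console.log(isValidTagList("word1,,word2"));              // false (empty tag between commas)
console.log(isValidTagList("something1%,something%2^"));  // false (invalid symbols)
```

Note that a space after a comma (as in "word_3, _word_4") is rejected by this pattern as well, since a space is not in the character class.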

Upvotes: 2

Joey

Reputation: 354774

A single tag would be matched by

[a-zA-Z0-9_]+

This is a character class containing Latin letters in upper- and lowercase as well as digits and the underscore. This can usually be shortened to

\w+

if you know that \w in your regex engine won't match Unicode letters (this is the case for JavaScript). I'll continue with \w+ from here on.

You can match multiple tags by matching a single tag followed by zero or more repetitions of a comma plus a tag:
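A quick way to check the equivalence in JavaScript is to run both forms over a few strings. This is only a sketch; the variable names and sample strings are invented for the example:

```javascript
// Hypothetical names; both patterns are anchored so the whole string is checked.
const explicitClass = /^[a-zA-Z0-9_]+$/;
const shorthandClass = /^\w+$/;

const samples = ["test_1", "Test", "tést", "test-1"];
for (const s of samples) {
  // In JavaScript both columns agree, because \w covers only ASCII
  // letters, digits and the underscore.
  console.log(s, explicitClass.test(s), shorthandClass.test(s));
}
```

Here "test_1" and "Test" pass both patterns, while "tést" and "test-1" fail both.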

\w+(,\w+)*

If you want to validate a complete string, you should put anchors for start of string and end of string around the expression:

^\w+(,\w+)*$
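To see why the anchors matter, here is a small JavaScript sketch comparing the two forms (the variable names and the sample input are made up for illustration). Without anchors, `test` succeeds as soon as any part of the string matches:

```javascript
// Hypothetical names; the patterns are the unanchored and anchored forms above.
const unanchored = /\w+(,\w+)*/;
const anchored = /^\w+(,\w+)*$/;

const input = "good,bad!tag";

// The unanchored pattern finds "good,bad" inside the string and reports a match.
console.log(unanchored.test(input)); // true
// The anchored pattern has to cover the entire string, so the "!" makes it fail.
console.log(anchored.test(input));   // false
```

For validating an input field, the anchored form is the one you would most likely want.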

Upvotes: 1
