Math Student

Reputation: 567

Regular expression to match letters and digits, where some symbols can appear between them

I am trying to create a regular expression that matches letters and digits, where the symbols ., - and _ can appear between them.

  1. Examples of valid matches: "stephan", "mike03", "s.johnson", "st_steward", "john-johnson", "12345", "324_231351231".
  2. Examples of invalid users: "--123", ".....", "john_-", "_steve", ".info".

I came up with this expression:

[A-Za-z0-9.\-_]+

but it also matches strings like stefan_johnson_, which should not be matched because _ may only appear between letters and digits. The same applies when _ or any of the other symbols mentioned above appears at the beginning.

Upvotes: 0

Views: 653

Answers (1)

Wiktor Stribiżew

Reputation: 626896

You can use

^[A-Za-z0-9]+(?:[._-][A-Za-z0-9]+)*$

See the regex demo. Details:

  • ^ - start of string
  • [A-Za-z0-9]+ - one or more ASCII letters or digits
  • (?:[._-][A-Za-z0-9]+)* - zero or more occurrences of
    • [._-] - a ., _ or -
    • [A-Za-z0-9]+ - one or more ASCII letters or digits
  • $ - end of string.
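As a quick check, the pattern can be run against the examples from the question. This is only a minimal sketch in Python (the question does not name a regex flavor), and the names USERNAME_RE and is_valid are made up for illustration:

```python
import re

# Pattern from the answer: runs of ASCII letters/digits joined by a
# single ., _ or - separator, with no separator at either end.
USERNAME_RE = re.compile(r"^[A-Za-z0-9]+(?:[._-][A-Za-z0-9]+)*$")

# Sample strings taken from the question.
valid = ["stephan", "mike03", "s.johnson", "st_steward",
         "john-johnson", "12345", "324_231351231"]
invalid = ["--123", ".....", "john_-", "_steve", ".info", "stefan_johnson_"]


def is_valid(name: str) -> bool:
    """Hypothetical helper: True if the whole string fits the pattern."""
    return USERNAME_RE.search(name) is not None


for name in valid + invalid:
    print(f"{name!r}: {is_valid(name)}")
```

Because every separator must be followed by at least one letter or digit, strings with leading, trailing, or doubled separators are rejected.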

If you need to support any Unicode letters or digits, use

^[\p{L}\p{N}]+(?:[._-][\p{L}\p{N}]+)*$

where \p{L} matches any Unicode (base) letter and \p{N} matches any Unicode numeric character (use \p{Nd} instead to allow decimal digits only).
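Not every engine supports \p{...} classes; Python's built-in re module, for one, does not. A rough sketch of the same idea there, assuming that [^\W_] (any word character except the underscore) is an acceptable stand-in for [\p{L}\p{N}]; it is an approximation, not an exact equivalent, and the names below are hypothetical:

```python
import re

# For str patterns, \w covers Unicode alphanumerics plus "_", so
# [^\W_] means "alphanumeric, but not underscore". This approximates
# [\p{L}\p{N}]; it is not guaranteed to match the exact same set.
UNICODE_NAME_RE = re.compile(r"^[^\W_]+(?:[._-][^\W_]+)*$")


def is_valid_unicode(name: str) -> bool:
    """Hypothetical helper: True if the whole string fits the pattern."""
    return UNICODE_NAME_RE.search(name) is not None


for name in ["иван_петров", "名前-123", "s.johnson", "_иван", "петров-"]:
    print(f"{name!r}: {is_valid_unicode(name)}")
```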

In many regex flavors, $ also matches just before a final \n (line feed) char, so replace $ with \z (\Z in Python) if you do not want to allow a trailing line feed.
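The difference is easy to see with a string that ends in a line feed. A small sketch, again assuming Python, where the strict end-of-string anchor is spelled \Z; the names LOOSE and STRICT are illustrative only:

```python
import re

# Same pattern twice, differing only in the end anchor.
LOOSE = re.compile(r"^[A-Za-z0-9]+(?:[._-][A-Za-z0-9]+)*$")
STRICT = re.compile(r"^[A-Za-z0-9]+(?:[._-][A-Za-z0-9]+)*\Z")

candidate = "mike03\n"  # valid name followed by a trailing line feed

print(LOOSE.search(candidate) is not None)   # True: $ matches before the final \n
print(STRICT.search(candidate) is not None)  # False: \Z matches only at the very end
```

In Python, re.fullmatch with the unanchored pattern should give the same strict behavior, since it requires the match to cover the entire string.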

Also, see my "Validating strings with comma-separated values (with no leading/trailing separators)" YT video explaining this kind of validation technique.

Upvotes: 2
