Cameron Ball

Reputation: 4108

Can this regular expression be improved?

Here's what I want to match:

this_is.ok.com
this_is.another_valid.domain.com

And here are some strings I don't want to match:

this_one.is_not_ok.com
not_ok.com
also.not_ok

That is, underscores are allowed in any part except the last two (the ultimate and penultimate parts).

The regex I came up with:

^([a-zA-Z0-9-_]{0,63}?\.)*([a-zA-Z0-9-]{0,63}?\.){1}([a-zA-Z0-9-]{0,63}?){1}$

It does seem to work, but I feel like it could be better.

NB: Please no discussions about underscores in domain names. Just comment on the regex.

Upvotes: 0

Views: 51

Answers (2)

Bohemian

Reputation: 424983

Assuming "improved" means "shortened":

^(\w+\.)+\p{L}+\.\p{L}+$

See live demo.
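
For anyone trying this in Python: the standard `re` module does not support `\p{L}`, so the pattern above will not compile there as written. A rough sketch, assuming `[^\W\d_]` is an acceptable stand-in for "any letter" (it is a common approximation, not an exact equivalent), run against the strings from the question. The names `LETTER`, `SHORT_RE`, and `matches_short` are made up for this example.

```python
import re

# Assumption: [^\W\d_] approximates \p{L}, which Python's built-in `re` lacks.
LETTER = r"[^\W\d_]"

# Same shape as the pattern above: one or more \w parts, then two letter-only parts.
SHORT_RE = re.compile(rf"^(\w+\.)+{LETTER}+\.{LETTER}+$")

def matches_short(name: str) -> bool:
    """Return True if `name` matches the shortened pattern."""
    return SHORT_RE.match(name) is not None

for name in [
    "this_is.ok.com",
    "this_is.another_valid.domain.com",
    "this_one.is_not_ok.com",
    "not_ok.com",
    "also.not_ok",
]:
    print(name, matches_short(name))
```

Note that this pattern is stricter than the one in the question: it needs at least three parts, and the last two parts may contain letters only (no digits or hyphens).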

Upvotes: 0

anubhava

Reputation: 784998

You can use this refactored and smaller regex:

^([\w-]{1,63}?\.)*([a-zA-Z0-9-]{1,63}\.)([a-zA-Z0-9-]{2,63})$

RegEx Demo

Changes are:

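  • \w is shorthand for [a-zA-Z0-9_] (in engines where \w is ASCII-only)
  • An unescaped hyphen should sit at the first or last position in a character class, so it is read as a literal and not as a range operator
  • {1} is redundant and can be removed
  • {0,63} should be at least {1,63}, so that empty parts are rejected
  • The last part uses {2,63}, so it must be at least two characters long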

Note that this refactored regex takes 106 steps on regex101, compared with 124 steps for your regex.
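
A quick way to check the refactored pattern against the examples in the question is a small Python script. This is only a sketch: the names `DOMAIN_RE` and `is_valid` are made up here, and `re.ASCII` is an assumption added so that `\w` means exactly [a-zA-Z0-9_], as described in the list above.

```python
import re

# Pattern copied from the answer above, unchanged.
# re.ASCII (an assumption for this sketch) keeps \w limited to [a-zA-Z0-9_].
DOMAIN_RE = re.compile(
    r"^([\w-]{1,63}?\.)*([a-zA-Z0-9-]{1,63}\.)([a-zA-Z0-9-]{2,63})$",
    re.ASCII,
)

def is_valid(name: str) -> bool:
    """Return True if `name` matches the refactored pattern."""
    return DOMAIN_RE.match(name) is not None

should_match = ["this_is.ok.com", "this_is.another_valid.domain.com"]
should_not_match = ["this_one.is_not_ok.com", "not_ok.com", "also.not_ok"]

for name in should_match + should_not_match:
    print(f"{name}: {is_valid(name)}")
```

The first two strings should report True and the last three False. Because the leading group is wrapped in `*`, a plain two-part name such as ok.com is also accepted.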

Upvotes: 1
