Cayo Emilio

Reputation: 99

Regex to find valid URLs with or without www, including dots but excluding double dots

I am trying to find a regex that matches URLs with or without 'www', followed by valid strings that can include dots, but not two or more consecutive dots. For the sake of simplicity, I am limiting the problem to URLs with subdomains and with the .com domain. For example:

www.aBC.com      #MATCH
abc.com          #MATCH
a_bc.de8f.com    #MATCH
a.com            #MATCH
abc              #NO MATCH
abc..com         #NO MATCH

The closest I have got is \w+.[\w]+.com, but this does not match a simple "a.com". I am using "\w" instead of "." because otherwise I don't know how to avoid two or more consecutive dots.

Any help is appreciated.

Upvotes: 1

Views: 315

Answers (1)

Ryszard Czech

Reputation: 18621

Use

(?:\w+\.)*\w+\.com

See regex proof.

EXPLANATION

-------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \.                       '.'
--------------------------------------------------------------------------------
  com                      'com'
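
To check the pattern against the examples from the question, here is a minimal Python sketch. The helper name is_match and the use of re.fullmatch are assumptions of this sketch, not part of the answer above: fullmatch requires the whole string to match, which is one way to avoid matching only a trailing piece of a longer string (for instance the "x.com" inside "abc..x.com").

```python
import re

# Pattern taken from the answer: zero or more "label." groups,
# then one more label, then a literal ".com".
URL_PATTERN = re.compile(r"(?:\w+\.)*\w+\.com")


def is_match(candidate: str) -> bool:
    """Return True if the whole string matches the pattern.

    Hypothetical helper for illustration. fullmatch anchors the pattern
    at both ends, so a partial match inside a longer string is rejected.
    """
    return URL_PATTERN.fullmatch(candidate) is not None


# The six examples listed in the question.
samples = ["www.aBC.com", "abc.com", "a_bc.de8f.com", "a.com", "abc", "abc..com"]

for sample in samples:
    label = "MATCH" if is_match(sample) else "NO MATCH"
    print(f"{sample:<16} #{label}")
```

Because every dot in the pattern must be preceded by at least one word character, two consecutive dots can never be consumed, which is why "abc..com" is rejected. If the pattern is used to search inside longer text instead, word boundaries or lookarounds would probably be needed in place of fullmatch.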

Upvotes: 2
