tkolleh

Reputation: 423

Why does the regex not capture 'www.'?

I'm creating a simple (I thought it would be simple) regular expression to capture URL information in groups. Everything lines up except when I use a web address that has 'www.'

Expression:

((https?):\/\/(?:www\.)?([\w\.\-\:]+)\/(.+))

Test URLs:

http://11.111.111.1:1010/nexus-2.3.1/service/local/artifact/maven/content?r=fake_release&g=com.fake&a=com.rake.fake.soap.webapp&v=LATEST&e=war
https://hello-ci.fake-re.com/jenkins/view/RAS/job/RAS_Designtime_Master/site/com.rake.fake.ras.documentation/kwl/Assessment-faker-gage.html
https://regex101.com/#python
https://www.google.com
http://www.apple.com

Why do I not get a match on https://www.google.com or http://www.apple.com?

Note: This regular expression is for a Python application.

Upvotes: 3

Views: 46

Answers (1)

Wiktor Stribiżew

Reputation: 626738

Those URLs are not matched because of the obligatory / after the host: neither of them has a path, so there is no / for the pattern to consume. Make that part optional with a non-capturing group and a ? quantifier:

((https?):\/\/(?:www\.)?([\w\.\-\:]+)(?:\/(.+))?)
                                     ^^^      ^^

See regex demo
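Since the question mentions Python, here is a minimal sketch that compares the two patterns with the standard `re` module. The names `ORIGINAL` and `FIXED` are just labels chosen for this example, and using `re.match` (rather than `search` or `findall`) is an assumption about how the pattern is applied in your application.

```python
import re

# Pattern from the question: the "/" after the host is mandatory.
ORIGINAL = re.compile(r"((https?):\/\/(?:www\.)?([\w\.\-\:]+)\/(.+))")

# Pattern from this answer: "/path" sits inside an optional non-capturing group.
FIXED = re.compile(r"((https?):\/\/(?:www\.)?([\w\.\-\:]+)(?:\/(.+))?)")

urls = [
    "https://regex101.com/#python",
    "https://www.google.com",
    "http://www.apple.com",
]

for url in urls:
    old = ORIGINAL.match(url)
    new = FIXED.match(url)
    # Group 2 is the scheme, group 3 the host (without "www."),
    # group 4 the path, which is None when the URL has no path.
    print(url, "| original matched:", old is not None, "| fixed groups:", new.groups()[1:])
```

With the fixed pattern, `https://www.google.com` yields `google.com` as the host and `None` as the path group, so check for `None` before using group 4.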

Upvotes: 4
