Reputation: 121
I need a regex that will match the following URLs:
(http|https)://(www)<my domain here>
(http|https)://(www)<my domain here>/page1
(http|https)://(www)<my domain here>/page1/.../
(http|https)://(www)<my domain here>/page1/...?a=b
(http|https)://(www)<my domain here>/page1/...?a=b&c=d...
I have a regex, but I don't know how to modify it:
^(http:\/\/|https:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$
Upvotes: 1
Views: 5041
Reputation: 125
Try using this:
^(http|https):\/\/(www).([a-z\.]*)?(\/[a-z1-9\/]*)*\??([\&a-z1-9=]*)?
Here is the explanation I got from the site Regular Expressions 101:
/^(http|https):\/\/(www).([a-z\.]*)?(\/[a-z1-9\/]*)*\??([\&a-z1-9=]*)?/
^ assert position at start of the string
1st Capturing group (http|https)
  1st Alternative: http
    http matches the characters http literally (case sensitive)
  2nd Alternative: https
    https matches the characters https literally (case sensitive)
: matches the character : literally
\/ matches the character / literally
\/ matches the character / literally
2nd Capturing group (www)
  www matches the characters www literally (case sensitive)
. matches any character (except newline)
3rd Capturing group ([a-z\.]*)?
  Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
  [a-z\.]* match a single character present in the list below
    Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
    a-z a single character in the range between a and z (case sensitive)
    \. matches the character . literally
4th Capturing group (\/[a-z1-9\/]*)*
  Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
  \/ matches the character / literally
  [a-z1-9\/]* match a single character present in the list below
    Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
    a-z a single character in the range between a and z (case sensitive)
    1-9 a single character in the range between 1 and 9
    \/ matches the character / literally
\?? matches the character ? literally
  Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
5th Capturing group ([\&a-z1-9=]*)?
  Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
  [\&a-z1-9=]* match a single character present in the list below
    Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
    \& matches the character & literally
    a-z a single character in the range between a and z (case sensitive)
    1-9 a single character in the range between 1 and 9
    = matches the character = literally
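To see how this pattern behaves in practice, here is a small Python sketch that runs it against a few sample URLs. The helper name `matched_text` and the `example.com` URLs are only illustrative. Three things in the breakdown above are worth checking: the range `1-9` leaves out the digit `0`, the `.` after `www` matches any character, and there is no `$` at the end, so the pattern can match just a prefix of the input.

```python
import re

# Pattern copied from the answer above (note: no trailing "$" anchor).
PATTERN = re.compile(
    r"^(http|https):\/\/(www).([a-z\.]*)?(\/[a-z1-9\/]*)*\??([\&a-z1-9=]*)?"
)


def matched_text(url):
    """Return the part of url that the pattern matched, or None."""
    m = PATTERN.match(url)
    return m.group(0) if m else None


# Full match: scheme, www, domain, path and query string.
print(matched_text("http://www.example.com/page1?a=b&c=d"))

# The character class uses 1-9, so matching stops before the "0".
print(matched_text("http://www.example.com/page10"))

# The unescaped "." after "www" accepts any character, here "X".
print(matched_text("http://wwwXexample.com"))

# Other schemes are rejected.
print(matched_text("ftp://www.example.com"))
```

If the whole URL has to be validated, using `0-9`, escaping the dot as `\.`, and adding `$` at the end would likely be needed.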
Upvotes: 0
Reputation: 13640
You can use the following:
^(http:\/\/|https:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?(\/[a-z0-9]+)*\/?(\?[a-z0-9]+=[a-z0-9]*(&[a-z0-9]+=[a-z0-9]*)*)?$
For a specific domain name:
^(http:\/\/|https:\/\/)?(www\.)?example\.com(\/[a-z0-9]+)*\/?(\?[a-z0-9]+=[a-z0-9]*(&[a-z0-9]+=[a-z0-9]*)*)?$
Edit: To validate the domain and capture the whole URL:
(http:\/\/|https:\/\/)?(www\.)?example\.com\S*
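As a rough illustration of the domain-specific approach, here is a Python sketch that builds an anchored pattern for one domain and checks it against the URL shapes from the question. The function name `build_url_pattern` is hypothetical, and the allowed characters in paths and query strings are my assumption, not a full URL grammar. It also assumes the scheme is required and `www.` is optional.

```python
import re


def build_url_pattern(domain):
    """Compile a pattern for http(s) URLs on one domain (illustrative only)."""
    host = re.escape(domain)  # escapes the "." so it only matches a literal dot
    return re.compile(
        r"https?://"        # scheme: http or https (assumed required)
        r"(?:www\.)?"       # optional "www." with an escaped dot
        + host +
        r"(?:/[A-Za-z0-9._~-]*)*"                    # zero or more path segments
        r"(?:\?[A-Za-z0-9._~-]+=[A-Za-z0-9._~%-]*"   # first key=value pair
        r"(?:&[A-Za-z0-9._~-]+=[A-Za-z0-9._~%-]*)*)?"  # further &key=value pairs
    )


def is_site_url(url, domain="example.com"):
    # fullmatch requires the whole string to match, like ^...$ in the answer.
    return build_url_pattern(domain).fullmatch(url) is not None


for url in [
    "http://www.example.com",
    "https://www.example.com/page1",
    "https://www.example.com/page1/sub/",
    "http://www.example.com/page1/sub?a=b",
    "http://www.example.com/page1/sub?a=b&c=d",
    "http://www.example.com.evil.org/",
]:
    print(url, is_site_url(url))
```

Using `fullmatch` instead of a prefix match is what rejects look-alike hosts such as `www.example.com.evil.org`. For anything beyond simple validation, parsing the URL with `urllib.parse.urlsplit` and comparing the host name is probably more robust than a regex.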
Upvotes: 2