Sevak.Avet

Reputation: 121

Java regex for URL

I need a regex that will match the following URLs:

(http|https)://(www)<my domain here>
(http|https)://(www)<my domain here>/page1
(http|https)://(www)<my domain here>/page1/.../
(http|https)://(www)<my domain here>/page1/...?a=b
(http|https)://(www)<my domain here>/page1/...?a=b&c=d...

I have a regex, but I don't know how to extend it to cover these cases:

^(http:\/\/|https:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$

Upvotes: 1

Views: 5041

Answers (3)

Mateus Milanez

Reputation: 125

Try using this:

^(http|https):\/\/(www).([a-z\.]*)?(\/[a-z1-9\/]*)*\??([\&a-z1-9=]*)?

Here is the explanation from the Regular Expressions 101 site:

/^(http|https):\/\/(www).([a-z\.]*)?(\/[a-z1-9\/]*)*\??([\&a-z1-9=]*)?/
    ^ assert position at start of the string
    1st Capturing group (http|https)
        1st Alternative: http
            http matches the characters http literally (case sensitive)
        2nd Alternative: https
            https matches the characters https literally (case sensitive)
    : matches the character : literally
    \/ matches the character / literally
    \/ matches the character / literally
    2nd Capturing group (www)
        www matches the characters www literally (case sensitive)
    . matches any character (except newline)
    3rd Capturing group ([a-z\.]*)?
        Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
        [a-z\.]* match a single character present in the list below
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        a-z a single character in the range between a and z (case sensitive)
        \. matches the character . literally
    4th Capturing group (\/[a-z1-9\/]*)*
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        \/ matches the character / literally
        [a-z1-9\/]* match a single character present in the list below
            Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
            a-z a single character in the range between a and z (case sensitive)
            1-9 a single character in the range between 1 and 9
            \/ matches the character / literally
    \?? matches the character ? literally
        Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
    5th Capturing group ([\&a-z1-9=]*)?
        Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
        [\&a-z1-9=]* match a single character present in the list below
            Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
            \& matches the character & literally
            a-z a single character in the range between a and z (case sensitive)
            1-9 a single character in the range between 1 and 9
            = the literal character =
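
To see what the capturing groups described above hold in practice, here is a minimal Java sketch. The class name UrlGroupsDemo, the helper parts and the example.com URLs are made up for illustration; only the regex comes from this answer. Since the pattern has no $ at the end, the sketch uses find() rather than matches().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex itself comes from the answer.
class UrlGroupsDemo {
    // The answer's pattern as a Java string literal: every backslash is doubled.
    static final Pattern URL = Pattern.compile(
        "^(http|https):\\/\\/(www).([a-z\\.]*)?(\\/[a-z1-9\\/]*)*\\??([\\&a-z1-9=]*)?");

    // Returns {group 3, group 4, group 5} (host, path, query),
    // or null if the input does not start with a matching URL.
    static String[] parts(String url) {
        Matcher m = URL.matcher(url);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(3), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String[] p = parts("http://www.example.com/page1/sub?a=b&c=d");
        System.out.println(p[0]); // example.com
        System.out.println(p[1]); // /page1/sub
        System.out.println(p[2]); // a=b&c=d
    }
}
```

One thing the sketch makes visible: the class [a-z1-9] contains neither 0 nor uppercase letters, so for a URL ending in /page10 group 4 only captures /page1. Widening the class to [a-zA-Z0-9] may be worth considering.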

Upvotes: 0

karthik manchala

Reputation: 13640

You can use the following:

^(http:\/\/|https:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?(\/[a-z0-9])*(\/?|(\?[a-z0-9]=[a-z0-9](&[a-z0-9]=[a-z0-9]*)?))$

For a specific domain name:

^(http:\/\/|https:\/\/)?(www.)?example\.com(\/?|(\?[a-z0-9]=[a-z0-9](&[a-z0-9]=[a-z0-9]*)?))$

Edit: Validate domain and get Url:

(http:\/\/|https:\/\/)?(www.)?example\.com\S*
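
As a rough illustration of how the last (unanchored) pattern could be used from Java to pull URLs out of a longer text, here is a small sketch. The class name ExampleComUrlFinder, the method findUrls and the sample text are assumptions for the example; example.com is the placeholder domain from the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex comes from the answer.
class ExampleComUrlFinder {
    // The unanchored pattern as a Java string literal (backslashes doubled).
    static final Pattern URL = Pattern.compile(
        "(http:\\/\\/|https:\\/\\/)?(www.)?example\\.com\\S*");

    // Collects every substring of the text that the pattern matches.
    static List<String> findUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "see https://www.example.com/page1?a=b and http://other.org/x";
        // Prints [https://www.example.com/page1?a=b]
        System.out.println(findUrls(text));
    }
}
```

Because both the scheme and the www. part are optional and nothing anchors the start, this will also pick up example.com when it appears inside a longer host name, so it is better suited to extracting candidates than to strict validation.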

Upvotes: 2

jacks

Reputation: 4753

Try this:

^https?://[^/@]*\.domain\.com(/.*)?$
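
A minimal Java sketch of this pattern, assuming you want to plug in your own domain. The class DomainUrlMatcher and its methods are hypothetical names; Pattern.quote is used so that the dots in the domain are treated literally.

```java
import java.util.regex.Pattern;

// Hypothetical helper class built around the pattern above.
class DomainUrlMatcher {
    // Builds the pattern for a given domain; Pattern.quote escapes the domain's dots.
    static Pattern forDomain(String domain) {
        return Pattern.compile(
            "^https?://[^/@]*\\." + Pattern.quote(domain) + "(/.*)?$");
    }

    // True if the whole URL matches the pattern for the given domain.
    static boolean matches(String domain, String url) {
        return forDomain(domain).matcher(url).matches();
    }

    public static void main(String[] args) {
        // Path and query string on the expected host.
        System.out.println(matches("domain.com", "https://www.domain.com/page1/x?a=b&c=d")); // true
        // The domain appears only as a prefix of another host.
        System.out.println(matches("domain.com", "http://www.domain.com.evil.org/"));         // false
        // The domain appears only in the path.
        System.out.println(matches("domain.com", "http://evil.org/www.domain.com"));          // false
    }
}
```

As written, the pattern requires a dot right before the domain, so a bare http://domain.com without www or another subdomain does not match. If that case matters, making the prefix optional, for example ([^/@]*\.)?, is one possible adjustment.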

Upvotes: 0
