Reputation: 31546
I am writing a regular expression that should match URLs of this form:
http(s)://a.b.c.domain.company.com(:8000)
The protocol can be http or https, and the port is optional.
I have written this:
$reg = "^(http|https)(\://)([a-zA-Z0-9\-\.]){6,}(\:[0-9]*)?\/?"
$url1 = "http://uat.upm.goal.services.ps.com"
$url2 = "http://uat.upm.goal.services.ps.com:9000/"
$url3 = "http://uat.upm.goal.services.ps.com:9000?name=foo"
$flag1 = $url1 -Match $reg
$flag2 = $url2 -Match $reg
$flag3 = $url3 -Match $reg
echo $flag1
echo $flag2
echo $flag3
I want $url1 and $url2 to match the regex, but $url3 should fail the match (because it contains a query string). The URL should end at either .com OR .com:8000 OR .com:8000/
I don't want anything after the (optional) port and /.
Upvotes: 0
Views: 64
Reputation: 915
Try "^(http|https)(\://)([a-zA-Z0-9\-\.]){6,}(\:[0-9]*)?\/?"
For URLs without a query part, use this:
"^(http|https)(\://)([a-zA-Z0-9\-\.]){6,}(\:[0-9]*)?\/?$"
$ means the end of the line / string.
I removed the ^ at the end, since it is a special character meaning the beginning of the line.
I changed {6} to {6,}, which means that there have to be at least 6 characters from the group.
I tested this in awk and it matches:
awk='/^(http|https)(\:\/\/)([a-zA-Z0-9\-\.]){6,}(\:[0-9]*)?\/?$/'
echo "http://u.ucm.project.services.ps.com" | awk "$awk {print\$0}"
echo "https://z.ucm.project.services.ps.com:22400/" | awk "$awk {print\$0}"
echo "http://uat.upm.goal.services.ps.com:9000?name=foo" | awk "$awk {print\$0}"
As you wanted, only the first two match.
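To make the effect of the trailing $ easier to see, here is a minimal sketch that runs the pattern with and without the anchor. It uses grep -E rather than awk or PowerShell, which is an assumption on my part, and the helper name check and the slightly simplified pattern (no redundant escapes) are only for illustration.

```shell
#!/bin/sh
# Sketch: the same pattern with and without the trailing $ anchor.
# grep -E is an assumption here; the answer above tested with awk.
unanchored='^(http|https)://[a-zA-Z0-9.-]{6,}(:[0-9]*)?/?'
anchored='^(http|https)://[a-zA-Z0-9.-]{6,}(:[0-9]*)?/?$'

check() {
    # $1 = pattern, $2 = URL; prints "match" or "no match"
    if printf '%s\n' "$2" | grep -Eq "$1"; then
        echo "match"
    else
        echo "no match"
    fi
}

# Without $, the pattern only has to match a prefix of the URL.
check "$unanchored" 'http://uat.upm.goal.services.ps.com:9000?name=foo'
# With $, nothing may follow the optional port and slash.
check "$anchored" 'http://uat.upm.goal.services.ps.com:9000?name=foo'
check "$anchored" 'http://uat.upm.goal.services.ps.com:9000/'
check "$anchored" 'http://uat.upm.goal.services.ps.com'
```

The unanchored pattern accepts the URL with the query string because a regex match only needs to cover part of the input; adding $ forces the match to run to the end of the string.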
Upvotes: 1
Reputation: 16553
You are missing + after the letter groups. So ([a-zA-Z0-9\-\.]){6} should probably be ([a-zA-Z0-9\-\.]+){6}, so that at least one character and possibly more are there.
Also, the {6} doesn't do what you expect (match a domain with 6 dots) because of the way you wrote it. Either remove it and allow any number of dot-separated domain parts, or change it to something like:
([a-zA-Z0-9\-]+\.){6}
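A rough sketch of the difference, using grep -E (an assumption; the question itself is PowerShell). The helper name matches is hypothetical. Note that the labels pattern below is my own variation, not the exact pattern above: it uses {5} dot-terminated labels plus one final label, assuming a host with exactly six labels such as the one in the question.

```shell
#!/bin/sh
# {6} applied to a single-character class counts characters, not labels.
chars='^[a-zA-Z0-9.-]{6}$'
# {5} applied to a "label plus dot" group, then one final label, counts labels.
# Assumption: the host has six labels, hence five dots.
labels='^([a-zA-Z0-9-]+\.){5}[a-zA-Z0-9-]+$'

matches() {
    # $1 = pattern, $2 = text; prints "yes" or "no"
    if printf '%s\n' "$2" | grep -Eq "$1"; then
        echo yes
    else
        echo no
    fi
}

matches "$chars" 'ps.com'                         # exactly six characters
matches "$chars" 'uat.upm.goal.services.ps.com'   # more than six characters
matches "$labels" 'uat.upm.goal.services.ps.com'  # six labels
matches "$labels" 'ps.com'                        # only two labels
```

This prints yes, no, yes, no: the first pattern is satisfied by any six allowed characters, while the second only accepts a host with the expected number of dot-separated parts.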
Upvotes: 1