Reputation: 223
I have spent over 4 hours trying to find a regex pattern for my PHP code, without luck.
I have a string containing HTML code. It includes URLs in many formats, such as:
example.com
http://example.com
http://www.example.com
http://example.com/some.php
http://example.com/some.php?var1=1
http://example.com/some.php?var1=1&var2=2
etc.
The following PHP code works in part:
preg_match_all('/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i', $content, $result, PREG_PATTERN_ORDER);
The only thing missing is that I ALSO need to capture URLs with multiple query string parameters joined by "&". I do get them, but not in full. I only receive things like:
http://example.com/asdad.php?var1=1&
The rest is lost.
Can someone help me add the missing part to the pattern?
Thanks so much in advance.
Upvotes: 7
Views: 10438
Reputation: 223
Well, I finally got it.
The final regex is:
$regex = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";
It works.
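The visible difference from the pattern in the question is that ";" is now allowed inside the URL. A likely explanation, assuming the ampersands in the HTML source are entity-encoded as "&amp;", is that the old pattern stopped at the ";" and returned "...var1=1&amp", which a browser displays as "...var1=1&". The sketch below compares the two patterns on a hypothetical sample string; `$content`, `$old`, and `$new` are illustrative names only.

```php
<?php
// Hypothetical sample input: ampersands are entity-encoded, as is usual in HTML source.
$content = '<a href="http://example.com/some.php?var1=1&amp;var2=2">link</a>';

// Pattern from the question: ";" is not in the character class.
$old = '/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i';

// Pattern from this answer: ";" is allowed inside the URL.
$new = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';

preg_match_all($old, $content, $oldResult, PREG_PATTERN_ORDER);
preg_match_all($new, $content, $newResult, PREG_PATTERN_ORDER);

echo $oldResult[0][0], "\n"; // http://example.com/some.php?var1=1&amp
echo $newResult[0][0], "\n"; // http://example.com/some.php?var1=1&amp;var2=2

// Decode the entities to get the URL as a browser would request it.
$decoded = html_entity_decode($newResult[0][0]);
echo $decoded, "\n";         // http://example.com/some.php?var1=1&var2=2
```

If the input is not entity-encoded, both patterns should already capture the full query string, so the extra ";" only matters for HTML source.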
Upvotes: 11
Reputation: 4397
Try this pattern, which is meant to cover any type of URL:
$regex = "((https?|ftp)\:\/\/)?"; // Checking scheme
$regex .= "([a-z0-9-.]*)\.([a-z]{2,3})"; // Checking host name (the last label must be 2-3 letters, so numeric IPs will not match)
$regex .= "(\:[0-9]{2,5})?"; // Check if it has a port number
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?"; // The real path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"; // Check the query string params
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?"; // Check anchors, if they are used
You can leave out any section you do not need. As you can see, I am simply concatenating them.
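The concatenated string has no delimiters or flags yet, so it cannot be passed to preg_match() as is. A minimal sketch of one way to use it, assuming the goal is to validate a whole string case-insensitively (the "/^...$/i" wrapper and the names `$pattern` and `$candidates` are my assumptions, not part of the pattern above):

```php
<?php
$regex  = "((https?|ftp)\:\/\/)?";                              // scheme
$regex .= "([a-z0-9-.]*)\.([a-z]{2,3})";                        // host name
$regex .= "(\:[0-9]{2,5})?";                                    // port
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";                        // path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?";           // query string
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?";                        // anchor

// Assumption: anchor at both ends and ignore case to validate a full string.
$pattern = '/^' . $regex . '$/i';

// Hypothetical test inputs, taken from the formats listed in the question.
$candidates = [
    'example.com',
    'http://example.com/some.php?var1=1&var2=2',
    'not a url',
];

foreach ($candidates as $candidate) {
    $ok = preg_match($pattern, $candidate) === 1;
    echo $candidate, ' => ', $ok ? 'match' : 'no match', "\n";
}
```

With these inputs the first two strings match and the third does not. Because the last host label is limited to 2-3 letters, longer top-level domains such as ".info" would be rejected unless that quantifier is widened.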
Upvotes: 0