Reputation: 9308
Any ideas how I can use a single regular expression both to validate a single URL and to match URLs in a block of text?
var x = "http://myurl.com";
var t = "http://myurl.com ref";
var y = "some text that contains a url http://myurl.com some where";
var expression = "\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[A-Z0-9+&@#/%=~_|]";
Regex.IsMatch(x, expression, RegexOptions.IgnoreCase); // returns true;
Regex.IsMatch(t, expression, RegexOptions.IgnoreCase); // returns false;
Regex.Matches(y, expression, RegexOptions.IgnoreCase); // returns http://myurl.com;
Upvotes: 0
Views: 1030
Reputation: 15577
I think the word boundary is what's getting you: \b only matches where a word character meets a non-word character (or the edge of the string).
Try this:
var expression = @"(^|\s)(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[A-Z0-9+&@#/%=~_|]($|\s)";
This binds the start of the match to the beginning of the string or a whitespace character, and the end of the match to the end of the string or a whitespace character.
More info: http://www.regular-expressions.info/wordboundaries.html
There are three different positions that qualify as word boundaries:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
Simply put: \b allows you to perform a "whole words only" search using a regular expression in the form of \bword\b. A "word character" is a character that can be used to form words. All characters that are not "word characters" are "non-word characters".
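To make the three positions concrete, here is a small sketch. The class and method names (WordBoundaryDemo, HasHttpAtBoundary) are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: true when "http" occurs in the input
    // starting at a word boundary.
    public static bool HasHttpAtBoundary(string input)
    {
        // Verbatim string, so \b reaches the regex engine unchanged.
        return Regex.IsMatch(input, @"\bhttp");
    }

    public static void Main()
    {
        // Start of the string followed by a word character: boundary.
        Console.WriteLine(HasHttpAtBoundary("http://myurl.com"));      // True
        // Space (non-word) followed by 'h' (word): boundary.
        Console.WriteLine(HasHttpAtBoundary("see http://myurl.com"));  // True
        // 'x' and 'h' are both word characters: no boundary between them.
        Console.WriteLine(HasHttpAtBoundary("xhttp://myurl.com"));     // False
    }
}
```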
Upvotes: 0
Reputation: 2042
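First of all, you have to escape the backslash correctly. In a regular C# string literal, "\b" is a backspace character, so use "\\b..." (or a verbatim string, @"\b...") instead of "\b...". Also, IsMatch returns true for partial matches. You can check whether the whole input matches like this: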
Match match = Regex.Match(x, expression, RegexOptions.IgnoreCase);
if (match.Success && match.Length == x.Length)
{
    // full match
}
With this check and the escape fix, your expression will work as it is. You can also write a helper method for it:
private bool FullMatch(string input, string pattern, RegexOptions options)
{
    Match match = Regex.Match(input, pattern, options);
    return match.Success && match.Length == input.Length;
}
Your code will change to this:
var x = "http://myurl.com";
var t = "http://myurl.com ref";
var y = "some text that contains a url http://myurl.com some where";
var expression = "\\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[A-Z0-9+&@#/%=~_|]";
FullMatch(x, expression, RegexOptions.IgnoreCase); // returns true;
FullMatch(t, expression, RegexOptions.IgnoreCase); // returns false;
Regex.Matches(y, expression, RegexOptions.IgnoreCase); // returns http://myurl.com;
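As an alternative to comparing lengths, the same core pattern can be wrapped in anchors for validation and used unwrapped for extraction. This is only a sketch under that assumption: UrlPatterns, Core, IsUrl and FindUrls are hypothetical names, and \A / \z are used instead of ^ / $ so that a trailing newline is not accepted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlPatterns
{
    // Core pattern from the question, as a verbatim string so that \b
    // is a regex word boundary rather than a backspace character.
    public const string Core =
        @"\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[A-Z0-9+&@#/%=~_|]";

    // Validation: the core pattern must span the whole input.
    public static bool IsUrl(string input)
    {
        return Regex.IsMatch(input, @"\A(?:" + Core + @")\z", RegexOptions.IgnoreCase);
    }

    // Extraction: every occurrence of the core pattern in a block of text.
    public static List<string> FindUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match m in Regex.Matches(text, Core, RegexOptions.IgnoreCase))
        {
            urls.Add(m.Value);
        }
        return urls;
    }

    public static void Main()
    {
        Console.WriteLine(IsUrl("http://myurl.com"));       // True
        Console.WriteLine(IsUrl("http://myurl.com ref"));   // False
        Console.WriteLine(string.Join(", ",
            FindUrls("some text that contains a url http://myurl.com some where")));
        // http://myurl.com
    }
}
```

Keeping the anchors out of the core pattern is what lets one expression serve both purposes: the validation wrapper adds them, the extraction call does not.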
Upvotes: 1