Detecting URLs with RegEx - how to prevent double-linking?

Question

I have the following code to detect URLs and wrap them in links:

        // need to find anchors
        Regex urlRx = new Regex(@"((https?|ftp|file)\://|www.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\*\-\._~%]*)*", RegexOptions.IgnoreCase);
        MatchCollection matches = urlRx.Matches(source);
        foreach (Match match in matches)
        {
            source = source.Replace(match.Value, "<a href=\"" + match.Value + "\">" + match.Value + "</a>");
        }

However, when source already contains an anchor, this doesn't work: the URL inside the existing anchor is matched too, so it gets wrapped in a second anchor. How can I prevent this from happening?

Sample i/o:

http://www.google.com   ->    <a href="http://www.google.com">http://www.google.com</a>
Pre-existing anchors (<a href="...">...</a>) -> unchanged

I think refusing to match any URL that is immediately preceded by a non-whitespace character (such as a quote ") would be valid, but I don't know how to express that in the regex.
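One way to express that rule, as a minimal sketch rather than a definitive fix: a negative lookbehind `(?<!\S)` only lets a match start at the beginning of the input or right after whitespace. The class name `Linkifier` and method name `Linkify` are hypothetical, chosen for illustration. The sketch also escapes the dot in `www\.` and uses a single `Regex.Replace` pass in place of the `Matches` loop; both are assumptions about what is wanted, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are for illustration only.
public static class Linkifier
{
    // (?<!\S) is a negative lookbehind: the character before the URL must
    // not be non-whitespace, so the URL has to start the input or follow
    // whitespace. URLs preceded by a quote or '>' (as inside an existing
    // anchor) are therefore skipped.
    private static readonly Regex UrlRx = new Regex(
        @"(?<!\S)((https?|ftp|file)://|www\.)[A-Za-z0-9.\-]+(/[A-Za-z0-9?#&=;+!'*\-._~%]*)*",
        RegexOptions.IgnoreCase);

    public static string Linkify(string source)
    {
        // One pass over the input: each match is replaced exactly where it
        // was found, so text inserted by the replacement is never rescanned.
        return UrlRx.Replace(
            source,
            m => "<a href=\"" + m.Value + "\">" + m.Value + "</a>");
    }
}
```

With this, `Linkifier.Linkify("http://www.google.com")` wraps the URL, while a string that is already `<a href="http://www.google.com">http://www.google.com</a>` comes back unchanged. The rule is deliberately crude: a bare URL directly after `(` is not linked, and anchor text that begins with whitespace before the URL would still be matched, so an HTML parser is the more robust option if the input can be arbitrary markup.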
