SB2055

Reputation: 12892

Detecting URLs with RegEx - how to prevent double-linking?

I have the following to detect and replace links:

        // need to find anchors
        Regex urlRx = new Regex(@"((https?|ftp|file)\://|www.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*", RegexOptions.IgnoreCase);
        MatchCollection matches = urlRx.Matches(source);
        foreach (Match match in matches)
        {
            source = source.Replace(match.Value, "<a  target=\"_blank\" href='" + match.Value + "'>" + match.Value + "</a>");
        }

However, when source already contains an anchor, this doesn't quite work: the URL inside the existing anchor is wrapped in a second anchor. How can I prevent this from happening?

Sample i/o:

http://www.google.com   ->   <a target="_blank" href="http://www.google.com">http://www.google.com</a>
Pre-existing anchors (<a></a>) -> unchanged

I think it would be enough to skip any URL that is preceded by a non-whitespace character (or a quote "), but I don't know how to express that in the regex.

Upvotes: 0

Views: 109

Answers (1)

Ozesh

Reputation: 6974

All you need to do is check whether the URL is already part of a pre-existing anchor:

        Regex urlRx = new Regex(@"((https?|ftp|file)\://|www.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*", RegexOptions.IgnoreCase);
        MatchCollection matches = urlRx.Matches(source);

        // Both quote styles sit inside one group, so each alternative requires the "<a ... href=" prefix
        var rxAnchor = new Regex("<a [^>]*href=(?:'(?<href>.*?)'|\"(?<href>.*?)\")", RegexOptions.IgnoreCase);

        foreach (Match match in matches)
        {
            // Re-scan on every pass so that anchors added by earlier iterations are seen as well
            List<string> urls = rxAnchor.Matches(source).OfType<Match>().Select(m => m.Groups["href"].Value).ToList();

            if (urls.Contains(match.Value))
            {
                // This URL is already the href of an anchor, so it must not be wrapped again
                // DO Your Stuff here
            }
            else
            {
                source = source.Replace(match.Value, "<a  target=\"_blank\" href='" + match.Value + "'>" + match.Value + "</a>");
            }
        }
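
If you would rather not call string.Replace at all (it rewrites every occurrence of the matched text, including one that is a prefix of a longer URL), a single-pass alternative is to let one regex match existing anchors and other tags first and bare URLs last, and only rewrite the bare URLs. The following is a minimal sketch, not a drop-in implementation: the Linkifier class and Linkify method are hypothetical names, the URL part is the pattern from the question with the dot in www\. escaped, and it assumes anchors are not nested. A real HTML parser would be more robust for arbitrary markup.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Linkifier
{
    // Alternatives are tried left to right at each position:
    //   1. a complete <a ...>...</a> element (left untouched)
    //   2. any other tag, so URLs inside attributes such as src="..." are skipped
    //   3. a bare URL, captured in the named group "url"
    private static readonly Regex Rx = new Regex(
        @"<a\b[^>]*>.*?</a>" +
        @"|<[^>]+>" +
        @"|(?<url>((https?|ftp|file)\://|www\.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\#\&\=;\+!'\(\)\*\-\._~%]*)*)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Linkify(string source)
    {
        return Rx.Replace(source, m =>
        {
            // Existing anchor or other tag: return it unchanged
            if (!m.Groups["url"].Success)
                return m.Value;

            // Bare URL: wrap it exactly once
            string url = m.Groups["url"].Value;
            return "<a target=\"_blank\" href='" + url + "'>" + url + "</a>";
        });
    }
}
```

Calling Linkifier.Linkify(source) wraps bare URLs and passes pre-existing anchors through as they are. Because the whole string is processed in one pass, the inserted anchors are never re-scanned, so nothing gets linked twice.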

Upvotes: 1
