Nick

Reputation: 5872

Limiting the number of characters using regular expression

I am using the following regular expression to turn all URLs within a string into full hyperlinks:

var r = new Regex("(https?://[^ ]+)");
return r.Replace(Action, "<a target=\"_blank\" href=\"$1\">$1</a>");

I would like to limit the number of characters shown between the tags and, if possible, add an ellipsis when that limit is exceeded.

e.g. http://myurl.com/my/route/th...

I have tried (unsuccessfully) using lookarounds to achieve this and wonder if anybody has a better solution?

Upvotes: 3

Views: 1716

Answers (3)

James

Reputation: 82096

The following regex would give you what you are after:

((https?://[^ ]{20})[^ ]+)

This creates two capture groups:

  1. The entire URL
  2. The URL up to a specific length (in this example, the scheme plus the next 20 characters)

All that's required is to add the truncation, e.g.

Regex.Replace(Action, "((https?://[^ ]{20})[^ ]+)", "<a target=\"_blank\" href=\"$1\">$2...</a>");

See it in action.


As pointed out in the comments, the above would result in the ... being appended to all the URLs (even ones which don't exceed the length). Because whether to truncate depends on the length of each URL, using only a regex here probably isn't viable. We can, however, resolve this with a small tweak to the regex and some simple string manipulation, e.g.

// the lookahead makes sure the match starts at a URL; without it the trailing [^ ]+ would match any word
var match = Regex.Match(Action, "(?=https?://)(https?://[^ ]{50})?[^ ]+");
// if the display part group has matched something, we need to truncate
var displayText = match.Groups[1].Length > 0 ? String.Format("{0}...", match.Groups[1]) : match.ToString();
Console.WriteLine(String.Format("<a target=\"_blank\" href=\"{0}\">{1}</a>", match, displayText));

I have updated the example

Upvotes: 1

Kendall Frey

Reputation: 44316

This is best solved with a custom match evaluator, i.e. a function that builds the replacement text for each match.

string Action = "Somebody turn http://stackoverflow.com/questions/20494457/limiting-the-number-of-characters-using-regular-expression into a link please.";
var r = new Regex("(https?://\\S+)");
return r.Replace(Action,
    match => {
        string display = match.Value;
        if (display.Length > 30)
        {
            display = display.Substring(0, 30) + "...";
        }
        return "<a target=\"_blank\" href=\"" + match.Value + "\">" + display + "</a>";
    });

Returns:

Somebody turn <a target="_blank" href="http://stackoverflow.com/questions/20494457/limiting-the-number-of-characters-using-regular-expression">http://stackoverflow.com/quest...</a> into a link please.
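If the limit needs to vary, the same evaluator approach can be wrapped in a small helper. This is only a sketch: the class name UrlLinker, the method name Linkify and the maxDisplayLength parameter are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlLinker
{
    // Same idea as above: a URL is "http://" or "https://" followed by non-whitespace.
    private static readonly Regex UrlPattern = new Regex(@"https?://\S+");

    // Hypothetical helper: wraps every URL in an anchor tag and shortens
    // the visible text to maxDisplayLength characters plus "...".
    public static string Linkify(string text, int maxDisplayLength)
    {
        return UrlPattern.Replace(text, match =>
        {
            string url = match.Value;
            // Only truncate (and add the dots) when the URL is actually too long.
            string display = url.Length > maxDisplayLength
                ? url.Substring(0, maxDisplayLength) + "..."
                : url;
            return "<a target=\"_blank\" href=\"" + url + "\">" + display + "</a>";
        });
    }
}
```

Calling UrlLinker.Linkify(Action, 30) should give the same result as the code above, and a URL shorter than the limit is linked without any dots.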

Upvotes: 1

Rawling

Reputation: 50104

I don't think it's possible to do this with a simple regex replacement, but luckily .NET allows you to perform a much more complex replacement. First we set up the regex to capture the first (e.g.) 25 characters after the start of the URL in one group, and any further characters in a second, optional group:

var r = new Regex("(https?://[^ ]{1,25})([^ ]+)?");

This second group will fail to match if there are 25 or fewer characters after the start of the URL, but because it's optional the regex as a whole still matches.
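To see the optional group on its own, here is a minimal sketch that only reports whether the second group matched. The class name EllipsisCheck and the method name NeedsEllipsis are hypothetical, introduced purely for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EllipsisCheck
{
    // Same pattern as above: up to 25 characters after the scheme in group 1,
    // anything beyond that in the optional group 2.
    private static readonly Regex UrlPattern =
        new Regex("(https?://[^ ]{1,25})([^ ]+)?");

    // Hypothetical helper: true when the first URL in the text has more than
    // 25 characters after "http://" or "https://", i.e. group 2 matched.
    public static bool NeedsEllipsis(string text)
    {
        Match m = UrlPattern.Match(text);
        return m.Success && m.Groups[2].Success;
    }
}
```

NeedsEllipsis("http://www.google.com") returns false, because group 1 consumes the whole URL and nothing is left for group 2, while a URL with 26 or more characters after the scheme returns true.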

Then, when replacing, we check whether the second group matched when deciding whether or not to add the dots:

var s = r.Replace(
    Action,
    m => string.Concat(
        "<a target=\"_blank\" href=\"",
        m.Value,
        "\">",
        m.Groups[1].Value,
        (m.Groups[2].Success ? "..." : ""),
        "</a>"));

For the input

"hello http://www.google.com world
 http://www.loooooooooooooooooooooooooooooooooooooooooong.com !"

I get output

hello <a target="_blank" href="http://www.google.com">http://www.google.com</a>
world <a target="_blank"
    href="http://www.loooooooooooooooooooooooooooooooooooooooooong.com">
    http://www.loooooooooooooooooooo...</a> !

Upvotes: 0
