AMZFR

Reputation: 103

Trim a String in Java while preserving whole words

I need to trim a String in Java so that:

The quick brown fox jumps over the lazy dog.

becomes

The quick brown...

In the example above, I'm trimming to 12 characters. If I just use substring I would get:

The quick br...

I already have a method for doing this using substring, but I wanted to know what is the fastest (most efficient) way to do this because a page may have many trim operations.

The only way I can think of is to split the string on spaces and put it back together until its length passes the given length. Is there another way? Perhaps a more efficient one, where the same method can do both a "soft" trim that preserves the last word (as shown in the example above) and a "hard" trim that is pretty much a substring.

Thanks,

Upvotes: 10

Views: 9070

Answers (7)

Bohemian

Reputation: 425448

Here is a simple, regex-based, 1-line solution:

str.replaceAll("(?<=.{12})\\b.*", "..."); // How easy was that!? :)

Explanation:

  • (?<=.{12}) is a positive look behind, which asserts that there are at least 12 characters to the left of the match; it is zero-width, so it consumes no characters itself
  • \b.* matches the first word boundary (after at least 12 characters - above) to the end

This is replaced with "..."

Here's a test:

public static void main(String[] args) {
    String input = "The quick brown fox jumps over the lazy dog.";
    String trimmed = input.replaceAll("(?<=.{12})\\b.*", "...");
    System.out.println(trimmed);
}

Output:

The quick brown...

If performance is an issue, compile the regex once for an approximately 5x speed-up (YMMV):

static Pattern pattern = Pattern.compile("(?<=.{12})\\b.*");

and reusing it:

String trimmed = pattern.matcher(input).replaceAll("...");
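
One caveat worth checking before relying on the one-liner: as far as I can tell, when the input ends in a word character (say "The quick brown fox"), replaceAll finds a second, empty match at the very end of the string and appends "..." twice, and a string of exactly 12 characters that ends in a letter gets "..." appended even though nothing was removed. A hedged variant, assuming a single replacement is what you want, swaps .* for .+ and uses replaceFirst. The class and method names below are made up for illustration:

```java
import java.util.regex.Pattern;

class RegexSoftTrim {
    // Same look behind as above, but ".+" requires at least one character to remove.
    private static final Pattern AFTER_12 = Pattern.compile("(?<=.{12})\\b.+");

    // Hypothetical helper: cuts at the first word boundary at or after 12 characters.
    static String softTrim(String input) {
        // replaceFirst stops after one match, so no second empty match at the end.
        return AFTER_12.matcher(input).replaceFirst("...");
    }

    public static void main(String[] args) {
        System.out.println(softTrim("The quick brown fox jumps over the lazy dog."));
        System.out.println(softTrim("The quick brown fox"));
        System.out.println(softTrim("Hello worlds"));
    }
}
```

Note that . does not match line terminators by default, so multi-line input would need the DOTALL flag.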

Upvotes: 9

bashizip

Reputation: 581

I use this hack. Suppose the trimmed string should be about 120 characters long:

int max = 120;
String textToDisplay = textToTrim.substring(0, Math.min(max, textToTrim.length()));

// Only extend when something was actually cut off.
if (textToDisplay.length() < textToTrim.length()) {
    // Finish the word that straddles the cut; indexOf returns -1 when no space follows.
    int nextSpace = textToTrim.indexOf(' ', textToDisplay.length());
    textToDisplay = (nextSpace < 0)
            ? textToTrim                                    // no later space: keep everything
            : textToTrim.substring(0, nextSpace) + " ...";
}

Upvotes: 0

Highly Irregular

Reputation: 40819

How about:

mystring = mystring.replaceAll("^(.{12}.*?)\\b.*$", "$1..."); // \\b, because "\b" in a Java string literal is a backspace

Upvotes: 0

Ali

Reputation: 12684

Below is a method I use to trim long strings in my webapps. The "soft" boolean, as you put it, preserves the last word when set to true. This is the most concise way of doing it that I could come up with; it uses a StringBuffer, which is a lot more efficient than repeatedly recreating an immutable String.

public static String trimString(String string, int length, boolean soft) {
    if(string == null || string.trim().isEmpty()){
        return string;
    }

    StringBuffer sb = new StringBuffer(string);
    // Subtract 3 because the three dots count toward the returned length.
    int actualLength = length - 3;
    if(sb.length() > actualLength){
        if(!soft)
            return escapeHtml(sb.insert(actualLength, "...").substring(0, actualLength+3));
        else {
            int endIndex = sb.indexOf(" ",actualLength);
            // indexOf returns -1 when no space follows; insert(-1, ...) would throw.
            if(endIndex < 0)
                return string;
            return escapeHtml(sb.insert(endIndex,"...").substring(0, endIndex+3));
        }
    }
    }
    return string;
}

Update

I've changed the code so that the ... is inserted into the StringBuffer itself; this avoids implicitly creating extra String objects, which is slow and wasteful.

Note: escapeHtml is a static import from apache commons:

import static org.apache.commons.lang.StringEscapeUtils.escapeHtml;

You can remove it and the code should work the same.

Upvotes: 11

Tran Dinh Thoai

Reputation: 702

Please try the following code:

private String trim(String src, int size) {
    if (src.length() <= size) return src;
    int pos = src.lastIndexOf(" ", size - 3);
    if (pos < 0) return src.substring(0, size);
    return src.substring(0, pos) + "...";
}

Upvotes: 4

duffymo

Reputation: 309028

Your requirements aren't clear. If you have trouble articulating them in a natural language, it's no surprise that they'll be difficult to translate into a computer language like Java.

"preserve the last word" implies that the algorithm will know what a "word" is, so you'll have to tell it that first. The split is a way to do it. A scanner/parser with a grammar is another.

I'd worry about making it work before I concerned myself with efficiency. Make it work, measure it, then see what you can do about performance. Everything else is speculation without data.
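
For the measuring step, a crude timing harness along these lines might be enough to get first numbers. It is only a sketch: the class name, the helper timeNanos, and the two candidate implementations are assumptions for illustration, and a proper benchmark tool such as JMH would give more trustworthy results than System.nanoTime loops.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

class TrimTiming {
    // Hypothetical helper: applies op to every input, `rounds` times, and returns elapsed nanoseconds.
    static long timeNanos(UnaryOperator<String> op, String[] inputs, int rounds) {
        long sink = 0; // accumulate result lengths so the calls have an observable effect
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (String s : inputs) {
                sink += op.apply(s).length();
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // never expected to run; only keeps sink in use
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String[] inputs = { "The quick brown fox jumps over the lazy dog.", "Short text" };

        // Candidate 1 (assumed): precompiled regex.
        Pattern p = Pattern.compile("(?<=.{12})\\b.+");
        UnaryOperator<String> regex = s -> p.matcher(s).replaceFirst("...");

        // Candidate 2 (assumed): plain indexOf scan for the next space.
        UnaryOperator<String> scan = s -> {
            if (s.length() <= 12) return s;
            int end = s.indexOf(' ', 12);
            return end < 0 ? s : s.substring(0, end) + "...";
        };

        // Warm-up rounds so the JIT has a chance to compile both candidates.
        timeNanos(regex, inputs, 10_000);
        timeNanos(scan, inputs, 10_000);

        System.out.println("regex: " + timeNanos(regex, inputs, 100_000) + " ns");
        System.out.println("scan:  " + timeNanos(scan, inputs, 100_000) + " ns");
    }
}
```

The absolute numbers will vary between machines and JVMs, so only the relative difference on your own data is worth reading into.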

Upvotes: 0

ikromm

Reputation: 523

Try searching for the space nearest to index 11, either the last one before it or the first one after it, cut the string there, and append "...".
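
A minimal sketch of the space-search idea, assuming the "soft" trim should keep the word that straddles the limit (so it looks for the first space at or after the limit) and the "hard" trim is a plain substring. The class name, method name, and the choice to return the whole string when no later space exists are my assumptions, not something the question specifies:

```java
class WordTrim {
    // Hypothetical helper: trims text to roughly `limit` characters and appends "...".
    static String trim(String text, int limit, boolean soft) {
        if (text == null || limit < 0 || text.length() <= limit) {
            return text; // nothing to cut
        }
        if (!soft) {
            return text.substring(0, limit) + "..."; // hard trim: cut mid-word
        }
        // Soft trim: extend the cut to the end of the current word.
        int end = text.indexOf(' ', limit);
        if (end < 0) {
            return text; // the limit falls inside the last word, so keep everything
        }
        return text.substring(0, end) + "...";
    }

    public static void main(String[] args) {
        String s = "The quick brown fox jumps over the lazy dog.";
        System.out.println(trim(s, 12, true));  // The quick brown...
        System.out.println(trim(s, 12, false)); // The quick br...
    }
}
```

This does one indexOf scan and one substring per call, with no regex and no splitting, which is probably about as cheap as the operation gets.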

Upvotes: 0
