glen maxwell

Reputation: 203

How do I truncate a string to a certain length but keep only complete words?

I want to truncate a string to at most 60 characters, but the result should contain only complete words. Here is what I am trying:

String originalText =" Bangladesh's first day of Test cricket on Indian soil has not been a good one. They end the day having conceded 71 runs in the last 10 overs, which meant they are already staring at a total of 356. M Vijay was solid and languid as he made his ninth Test century and third of the season. ";
String afterOptimized=originalText.substring(0, 60);
System.out.println("This is text . "+afterOptimized);

Here is the output:

This is text .  Bangladesh's first day of Test cricket on Indian soil has n

However, my requirement is to not cut words in the middle. How do I know whether the text ends on a complete word after 60 characters?

Upvotes: 6

Views: 913

Answers (5)

Zbynek Vyskovsky - kvr000

Reputation: 18825

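You can use a regular expression for this, taking up to 60 characters and ending at a word boundary:

// Requires java.util.regex.Pattern and java.util.regex.Matcher
Pattern pattern = Pattern.compile("(.{1,60})(\\b|$)(.*)");
Matcher m = pattern.matcher(originalText);
if (m.matches())
    afterOptimized = m.group(1);  // at most 60 characters, ending at a word boundary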

Or, in a loop:

Pattern pattern = Pattern.compile("\\s*(.{1,60})(\\b|$)");
Matcher m = pattern.matcher(originalText);
int last = 0;
while (m.find()) {
    System.out.println(m.group(1));
    last = m.end(1);
}
if (last != originalText.length())
    System.out.println(originalText.substring(last));

You may want to replace \b with \s if you want to wrap only at whitespace instead of at a word boundary (\b can also break before commas, periods, etc.).
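For a single truncation rather than a wrapping loop, the \s variant could look roughly like the sketch below. The class and method names (RegexTruncate, truncateAtWhitespace) are made up for illustration, and the hard-cut fallback for text without any whitespace is my own assumption, not something the patterns above do. It also assumes maxLen is at least 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTruncate {
    // Hypothetical helper: returns the longest prefix of at most maxLen characters
    // that is followed by whitespace or by the end of the text.
    static String truncateAtWhitespace(String text, int maxLen) {
        // DOTALL lets '.' match line terminators too, so multi-line text is handled
        Pattern pattern = Pattern.compile("(.{1," + maxLen + "})(\\s|$)", Pattern.DOTALL);
        Matcher m = pattern.matcher(text);
        // lookingAt() anchors the match at the start of the input only
        if (m.lookingAt()) {
            return m.group(1);
        }
        // Assumed fallback: no whitespace in range (or empty text), so cut hard
        return text.substring(0, Math.min(maxLen, text.length()));
    }

    public static void main(String[] args) {
        String text = "Bangladesh's first day of Test cricket on Indian soil has not been a good one.";
        // prints: Bangladesh's first day of Test cricket on Indian soil has
        System.out.println(truncateAtWhitespace(text, 60));
    }
}
```

The greedy quantifier first grabs 60 characters and then backtracks one character at a time until the next character is whitespace, which is what keeps the last word whole.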

Upvotes: 5

Nick Ziebert

Reputation: 1268

// index of the last space within the first 60 characters
int cutoff = originalText.substring(0, 60).lastIndexOf(" ");
// keep everything before that space
String afterOptimized = originalText.substring(0, cutoff);

The code prints this: "Bangladesh's first day of Test cricket on Indian soil has"
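Note that the two lines above throw a StringIndexOutOfBoundsException when the text is shorter than 60 characters or when the first 60 characters contain no space (lastIndexOf then returns -1). A guarded variant might look like this sketch. LastSpaceTruncate and truncate are illustrative names, and the hard cut when no usable space exists is an assumption about what you would want. Unlike the snippet above, it also accepts a space sitting exactly at index maxLen as a break point.

```java
class LastSpaceTruncate {
    // Hypothetical helper: cut text to at most maxLen characters without splitting a word.
    static String truncate(String text, int maxLen) {
        if (text.length() <= maxLen) {
            return text; // short enough, nothing to cut
        }
        // Search backwards starting at index maxLen, so a space located exactly
        // at the cut point also counts as a valid break.
        int cutoff = text.lastIndexOf(' ', maxLen);
        if (cutoff <= 0) {
            // Assumed fallback: no usable space found, cut hard at maxLen
            return text.substring(0, maxLen);
        }
        return text.substring(0, cutoff);
    }

    public static void main(String[] args) {
        System.out.println(truncate("hello world foo", 8));  // prints: hello
        System.out.println(truncate("hello world foo", 11)); // prints: hello world
    }
}
```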

Upvotes: 0

Varshah Sambath

Reputation: 1

String originalText=" Bangladesh's first day of Test cricket on Indian soil has not been a good one. They end the day having conceded 71 runs in the last 10 overs, which meant they are already staring at a total of 356. M Vijay was solid and languid as he made his ninth Test century and third of the season. ";

// Cut the string to 60 characters
String trimmedString = originalText.substring(0, 60);

// If the cut landed in the middle of a word, cut again at the last space so the result ends on a complete word
String result = trimmedString.substring(0, Math.min(trimmedString.length(), trimmedString.lastIndexOf(" ")));

System.out.println(result);

Upvotes: 0

slipperyseal

Reputation: 2778

If the original string has a character at position 60 (the 61st char) and that character is not a space, cutting there would either split a word or land right where a new word begins. In that case, search backwards from position 59 (the 60th char), stop at the first whitespace found, and take the substring up to that location. If the string is no longer than 60 chars, just return it as is.

public void truncateTest() {
    System.out.println(truncateTo("Bangladesh's first day of Test cricket on Indian soil has not been a good one. They end the day having conceded 71 runs in the last 10 overs, which meant they are already staring at a total of 356. M Vijay was solid and languid as he made his ninth Test century and third of the season. ", 60));
    System.out.println(truncateTo("Bangladesh's first day.", 60));
    System.out.println(truncateTo("They end the day having conceded 71 runs in the last 10 overs, which meant they are already staring at a total of 356. M Vijay was solid and languid as he made his ninth Test century and third of the season.", 60));
}

public String truncateTo(String originalText, int len) {
    if (originalText.length() > len) {
        if (originalText.charAt(len) != ' ') {
            for (int x=len-1;x>=0;x--) {
                if (Character.isWhitespace(originalText.charAt(x))) {
                    return originalText.substring(0, x);
                }
            }
        }
        // default if none of the conditions are met
        return originalText.substring(0, len);
    }
    return originalText;
}

Results...

Bangladesh's first day of Test cricket on Indian soil has
Bangladesh's first day.
They end the day having conceded 71 runs in the last 10

I think I got my +1 / -1 index logic right :)


Upvotes: 1

nobjta_9x_tq

Reputation: 1241

Assuming the words in your text are separated by spaces, cut the text and then look at the last kept character, the one before it, and the one after it to decide where to cut:

// Assumption: chars holds the original text, length is its length,
// and i is the index of the last character of the cut string mString
if (chars[i] != ' ') {
    if (i + 1 == length || (i + 1 < length && chars[i + 1] == ' '))
        return mString; // [I'm loser] bla ==> [I'm loser]
    if (i - 1 > -1 && chars[i - 1] == ' ')
        return subHeadString(mString, 2); // drop the last 2 chars: [I'm loser b]la ==> [I'm loser]
    // walk back until a space is found and return that substring:
    // [I'm loser bl]a ==> [I'm loser]
    return findBackStringWithSpace(mString, i);
} else {
    return mString;
}
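Since subHeadString and findBackStringWithSpace are not shown, here is one way the same checks could be written as a self-contained method. This is only a sketch: CutChecker and cutAtWord are hypothetical names, i is taken to be the index of the last character to keep, and falling back to a hard cut when no space exists is my assumption.

```java
class CutChecker {
    // Hypothetical helper: i is the index of the last character that would be kept.
    static String cutAtWord(String text, int i) {
        if (i < 0) {
            return "";
        }
        if (i >= text.length() - 1) {
            return text; // the cut is at or past the end, nothing is split
        }
        char last = text.charAt(i);
        char next = text.charAt(i + 1);
        if (last == ' ' || next == ' ') {
            // The cut lands on a space or right after a complete word
            return text.substring(0, i + 1);
        }
        // Inside a word: walk back to the previous space (this also covers
        // the case where only the first letter of a word was kept)
        int space = text.lastIndexOf(' ', i);
        // Assumed fallback: no space at all, keep the hard cut
        return space < 0 ? text.substring(0, i + 1) : text.substring(0, space);
    }

    public static void main(String[] args) {
        System.out.println(cutAtWord("I'm loser bla", 8));  // prints: I'm loser
        System.out.println(cutAtWord("I'm loser bla", 10)); // prints: I'm loser
        System.out.println(cutAtWord("I'm loser bla", 11)); // prints: I'm loser
    }
}
```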

Upvotes: -1
