Reputation: 7856
I am able to break up paragraphs of text into substrings based on a given character limit. The problem is that my algorithm does exactly that and nothing more, so it breaks words apart. This is where I am stuck. If the character limit falls in the middle of a word, how can I backtrack to a space so that all my substrings contain only whole words?
This is the algorithm I am using:
int arrayLength = (int) Math.ceil(mText.length() / (double) charLimit);
String[] result = new String[arrayLength];
int j = 0;
int lastIndex = result.length - 1;
for (int i = 0; i < lastIndex; i++) {
    result[i] = mText.substring(j, j + charLimit);
    j += charLimit;
}
result[lastIndex] = mText.substring(j);
charLimit can be set to any integer value, and mText is a string holding a paragraph of text. Any suggestions on how I can improve this? Thank you in advance.
I am receiving good responses. Just so you know what I did to figure out whether I landed on a space or not: I used this while loop. I just do not know how to correct the string from this point.
while (!strTemp.substring(strTemp.length() - 1).equalsIgnoreCase(" ")) {
// somehow refine string before added to array
}
Upvotes: 2
Views: 168
Reputation: 3188
I am not sure I understood exactly what you want, but here is an answer to my interpretation:
You could find the last space before your character limit with lastIndexOf and then check whether that space is close enough to the limit (to cope with long stretches of text without whitespace), for example:
int arrayLength = (int) Math.ceil(mText.length() / (double) charLimit);
String[] result = new String[arrayLength];
int j = 0;
int tolerance = 10;
int splitpoint;
int lastIndex = result.length - 1;
for (int i = 0; i < lastIndex; i++) {
    // last space at or before the limit (-1 if there is none)
    splitpoint = mText.lastIndexOf(' ', j + charLimit);
    // use that space only if it lies within tolerance of the limit; otherwise cut at the limit
    splitpoint = splitpoint > j + charLimit - tolerance ? splitpoint : j + charLimit;
    result[i] = mText.substring(j, splitpoint).trim();
    j = splitpoint;
}
result[lastIndex] = mText.substring(j).trim();
This searches for the last space at or before charLimit and splits the string there if that space is less than tolerance (10 is just an example value) characters away from the limit; otherwise it splits exactly at charLimit.
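To make the decision rule concrete, here is a small sketch that isolates just the split-point choice. The class name SplitPointDemo and the helper splitPoint are illustrative names that do not appear in the code above; the logic is meant to mirror the two splitpoint lines in the loop.

```java
class SplitPointDemo {

    // Hypothetical helper: returns the index at which a chunk starting at
    // `start` should end, using the same rule as the loop above.
    static int splitPoint(String text, int start, int charLimit, int tolerance) {
        int limit = start + charLimit;
        // lastIndexOf(ch, fromIndex) searches backward from fromIndex
        // and returns -1 when no match is found.
        int lastSpace = text.lastIndexOf(' ', limit);
        return lastSpace > limit - tolerance ? lastSpace : limit;
    }

    public static void main(String[] args) {
        // Space at index 10 is within 10 characters of the limit (12): cut there.
        System.out.println(splitPoint("alpha beta gamma delta", 0, 12, 10)); // 10
        // No space at all: fall back to a hard cut at the limit.
        System.out.println(splitPoint("abcdefghijklmnop", 0, 12, 10));       // 12
        // Only space is at index 1, too far from the limit for tolerance 3.
        System.out.println(splitPoint("a bcdefghijklmnop", 0, 12, 3));       // 12
    }
}
```

A larger tolerance prefers word boundaries at the cost of shorter chunks; a smaller one keeps chunks closer to the limit but cuts through long words more often.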
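The only problem with this solution is that the last string token can be longer than charLimit, because earlier tokens may be cut short of the limit and the leftover text accumulates at the end. You might therefore need to adjust arrayLength and loop while (mText.length() - j > charLimit) instead.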
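A minimal sketch of that adjustment, assuming a List replaces the pre-sized array so the number of tokens does not have to be known up front. The class name WordSplitter and the method split are illustrative names, and the argument check is an added assumption that keeps the loop from stalling when tolerance exceeds charLimit.

```java
import java.util.ArrayList;
import java.util.List;

class WordSplitter {

    // Splits text into chunks of at most charLimit characters, preferring to
    // break at a space that lies within `tolerance` characters of the limit.
    static List<String> split(String text, int charLimit, int tolerance) {
        if (charLimit <= 0 || tolerance > charLimit) {
            throw new IllegalArgumentException("need 0 < charLimit and tolerance <= charLimit");
        }
        List<String> chunks = new ArrayList<>();
        int j = 0;
        // Keep cutting while more than one chunk's worth of text remains.
        while (text.length() - j > charLimit) {
            int splitpoint = text.lastIndexOf(' ', j + charLimit);
            // Fall back to a hard cut when no space is close enough to the limit.
            if (splitpoint <= j + charLimit - tolerance) {
                splitpoint = j + charLimit;
            }
            chunks.add(text.substring(j, splitpoint).trim());
            j = splitpoint;
        }
        // Whatever is left now fits within the limit.
        String rest = text.substring(j).trim();
        if (!rest.isEmpty()) {
            chunks.add(rest);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        for (String chunk : split(text, 15, 10)) {
            System.out.println(chunk);
        }
        // Prints:
        // The quick brown
        // fox jumps over
        // the lazy dog
    }
}
```

Because the loop only stops once the remaining text fits within charLimit, the final token can no longer exceed the limit.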
Edit
running sample code:
public class Main {
    public static void main(String[] args) {
        String mText = "I am able to break up paragraphs of text into substrings based upon nth given character limit. The conflict I have is that my algorithm is doing exactly this, and is breaking up words. This is where I am stuck. If the character limit occurs in the middle of a word, how can I back track to a space so that all my substrings have entire words?";
        int charLimit = 40;
        int arrayLength = (int) Math.ceil(mText.length() / (double) charLimit);
        String[] result = new String[arrayLength];
        int j = 0;
        int tolerance = 10;
        int splitpoint;
        int lastIndex = result.length - 1;
        for (int i = 0; i < lastIndex; i++) {
            splitpoint = mText.lastIndexOf(' ', j + charLimit);
            splitpoint = splitpoint > j + charLimit - tolerance ? splitpoint : j + charLimit;
            // trim() drops the space at which the previous token was cut
            result[i] = mText.substring(j, splitpoint).trim();
            j = splitpoint;
        }
        result[lastIndex] = mText.substring(j).trim();
        for (int i = 0; i < arrayLength; i++) {
            System.out.println(result[i]);
        }
    }
}
Output:
I am able to break up paragraphs of text
into substrings based upon nth given
character limit. The conflict I have is
that my algorithm is doing exactly
this, and is breaking up words. This is
where I am stuck. If the character
limit occurs in the middle of a word,
how can I back track to a space so that
all my substrings have entire words?
Additional Edit: added trim() as per suggestion by curiosu. It removes whitespace surrounding the string tokens.
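A short illustration of why trim() matters here: since each token starts at the space where the previous one was cut, every token after the first would otherwise begin with that space. The class TrimDemo and the helper cutAt are hypothetical names used only for this sketch.

```java
class TrimDemo {

    // Hypothetical helper: cuts text at splitpoint and returns {head, tail}
    // without trimming, the way substring alone would.
    static String[] cutAt(String text, int splitpoint) {
        return new String[] { text.substring(0, splitpoint), text.substring(splitpoint) };
    }

    public static void main(String[] args) {
        String text = "one two three";
        int splitpoint = text.lastIndexOf(' ', 8); // 7, the space before "three"
        String[] parts = cutAt(text, splitpoint);
        System.out.println("[" + parts[1] + "]");        // [ three]
        System.out.println("[" + parts[1].trim() + "]"); // [three]
    }
}
```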
Upvotes: 3