Reputation: 1197
Question
How can the following be done in an idiomatic way: split a large String into a list of Strings, each not exceeding the given size, while avoiding splitting words in half?
Closest solution with String.chunked() (splits words)
The closest solution is the String class's chunked() method. The problem is that it splits words in the given String.
Code example of use of String.chunked()
val longString = "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod " +
        "tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, " +
        "quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo " +
        "consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse " +
        "cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non " +
        "proident, sunt in culpa qui officia deserunt mollit anim id est laborum. "

// Split [longString] into a list of 40-character chunks
val listOfStrings = longString.chunked(40)
listOfStrings.forEach {
    println(it)
}
Example output of closest example with String.chunked()
Below is the output of the example code above. As can be seen, words are split at the ends of the lines.
Lorem ipsum dolor sit amet, consectetur
adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna ali
qua. Ut enim ad minim veniam, quis nostr
ud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis au
te irure dolor in reprehenderit in volup
tate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat
cupidatat non proident, sunt in culpa qu
i officia deserunt mollit anim id est la
borum.
Upvotes: 4
Views: 1064
Reputation: 19926
You could use this simple helper function:
fun splitIntoChunks(max: Int, string: String): List<String> = ArrayList<String>(string.length / max + 1).also {
    val builder = StringBuilder()
    // split the string on runs of whitespace (trim avoids empty leading/trailing words)
    for (word in string.trim().split(Regex("\\s+"))) {
        // if appending a space and the word would exceed the max size
        // (an empty builder is never flushed, so no empty chunks are added)
        if (builder.isNotEmpty() && builder.length + 1 + word.length > max) {
            // then we add the current chunk to the list and clear the builder
            it.add(builder.toString())
            builder.setLength(0)
        }
        // append a space before each word, except the first one in a chunk
        if (builder.isNotEmpty()) builder.append(' ')
        // note: a single word longer than max is kept whole in its own chunk
        builder.append(word)
    }
    // add the last collected part if there was any
    if (builder.isNotEmpty()) {
        it.add(builder.toString())
    }
}
It can then be called like this:
val chunks: List<String> = splitIntoChunks(20, longString)
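If a single word can be longer than max, a word-based splitter has to either let that chunk exceed the limit or fall back to cutting the word. Below is a minimal sketch of the second option, assuming it is acceptable to break only those over-long words with chunked(). The name splitIntoChunksStrict is hypothetical and not part of the helper above.

```kotlin
// Hypothetical variant: chunks never exceed `max`; only words longer than
// `max` are cut, using String.chunked() as a fallback.
fun splitIntoChunksStrict(max: Int, string: String): List<String> {
    require(max > 0) { "max must be positive" }
    val chunks = mutableListOf<String>()
    val builder = StringBuilder()

    // moves the current chunk (if any) into the result list
    fun flush() {
        if (builder.isNotEmpty()) {
            chunks.add(builder.toString())
            builder.setLength(0)
        }
    }

    for (word in string.split(Regex("\\s+"))) {
        if (word.isEmpty()) continue // produced by leading/trailing whitespace
        if (word.length > max) {
            // assumption: over-long words may be cut into pieces of `max` characters
            flush()
            chunks.addAll(word.chunked(max))
            continue
        }
        // +1 accounts for the separating space
        if (builder.isNotEmpty() && builder.length + 1 + word.length > max) flush()
        if (builder.isNotEmpty()) builder.append(' ')
        builder.append(word)
    }
    flush()
    return chunks
}
```

For example, splitIntoChunksStrict(10, "Lorem ipsum dolor sit amet") returns [Lorem, ipsum, dolor sit, amet]. The pieces of a cut word are not merged with neighbouring words, which keeps the sketch short at the cost of a few shorter chunks.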
Upvotes: 1
Reputation: 23262
Probably not the most idiomatic way, but maybe it suffices for your needs:
fun String.chunkedWords(limitChars: Int,
                        delimiter: Char = ' ',
                        joinCharacter: Char = '\n') =
    splitToSequence(delimiter)
        .reduce { cumulatedString, word ->
            // size of the current line, counted from the last joinCharacter
            // (index -1 if there is none), plus the delimiter and the next word
            val exceedsSize = cumulatedString.length - cumulatedString.indexOfLast { it == joinCharacter } + "$delimiter$word".length > limitChars
            cumulatedString + if (exceedsSize) {
                joinCharacter
            } else {
                delimiter
            } + word
        }
You can then use it as follows:
longText.chunkedWords(40).run(::println)
which for your given string would then print:
Lorem ipsum dolor sit amet, consectetur
adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint
occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim
id est laborum.
You could also split it into lines from there, e.g. longText.chunkedWords(40).splitToSequence("\n"). Note that it also splits nicely if the string already contains newline characters, i.e. if you have a String like "Testing shorter lines.\nAnd now there comes a very long line", a call to .chunkedWords(17) will produce the following output:
Testing shorter
lines.
And now there // this tries to use the whole 17 characters again
comes a very
long line
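Since the question asks for a list of Strings rather than one newline-joined String, the same word-by-word accumulation can also collect the lines directly into a list. Below is a minimal sketch, assuming words are separated by the delimiter only (embedded newlines are not treated specially) and that a word longer than the limit is simply left on a line of its own. chunkedWordsToList is a hypothetical name, not a standard library function.

```kotlin
// Hypothetical extension: accumulates finished lines in a list, so there is
// no need to search the accumulated text for the last newline.
fun String.chunkedWordsToList(limitChars: Int, delimiter: Char = ' '): List<String> =
    splitToSequence(delimiter)
        .filter { it.isNotEmpty() } // ignore repeated delimiters
        .fold(mutableListOf<String>()) { lines, word ->
            val current = lines.lastOrNull()
            if (current != null && current.length + 1 + word.length <= limitChars) {
                // the word still fits on the current line
                lines[lines.lastIndex] = current + delimiter + word
            } else {
                // start a new line (also the case for the very first word)
                lines.add(word)
            }
            lines
        }
```

For example, "Testing shorter lines".chunkedWordsToList(17) returns [Testing shorter, lines]. Using fold with an explicit initial list, rather than reduce, also means an empty input yields an empty list.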
Upvotes: 5