Animesh Sahu

Reputation: 8096

Kotlin String max length? (Kotlin file with a long String is not compiling)

According to this answer, a Java String can hold up to 2^31 - 1 characters. I was doing some benchmarking, so I generated a large amount of text and wrote it to a file like this:

import java.io.*

fun main() {
    val out = File("output.txt").apply { createNewFile() }.printWriter()
    sequence {
        var x = 0
        while (true) {
            yield("${++x} ${++x} ${++x} ${++x} ${++x}")
        }
    }.take(5000).forEach { out.println(it) }
    out.close()
}

The resulting output.txt file looks like this:

1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
// ... 5000 lines

I then copied the entire contents of the file into a string literal to benchmark some functions, so the code looks like this:

import kotlin.system.*

fun main() {
    val inputs = """
        1 2 3 4 5
        6 7 8 9 10
        11 12 13 14 15
        16 17 18 19 20
        21 22 23 24 25
        // ... 5000 lines
        24986 24987 24988 24989 24990
        24991 24992 24993 24994 24995
        24996 24997 24998 24999 25000

    """.trimIndent()
    measureNanoTime {
        inputs.reader().forEachLine { line ->
            val (a, b, c, d, e) = line.split(" ").map(String::toInt)
        }
    }.div(5000).let(::println)
}

The total character count of the file/string is 138894.

A String can hold up to 2147483647 characters.

But the Kotlin code in the last file does not compile. It throws a compilation error:

e: org.jetbrains.kotlin.codegen.CompilationException: Back-end (JVM) Internal error: wrong bytecode generated
// more lines
The root cause java.lang.IllegalArgumentException was thrown at: org.jetbrains.org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:246)
    at org.jetbrains.kotlin.codegen.TransformationMethodVisitor.visitEnd(TransformationMethodVisitor.kt:92)
    at org.jetbrains.kotlin.codegen.FunctionCodegen.endVisit(FunctionCodegen.java:971)
    ... 43 more
Caused by: java.lang.IllegalArgumentException: UTF8 string too large

Here is the full exception log with stack trace: https://gist.github.com/Animeshz/1a18a7e99b0c0b913027b7fb36940c07

Upvotes: 4

Views: 13425

Answers (1)

Marcin Jędrzejczyk

Reputation: 199

There is a limit in the Java class file format: the length of a string constant must fit in 16 bits, so the maximum length of a string literal in source code is 65535 bytes (not characters) in its modified UTF-8 encoding.

Reference: The class File Format (chapter 4 of the Java Virtual Machine Specification).
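
To make the limit concrete, here is a minimal sketch (assuming Kotlin 1.5+ for `Char.code`) that rebuilds the 5000-line input from the question at runtime, measures how many bytes it would need as a single class-file constant, and then loads the data from output.txt instead of embedding it as a literal. The names `MAX_CONSTANT_BYTES`, `modifiedUtf8Length` and `buildInputs` are hypothetical helpers invented for this illustration; they are not part of any library, and reading the file at runtime is only one possible workaround.

```kotlin
import java.io.File

// A string constant in a class file stores its byte length in an
// unsigned 16-bit field, hence this upper bound.
const val MAX_CONSTANT_BYTES = 65535

// Hypothetical helper: number of bytes the string occupies in the
// JVM's modified UTF-8 encoding.
fun modifiedUtf8Length(s: String): Int {
    var bytes = 0
    for (ch in s) {
        val code = ch.code
        bytes += when {
            code in 0x0001..0x007F -> 1
            code <= 0x07FF -> 2 // also covers U+0000, which takes two bytes
            else -> 3           // each surrogate is encoded separately
        }
    }
    return bytes
}

// Hypothetical helper: the same 5000 lines the question writes to output.txt.
fun buildInputs(): String =
    (1..25000).chunked(5)
        .joinToString(separator = "\n", postfix = "\n") { it.joinToString(" ") }

fun main() {
    val inputs = buildInputs()
    val bytes = modifiedUtf8Length(inputs)
    // Prints: 138894 bytes, fits in one constant: false
    println("$bytes bytes, fits in one constant: ${bytes <= MAX_CONSTANT_BYTES}")

    // Workaround sketch: keep the data in a file and read it at runtime,
    // so no oversized literal ends up in the compiled class.
    val file = File("output.txt")
    file.writeText(inputs)
    val lines = file.readLines()
    println(lines.size) // 5000
}
```

Because the input is pure ASCII, every character takes one byte, so the 138894 characters reported in the question are more than twice the 65535-byte limit. A String built or loaded at runtime is not stored as a class-file constant, so only the much larger 2^31 - 1 bound applies to it.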

Upvotes: 3
