Reputation: 664
I have a huge data set with almost 600 columns, but when I try to create a DataFrame it fails with:
Exception in thread "main" java.lang.ClassFormatError: Too many arguments in method signature in class file
Sample code:
def main(args: Array[String]): Unit = {
  val data = sc.textFile(file)
  val rd = data.map(line => line.split(",")).map(row => new Parent(row(0), row(1), ........row(600)))
  rd.toDF.write.mode("append").format("orc").insertInto("Table")
}
Can someone suggest a workaround for this?
Upvotes: 0
Views: 668
Reputation: 774
The JVM class file format caps a method at 255 parameters, and this limit extends to Scala classes as well, since their constructors compile to JVM methods. A Parent class with 600 constructor parameters is therefore not feasible.
The simplest solution is to read the CSV natively with Spark's built-in reader:
spark.read.csv(filePath)
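Putting it together, a minimal sketch of the replacement pipeline could look like this (assuming a SparkSession named spark, the same filePath, and the existing ORC table "Table" from your question; insertInto matches columns by position, so you may need to cast or reorder columns first):
val df = spark.read.csv(filePath) // columns arrive as _c0, _c1, ... with String type
// Append into the existing table; its ORC storage format is already defined
df.write.mode("append").insertInto("Table")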
Additionally, you can increase the maxColumns option via the reader's option method, for example:
spark.read.option("maxColumns", "25000").csv(filePath)
This does not directly affect your use case, since the default value of maxColumns is 20480, which already covers 600 columns. More information on these options can be found here.
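If you need named, typed columns (as the Parent class presumably provided), you can build a schema programmatically instead of writing a 600-parameter class. A minimal sketch, with placeholder column names that are not from your question:
import org.apache.spark.sql.types.{StructField, StructType, StringType}
// Generate ~600 fields without declaring them by hand; adjust names and types as needed
val schema = StructType((0 until 600).map(i => StructField(s"col_$i", StringType, nullable = true)))
val df = spark.read.schema(schema).csv(filePath)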
Upvotes: 2