Reputation: 41
I am using Spark 1.6 and trying to write a large DataFrame (about 11 GB) with the statement below, but it fails, probably because a single partition exceeds 2 GB:
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
df.write.mode("append").partitionBy("audit_month").parquet("/data/sometable")
Is there a workaround that splits the data into multiple partitions internally while writing, but still keeps the end result as /data/sometable/audit_month=08-2018/?
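One approach (a sketch, not from the original post): Spark's 2 GB limit applies to individual shuffle/cached blocks, so repartitioning by the partition column with an explicit partition count before writing spreads each month's data across several smaller tasks. `DataFrame.repartition(numPartitions, cols*)` is available from Spark 1.6; the count of 200 below is an illustrative assumption to tune for your data size.

```scala
// Spread each audit_month across many smaller partitions so that no
// single block exceeds Integer.MAX_VALUE bytes (~2 GB).
val repartitioned = df.repartition(200, df("audit_month"))

repartitioned.write
  .mode("append")
  .partitionBy("audit_month")
  .parquet("/data/sometable")
```

The directory layout under /data/sometable/audit_month=.../ is unchanged; only the number (and size) of Parquet files per partition directory changes.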
Upvotes: 0
Views: 646
Reputation: 1096
This works for me:
df.write.mode("append").parquet("/data/sometable/audit_month="+audit_month)
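One caveat worth noting (my addition, not from the answer): writing directly into the partition directory keeps the audit_month column inside the data files, whereas partitionBy would have removed it. Dropping the column first reproduces the same layout; the month value below is illustrative.

```scala
// Write one month's data straight into its partition directory,
// dropping the partition column to match what partitionBy produces.
val audit_month = "08-2018" // example value

df.filter(df("audit_month") === audit_month)
  .drop("audit_month")
  .write.mode("append")
  .parquet("/data/sometable/audit_month=" + audit_month)
```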
Upvotes: 1