user8373873

Reputation: 41

Repartition in Spark during Write with Partitionby

I am using Spark 1.6 and trying to write a large DataFrame (~11 GB) with the statement below, but it fails, most likely because a single partition exceeds 2 GB:

Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE


df.write.mode("append").partitionBy("audit_month").parquet("/data/sometable")

Is there a workaround to split the data into multiple smaller partitions internally while writing, while keeping the end result as /data/sometable/audit_month=08-2018/?

Upvotes: 0

Views: 646

Answers (1)

Filippo Loddo

Reputation: 1096

This works for me:

df.write.mode("append").parquet("/data/sometable/audit_month="+audit_month)
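Another common workaround (a sketch, not tested against your data) is to call `repartition(n)` before the write so that no single task produces a 2 GB+ block, while keeping `partitionBy` so the output layout stays `/data/sometable/audit_month=.../`. The `num_partitions` helper below is a hypothetical way to pick `n` from the DataFrame's approximate size; the 512 MB target per partition is an assumption, not a Spark default:

```python
import math

# Hypothetical helper: choose a partition count so each partition
# stays well under Spark's 2 GB block limit (~512 MB each here).
def num_partitions(total_size_bytes, target_bytes=512 * 1024 * 1024):
    """Return at least 1 partition of roughly target_bytes each."""
    return max(1, math.ceil(total_size_bytes / target_bytes))

# For an ~11 GB DataFrame, aim for ~512 MB per partition:
n = num_partitions(11 * 1024 ** 3)  # 22 partitions

# Then, in Spark 1.6+ (sketch, assuming `df` is your DataFrame):
#
# df.repartition(n) \
#   .write.mode("append") \
#   .partitionBy("audit_month") \
#   .parquet("/data/sometable")
```

`repartition(n)` shuffles the data evenly across `n` tasks, so even when all rows fall into one `audit_month` directory, the directory ends up containing `n` smaller Parquet files instead of one oversized block.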

Upvotes: 1

Related Questions