Reputation: 3
Reading an Avro file partitioned by year, month and day from HDFS: I expected the partition columns to be read as strings, but their type was inferred as integer
I am reading Avro files from HDFS that are stored under the partitions year, month and day. Here is a sample:
val df = spark.read.format("avro").load("Path_Till_Partition/year=2023/month=02")
df.show()
When I open the DataFrame, the value of col("month") has been converted to 2 instead of "02" and the column is typed as an integer. Is there any way to make Spark read the partition columns as strings instead?
Thanks
Upvotes: 0
Views: 76
Reputation: 16
You should set this configuration before reading the data:
spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "false")
When type inference is disabled, string type will be used for the partitioning columns.
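As a rough sketch of how this could fit together with the read from the question (not a definitive setup): the object name, the local master and the `basePath` option below are my assumptions, `Path_Till_Partition` is the placeholder from the question, and the `avro` format assumes the spark-avro module is on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object ReadPartitionsAsStrings {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-partitions-as-strings") // hypothetical app name
      .master("local[*]")                    // assumption: local run for testing
      .getOrCreate()

    // Disable partition column type inference before the read,
    // so that partition values such as "02" are kept as strings.
    spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "false")

    // Placeholder path taken from the question, not a real location.
    val basePath = "Path_Till_Partition"

    val df = spark.read
      .format("avro")
      // Assumption: basePath is set so that year and month are still
      // discovered as partition columns when loading a sub-directory.
      .option("basePath", basePath)
      .load(s"$basePath/year=2023/month=02")

    // Inspect the schema to check the types of year, month and day.
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

With inference disabled, `printSchema()` should report the partition columns as `string`, and filters on them then need string literals, for example `col("month") === "02"`.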
Upvotes: 0