Nouha

Reputation: 3

How to make the Spark Avro reader stop inferring types when reading a partition

I am trying to read an Avro file partitioned by year, month, and day from HDFS. I expected the partition values to be read as strings, but their type was inferred as integer.

Say I am reading Avro files from HDFS that are stored under the partitions year, month, and day. Here is a sample:

val df = spark.read.format("avro").load("Path_Till_Partition/year=2023/month=02")
df.show()

When I open the DataFrame, it shows the value of col("month") as 2 instead of "02" and marks the column as an integer. Is there any way to make it read the partitions as strings instead?

Thanks

Upvotes: 0

Views: 76

Answers (1)

Saâd

Reputation: 16

You should set this configuration before reading the files:

spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "false")

When type inference is disabled, string type will be used for the partitioning columns.
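For context, here is a minimal sketch of how the setting might fit into the read from the question. It assumes the spark-avro module is on the classpath, keeps "Path_Till_Partition" as the placeholder path from the question, and adds the `basePath` option (not mentioned in the question) on the assumption that `year` and `month` should stay in the result as partition columns. The object name is only illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative object name; not part of the original question or answer.
object ReadPartitionsAsStrings {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("read-partitions-as-strings")
      .getOrCreate()

    // Turn off partition column type inference BEFORE the read,
    // so partition values such as "02" are kept as strings.
    spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "false")

    val df = spark.read
      .format("avro")
      // Assumption: basePath points at the root of the partitioned dataset,
      // so year and month are still discovered as partition columns.
      .option("basePath", "Path_Till_Partition")
      .load("Path_Till_Partition/year=2023/month=02")

    // Inspect the schema to check the types of the partition columns.
    df.printSchema()
    df.select("year", "month", "day").distinct().show()

    spark.stop()
  }
}
```

The setting applies to the whole session, so any other partitioned source read afterwards will also get string partition columns unless the option is switched back to "true".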

Upvotes: 0
