Reputation: 39
I have a Parquet file with a timestamp column serialized as a long (bigint). My Hive table is defined as follows:
CREATE EXTERNAL TABLE my_table (
  column_2 timestamp)
PARTITIONED BY (
  column_1 string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  'maprfs:///datalake/mydomain/test/db';
When I run a SELECT query, Hive is unable to deserialize the bigint in the Parquet file to a timestamp: the Parquet reader hands back a LongWritable for the INT64 column, which cannot be cast to the TimestampWritable the table schema expects. I get the error below:
Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable (state=,code=0)
Is there a way to define SERDEPROPERTIES for ParquetHiveSerDe? Can I keep timestamp as the type of column_2 and have Hive parse the long values based on a format definition?
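For reference, the kind of clause I have in mind is the timestamp.formats SerDe property that LazySimpleSerDe honors; I don't know whether ParquetHiveSerDe supports anything equivalent. The fallback I'd like to avoid is a sketch like the one below (the names my_table_raw and my_table_ts are placeholders, and it assumes the long values are Unix epoch seconds): declare column_2 as bigint and convert at query time through a view.

-- Hypothetical illustration of the property I mean; it works for
-- LazySimpleSerDe, but I don't know of a Parquet equivalent:
-- ALTER TABLE my_table SET SERDEPROPERTIES ('timestamp.formats' = 'yyyy-MM-dd HH:mm:ss');

-- Fallback sketch, assuming the long values are Unix epoch seconds:
CREATE EXTERNAL TABLE my_table_raw (
  column_2 bigint)
PARTITIONED BY (
  column_1 string)
STORED AS PARQUET
LOCATION
  'maprfs:///datalake/mydomain/test/db';

-- Convert at query time; from_unixtime expects epoch seconds.
CREATE VIEW my_table_ts AS
SELECT
  column_1,
  CAST(from_unixtime(column_2) AS timestamp) AS column_2
FROM my_table_raw;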
Upvotes: 2
Views: 508