Kartik

Reputation: 39

Hive Parquet Timestamp Serde issues

I have a Parquet file with a timestamp column serialized as a long (bigint). My Hive table is defined as below:

Create external table my_table (
   column_2 timestamp)
partitioned by (
   column_1 string)
Row format Serde
    'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
Stored as inputformat
    'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
Outputformat
    'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
Location
    'maprfs:///datalake/mydomain/test/db'

When I run a SELECT query, Hive is unable to deserialize the bigint in the Parquet file to a timestamp, and I get the error below:

Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable (state=,code=0)

Is there a way to define SERDEPROPERTIES for ParquetHiveSerDe? Is there a way to keep timestamp as the type of column_2 and let Hive parse the timestamp based on a format definition?
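For reference, the fallback I would rather avoid is something along these lines (a minimal sketch; the names my_table_raw and my_table_view are made up, and it assumes column_2 holds epoch seconds): declare the raw column as bigint so it matches the INT64 in the file, then convert it in a view:

Create external table my_table_raw (
   column_2 bigint)
partitioned by (
   column_1 string)
Stored as parquet
Location
    'maprfs:///datalake/mydomain/test/db';

-- Convert the epoch value to a timestamp at query time
-- (if the values are epoch milliseconds, divide by 1000 first)
Create view my_table_view as
Select
   column_1,
   cast(from_unixtime(column_2) as timestamp) as column_2
from my_table_raw;

I would prefer to avoid maintaining an extra view, hence the question about doing this directly in the SerDe.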

Upvotes: 2

Views: 508

Answers (0)
