manoj kumar

Reputation: 11

Sqoop import error for timestamp column in Parquet table

I'm getting an error while mapping a SQL Server table to a Parquet table. I created the Parquet table to match the SQL Server table, with corresponding column data types.

But Sqoop infers the timestamp column as long, which creates a problem when loading data into the Parquet table. Loading the data into Parquet seems to succeed, but fetching it is a problem.

Error Message:

hive> select updated_at from bkfs.address_par1;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
Time taken: 0.146 seconds
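
The error can be reproduced with a Parquet table declared with a timestamp column, roughly like the sketch below (only the table bkfs.address_par1 and the column updated_at come from the query above; the rest of the DDL is a placeholder):

-- illustrative DDL, not the exact table definition
create table bkfs.address_par1 (
    -- other columns omitted
    updated_at timestamp
)
stored as parquet;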

Upvotes: 0

Views: 2754

Answers (2)

Mohan Kumar

Reputation: 31

The Sqoop Parquet import interprets the Oracle DATE and TIMESTAMP data types as long, i.e. it stores the date in Unix epoch format. So the import can be handled like below, converting the column to a string:

sqoop import \
--connect [connection string] \
--username [username] \
--password [password] \
--query "select to_char(date_col,'YYYY-MM-DD HH:mi:SS.SS') as date_col from test_table where \$CONDITIONS" \
--as-parquetfile \
-m 1 \
--delete-target-dir \
--target-dir /sample/dir/path/hive_table
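
Since date_col is now exported as a string, the Hive-side column would be declared as string and converted back at read time; a minimal sketch (the table name hive_table is assumed from the target directory above, not stated in the answer):

-- cast the exported string back to a timestamp when querying
select cast(date_col as timestamp) as date_col
from hive_table;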

Upvotes: 2

Sathiyan S

Reputation: 1023

You can have a look at the question below, which has already been posted:

Sqoop function '--map-column-hive' being ignored
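
One workaround commonly suggested for that issue (sketched here as an assumption, not quoted from the linked thread) is to force the Java-side mapping to String with --map-column-java, since --map-column-hive is ignored for Parquet imports. The placeholders follow the style of the answer above, and updated_at is the column from the question:

sqoop import \
--connect [connection string] \
--username [username] \
--password [password] \
--table [source table] \
--map-column-java updated_at=String \
--as-parquetfile \
-m 1 \
--target-dir /sample/dir/path/hive_table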

Upvotes: 0
