Reputation: 519
I have a table defined as follows:
CREATE TABLE db.TEST(
f1 string,
f2 string,
f3 string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'input.regex'='(.{2})(.{3})(.{4})' )
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://nameservice1/location/TEST';
I tried to insert a record into the table with the following statement:
insert overwrite table db.TEST
select '12' as a , '123' as b , '1234' as c ;
The insert fails with the following error:
Caused by: java.lang.UnsupportedOperationException: Regex SerDe doesn't support the serialize() method
    at org.apache.hadoop.hive.serde2.RegexSerDe.serialize(RegexSerDe.java:289)
Any idea what is going wrong?
Upvotes: 1
Views: 1407
Reputation: 38335
You are using the wrong SerDe class. org.apache.hadoop.hive.serde2.RegexSerDe does not support serialization. If you look at its source code, the serialize method does nothing but throw an UnsupportedOperationException:
public Writable serialize(Object obj, ObjectInspector objInspector)
    throws SerDeException {
  throw new UnsupportedOperationException(
      "Regex SerDe doesn't support the serialize() method");
}
The solution is to use another SerDe class, org.apache.hadoop.hive.contrib.serde2.RegexSerDe, which can serialize the row object using a format string. The serialization format must be specified in the SERDEPROPERTIES as 'output.format.string'. Look at the source code for more details.
Example of SerDe properties:
WITH SERDEPROPERTIES ( 'input.regex' = '(.{2})(.{3})(.{4})','output.format.string' = '%1$2s%2$3s%3$4s')
For your table it will be like this:
CREATE TABLE db.TEST(
f1 string,
f2 string,
f3 string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'input.regex'='(.{2})(.{3})(.{4})',
'output.format.string' = '%1$2s%2$3s%3$4s' )
LOCATION
'hdfs://nameservice1/location/TEST';
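To see how the two properties fit together, here is a minimal standalone sketch in plain Java. It assumes the contrib SerDe applies 'output.format.string' with java.util.Formatter semantics (as String.format does) and matches 'input.regex' against the whole line. The class and method names (RegexSerDeRoundTrip, serialize, deserialize) are made up for illustration and are not part of Hive.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not Hive code.
public class RegexSerDeRoundTrip {
    // Same values as in the SERDEPROPERTIES above.
    static final String INPUT_REGEX = "(.{2})(.{3})(.{4})";
    static final String OUTPUT_FORMAT = "%1$2s%2$3s%3$4s";

    // Writes the columns as one line: %N$Ws means "argument N, minimum width W".
    static String serialize(String... columns) {
        return String.format(OUTPUT_FORMAT, (Object[]) columns);
    }

    // Reads a line back into columns, one per capturing group.
    // Returns null here when the whole line does not match the regex.
    static String[] deserialize(String line) {
        Matcher m = Pattern.compile(INPUT_REGEX).matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] columns = new String[m.groupCount()];
        for (int i = 0; i < columns.length; i++) {
            columns[i] = m.group(i + 1);
        }
        return columns;
    }

    public static void main(String[] args) {
        String line = serialize("12", "123", "1234");
        System.out.println(line);                               // 121231234
        System.out.println(String.join(",", deserialize(line))); // 12,123,1234
        // Shorter values are left-padded with spaces to the minimum width.
        System.out.println("[" + serialize("1", "12", "123") + "]"); // [ 1 12 123]
    }
}
```

Note that the width in a format specifier is only a minimum: a value longer than its field is written in full, so the resulting line would no longer match the fixed-width regex when read back.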
Upvotes: 3