ishan3243
ishan3243

Reputation: 1928

Athena: Rename columns while creating table from Json data

Some of the keys in the json data (in S3) contains special a character (+). So I want to rename those fields while creating the table in Athena. I found this doc which talks about a ColumnToJsonKeyMappings and I tried using that but my renamed field is not populating with any data.

Here is the sample DDL I tried:

CREATE EXTERNAL TABLE `table_name`(
  `st` string COMMENT 'from deserializer', 
  `eid` string COMMENT 'from deserializer', 
  `cid_sid_et` string COMMENT 'from deserializer', 
  `v` string COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
WITH SERDEPROPERTIES ( 
  'paths'='cid+sid+et,eid,st,v',
  'ignore.malformed.json'='true',
  'ColumnToJsonKeyMappings'='{"cid_sid_et":"cid+sid+et"}') 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://location-of-data'
TBLPROPERTIES (
  'classification'='json')

Upvotes: 1

Views: 981

Answers (1)

ishan3243
ishan3243

Reputation: 1928

This worked by using the following format:

CREATE EXTERNAL TABLE `table_name`(
  `st` string COMMENT 'from deserializer', 
  `eid` string COMMENT 'from deserializer', 
  `cid_sid_et` string COMMENT 'from deserializer', 
  `v` string COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
WITH SERDEPROPERTIES ( 
  'paths'='cid_sid_et,eid,st,v',
  'ignore.malformed.json'='true',
  'mapping.cid_sid_et'='cid+sid+et') 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://location-of-data'
TBLPROPERTIES (
  'classification'='json')

Upvotes: 2

Related Questions