Reputation: 11
I am trying to create an AWS Athena table from a json event file stored in S3. I seem to be having trouble with format of my json event. The event is delivered in this format:
"[{\"String1\":123,\"String2\":\"abc\",\"String3\":\"def\"}]"
When I create the table it doesn't show any data as I don't think it can read the json string. My table creation code is:
CREATE EXTERNAL TABLE events (
String1 string,
String2 string,
String3 string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://events/';
I'm pretty sure I need other config to have the format of my json event to be parsed correctly, but I'm not sure what. If I set my json event to the below, it all creates and works as intended.
{"String1":"123","String2":"abc","String3":"def"}
Does anyone have any pointers as to what I need to do for my file format to be read/parsed correctly?
Thanks.
UPDATE
I've managed to get my json data delivered without the \
, so now just need to handle the start and ending brackets [...]
,
[{"String1":123,"String2":"abc","String3":"def"}]
as this also causes an issue with my table, as puts all the data into the first row. Without the [...] it is placed correctly. I think I need to use an array, so looking at that.
Upvotes: 0
Views: 2291
Reputation: 11
I've resolved this using an array, as most will have expected.
My Athena create table query was therefore:
CREATE EXTERNAL TABLE events (
`details` array<struct<
String1:string,
String2:string,
String3:string >>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://events/';
This gave me a parsed output that I could then run select's on or create a view to show my data in table format:
CREATE OR REPLACE VIEW "v_events" AS
SELECT
item.string1,
item.string2,
item.string3
FROM
(events
CROSS JOIN UNNEST("details") t (item))
Upvotes: 1