Reputation: 165
I have a file in S3, and Presto running on EMR. I see I can use Json_extract to read the json.
I am running the following query, however, I keep seeing null instead of the correct value.
select json_extract('s3a://random-s3-bucket/analytics/20210221/myjsonfile.json', '$.dateAvailability')
I see this output
Not sure if my syntax is wrong? Thoughts?
Upvotes: 0
Views: 2822
Reputation: 151
If you need a tool to help you create the table statement, try this one: https://www.hivetablegenerator.com
From the page:
Easily convert any JSON (even complex Nested ones), CSV, TSV, or Log sample file to an Apache HiveQL DDL create table statement.
Upvotes: 1
Reputation: 20770
json_extract()
operates on JSON scalar values kept in memory. It does not load data from an external location. See documentation page for usage examples.
In order to query a JSON file using Trino (formerly known as Presto SQL), you need to map it as a table with JSON format like this:
CREATE TABLE my_table ( .... )
WITH (
format = 'JSON',
external_location = 's3a://random-s3-bucket/analytics/20210221'
);
See more information in Hive connector documentation.
Upvotes: 1