Reputation: 21
I am crawling data from Google Big Query and staging them into Athena. One of the columns crawled as string, contains json :
{
"key": "Category",
"value": {
"string_value": "something"
}
I need to unnest these and flatten them to be able to use them in a query. I require key and string value (so in my query it will be where Category = something
I have tried the following :
WITH dataset AS (
SELECT cast(json_column as json) as json_column
from "thedatabase"
LIMIT 10
)
SELECT
json_extract_scalar(json_column, '$.value.string_value') AS string_value
FROM dataset
which is returning null. Casting the json_column as json adds \ into them :
"[{\"key\":\"something\",\"value\":{\"string_value\":\"app\"}}
If I use replace on the json, it doesn't allow me as it's not a varchar object.
So how do I extract the values from the some_column field?
Upvotes: 0
Views: 968
Reputation: 142173
json_extract_scalar
actually supports extracting just from the varchar (string) value :-- sample data
WITH dataset(json_column) AS (
values ('{
"key": "Category",
"value": {
"string_value": "something"
}}')
)
--query
SELECT
json_extract_scalar(json_column, '$.value.string_value') AS string_value
FROM dataset;
Output:
string_value |
---|
something |
json_parse
(in this particular case it is not needed, but there are cases when you will want to use it):-- query
SELECT
json_extract_scalar(json_parse(json_column), '$.value.string_value') AS string_value
FROM dataset;
Upvotes: 1