Reputation: 182
I have a Google Big Query Table column, which contains a json string. Over time, new keys were introduced to this json string.
time col_b
--------------------------------------------------
timestamp3 {"key1": "value", "key2": "value"}
timestamp2 {"key1": "value"}
timestamp1 {"key1": "value"}
How can I extract key2 and in case it doesn't exist fill with NaN? I thought about something like:
SELECT
JSON_EXTRACT(col_b, "$.key2) AS key2
FROM db;
Doing this will throw the error, so it thinks it should be a double instead of a string cause of the null values.
Bad double type: value
How can I check in advance whether key2 is existing and if not fill with null?
Upvotes: 2
Views: 5412
Reputation: 71
Had similar problem, but was looking for a key that could be nullable and was interested if the key was present, despite it being a null. I found example above to be incorrect in my specific case, here's my example below - hope someone will find it useful.
#standardSQL
CREATE TEMPORARY FUNCTION
jsonHasKey(libs STRING, key STRING)
RETURNS BOOL
LANGUAGE js AS '''
try {
x = JSON.parse(libs);
return x.hasOwnProperty(key);
} catch (e) {
return false;
}
''';
WITH `project.dataset.table` AS (
SELECT 'timestamp3' time, '{"key1": "value", "key2": "value"}' col_b UNION ALL
SELECT 'timestamp2', '{"key1": "value"}' UNION ALL
SELECT 'timestamp1', '{"key1": "value"}' UNION ALL
SELECT 'timestamp0', '{"key1": "value", "key2": null}'
)
SELECT *, IFNULL(JSON_EXTRACT_SCALAR(col_b, '$.key2'), 'NaN') AS key2, jsonHasKey(col_b, 'key2') AS is_present
FROM `project.dataset.table`
which produces the following results:
time col_b key2 is_present
timestamp3 "{""key1"": ""value"", ""key2"": ""value""}" value true
timestamp2 "{""key1"": ""value""}" NaN false
timestamp1 "{""key1"": ""value""}" NaN false
timestamp0 "{""key1"": ""value"", ""key2"": null}" NaN true
Upvotes: 3
Reputation: 172993
See below example
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'timestamp3' time, '{"key1": "value", "key2": "value"}' col_b UNION ALL
SELECT 'timestamp2', '{"key1": "value"}' UNION ALL
SELECT 'timestamp1', '{"key1": "value"}'
)
SELECT *, IFNULL(JSON_EXTRACT_SCALAR(col_b, '$.key2'), 'NaN') AS key2
FROM `project.dataset.table`
with output
Row time col_b key2
1 timestamp3 {"key1": "value", "key2": "value"} value
2 timestamp2 {"key1": "value"} NaN
3 timestamp1 {"key1": "value"} NaN
Upvotes: 5