Sallyerik
Sallyerik

Reputation: 519

How can use WHERE clause in AWS Athena Json queries?

I have a table where I've stored some information from a Json object:

Table:

investment
    unit(string)
    data(string)

If a run the the query SELECT * FROM "db"."investment" limit 10; I got the following result:

Unit Data
CH  [{"from":"CH","when":"2021-02-16","who":"[email protected]"}]
AB  [{"from":"AB","when":"2020-02-16","who":"[email protected]"}]

Now, I run the following basic query to return value within the Json nested object:

SELECT json_extract_scalar(Data, '$[0].who') email FROM "db"."investment";

and I got the following result:

email
[email protected]
[email protected]

How can filter this query with WHERE clause to return just a single value:

I've tried this, but obviously it doesn't work as normal SQL table with row and columns:

SELECT json_extract_scalar(Data, '$[0].who') email FROM "db"."investment" WHERE email = "[email protected]";

Any help with this?

Upvotes: 3

Views: 3524

Answers (1)

Haleemur Ali
Haleemur Ali

Reputation: 28253

your question seems to have a few typos.

  • Date in Unit Date should probably be Data
  • what is key referring to. Perhaps you mean Data

also, note that athena is case insensitive, and column names are converted to lower case (even if you quote them).

with that out of the way, you have to use the full expression that extracts your email from the json document in the where clause. the column alias defined is not accessible to the rest of the query.

here's a self contained example:

with test (unit, data) as (
values
('CH',  JSON '[{"from":"CH","when":"2021-02-16","who":"[email protected]"}]'),
('AB',  JSON '[{"from":"AB","when":"2020-02-16","who":"[email protected]"}]')
)
select json_extract_scalar(data, '$[0].who') email
from test
where json_extract_scalar(data, '$[0].who') = '[email protected]';

outputs: 
| email        |
+--------------+
| [email protected] |

Upvotes: 7

Related Questions