Calcutta
Calcutta

Reputation: 1149

WHERE condition to filter JSON data in nested object or array

I have a table in a PostgreSQL database

CREATE TABLE js.orders (
    id serial NOT NULL PRIMARY KEY,
    info json NOT NULL
)

containing 6 rows of JSON data:

select * from js.orders;

returns

 id |                                                info                                                
----+----------------------------------------------------------------------------------------------------
  1 | { "customer": "Kapil", "items": {"product": "Heineken","qty": 6}}
  2 | { "customer": "Satyen", "items": {"product": "Heineken","qty": 18}}
  3 | { "customer": "Rekha", "items": {"product": "Carlsberg","qty": 24}}
  4 | { "customer": "Madhuri", "items": {"product": "Kalyani","qty": 12}}
  5 | { "customer": "Srinivas", "items": {"product": "Kingfisher Strong","qty": 12}}
  6 | { "customer": "Saina", "items": [{"product": "Bira91","qty": 6},{"product": "Kalyani","qty": 6} ]}
(6 rows)

The following 'select' query with a 'where' condition:

SELECT info ->> 'customer' AS customer FROM js.orders WHERE info -> 'items' ->> 'product' = 'Heineken';

returns the correct two rows:

customer 
----------
 Kapil
 Satyen
(2 rows)

However, the following query should return two rows:

SELECT info ->> 'customer' AS customer FROM js.orders WHERE info -> 'items' ->> 'product' = 'Kalyani';

but returns only one row:

    customer     
-----------------
 Madhuri
(1 row)

Obviously, the fact that the row with "customer" : "Sania" has a LIST of "items" instead of a single "items" is causing this problem. How should I modify my query so that the correct number of rows is returned. Or is it that the data has to be reformatted otherwise.

Platform is

'PostgreSQL 14.10 (Ubuntu 14.10-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit'

Upvotes: 2

Views: 130

Answers (1)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 658757

Your fist issue is that you store structured data as JSON documents instead of plain columns in a relational design, which would be much more efficient for storage and queries.

Your second issue is that you vary the structure of said JSON documents without need. Some have an object and some an array as value for the key "items". (None have a "list", which does not exist in JSON terminology.) Makes queries more complicated.

While stuck with your unfortunate design, this query with the jsonpath operator @? works, because it processes objects and arrays at the given path in default "lax" mode:

SELECT info ->> 'customer' AS customer
FROM   orders
WHERE  info -> 'items' @? '$.product ? (@ == "Kalyani")';

To only look at objects like your original query:

SELECT id, info ->> 'customer' AS customer
FROM   orders
WHERE  info -> 'items' @? 'strict $.product ? (@ == "Kalyani")';

fiddle

Requires Postgres 12 or later, where the SQL/JSON path language was added. And assumes jsonb as data type, not json. The former is typically preferable anyway.
You might want index support for that. See:

Upvotes: 2

Related Questions