MrJibus
MrJibus

Reputation: 113

How to join on a nested value from a jsonb column?

I have a PostgreSQL 11 database with these tables:

CREATE TABLE stats (
   id integer NOT NULL,
   uid integer NOT NULL,
   date date NOT NULL,
   data jsonb DEFAULT '[]'::json NOT NULL
);
INSERT INTO stats(id, uid, date, data) VALUES
   (1, 1, '2020-10-01', '{"somerandomhash":{"source":"thesource"}}');

CREATE TABLE links(
   id integer NOT NULL,
   uuid uuid NOT NULL,
   path text NOT NULL
);
INSERT INTO links(id, uuid, path) VALUES
   (1, 'acbd18db-4cc2-f85c-edef-654fccc4a4d8', 'thesource');

My goal is to create a new table reports with data from the stats table, but with a new key from the links table. It will look like this:

CREATE TABLE reports(
    id integer NOT NULL,
    uid integer NOT NULL,
    date date NOT NULL,
    data jsonb DEFAULT '[]'::json NOT NULL
);

INSERT INTO reports(id, uid, date, data) VALUES
   (1, 1, 2020-10-01, {"uuid":{"source":"thesource"});

To this end, I tried to left join the table links in order to retrieve the uuid column value - without luck:

SELECT s.uid, s.date, s.data->jsonb_object_keys(data)->>'source' as path, s.data->jsonb_object_keys(data) as data, l.uuid
FROM stats s LEFT JOIN links l ON s.data->jsonb_object_keys(data)->>'source' = l.path

I tried to use the result of s.data->jsonb_object_keys(data)->>'source' in the left join, but got the error:

ERROR:  set-returning functions are not allowed in JOIN conditions

I tried using LATERAL but still not valid result.
How to make this work?

Upvotes: 2

Views: 2104

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656942

jsonb_object_keys() is a set-returning function which cannot be used the way you do - as the error messages tells you. What's more, json_object_keys() returns top-level key(s), but it seems you are only interested in the value. Try jsonb_each() instead:

SELECT s.id
     , s.uid
     , s.date
     , jsonb_build_object(l.uuid::text, o.value) AS new_data
FROM   stats s
CROSS  JOIN LATERAL jsonb_each(s.data) o  -- defaults to column names (key, value)
LEFT   JOIN links l ON l.path = o.value->>'source';

fiddle

jsonb_each() returns top-level key and value. Proceed using only the value.

The nested JSON object seems to have the constant key name 'source'. So the join condition is l.path = o.value->>'source'.

Finally, build the new jsonb value with jsonb_build_object().

While this works as demonstrated, a couple of questions remain:

  • The above assumes there is always exactly one top-level key in stats.data. If not, you'd have to define what to do ...

  • The above assumes there is always exactly one match in table links. If not, you'd have to define what to do ...

  • Most importantly, if data is as regular as you make it out to be, consider a plain uuid column (or drop it as the value is in table links anyway) and a plain column source to replace the jsonb column. Much simpler and more efficient.

Upvotes: 2

Kamen Kanev
Kamen Kanev

Reputation: 311

It looks like that you want to join by the "source" key from the JSON column.

Instead of

s.data->jsonb_object_keys(data)->>'source'

Try this

s.data ->> 'source'

If my assumptions are correct the whole query can go like that:

SELECT
  s.uid,
  s.date,
  s.data ->> 'source' AS path,
  s.data -> jsonb_object_keys(data) AS data,
  l.uuid
FROM stats s
LEFT JOIN links l ON s.data ->> 'source' = l.path

Upvotes: 0

Related Questions