Dac0d3r
Dac0d3r

Reputation: 1854

PostgreSQL 9.4: Aggregate / Join table on JSON field id inside array

Suppose you have the following tables, say

table: content_type

table: data_type

This is just a tiny example to illustrate.

The json stored in tabs could be like eg:

[
    {
        "name": "tab1",
        "properties": 
        [{
            "name": "prop1",
            "order": 1,
            "help_text": "help text",
            "description": "description",
            "data_type": 1
        }]
    },
    {
        "name": "tab2",
        "properties":[{
            "name": "prop2",
            "order": 1,
            "help_text": "help text2",
            "description": "description2",
            "data_type": 1
        },
        {
            "name": "prop3",
            "order": 2,
            "help_text": "help text3",
            "description": "description3",
            "data_type": 1
        }]
    }
]

What I'm looking to achieve now is to make a join like the following pseudo code:

SELECT content_type.id, content_type.tabs, data_type.id, data_type.html
FROM content_type
JOIN data_type
ON data_type.id = content_type.tabs::json->'data_type'::int

Where data_type.id, data_type.html is actually joined to the tabs' properties' data_type and not as a seperate column like in the above usual join query.

Basically I'm looking for the same result as one would have if it was 2 or more tables joined together on column id's, except that one of the "tables" in this case is located inside an array of json objects.

And yes I know the above join attempt is very far off, since I want to "add/append" the additional attributes to a modified version of the tabs json column, not as additional seperate columns.

In this case the join would return - in the select/join query - the id inside the tabs json array and extend it with the join-fetched additional attributes, so instead of simply a "data_type": 1 it would return something like:

"data_type": {
    "id":1, 
    "html": "<input type='text' id='%v' class='%v' placeholder='%v' value='%v' ng-model='%v'>"
}

... and additional attributes of course. This is just a simple example to illustrate the problem.

[
    {
        "name": "tab1",
        "properties": 
        [{
            "name": "prop1",
            "order": 1,
            "help_text": "help text",
            "description": "description",
            "data_type": {
                "id":1, 
                "html": "<input type='text' id='%v' class='%v' placeholder='%v' value='%v' ng-model='%v'>"
                ... and of course additional columns fetched from the data_type table, and added to the select return in our join, to manipulate the original json array of tabs->properties->data_type
            }
        }]
    },
    {
        "name": "tab2",
        "properties":[{
            "name": "prop2",
            "order": 1... etc

Hope this makes sense and that you can help me with this, because I seem to be somewhat stuck.

Ps. Using the latest 9.4beta3 btw.

I found a link here that give me hopes that this is indeed possible to achieve with PostgreSQL: http://www.slideshare.net/EnterpriseDB/no-37327319 (see slide 17)

Other links that might be helpful:


I tried experimenting a bit - here's what I have so far:

SELECT content_type.id, content_type.tabs as original, gf.json_agg as new_tabs
FROM content_type,
LATERAL (
    select json_agg(row1) from((
    select y.name, ss.extended_properties
    from json_to_recordset(
        (
            select * 
            from json_to_recordset(
                (
                    SELECT json_agg(ggg)
                    from(
                        SELECT tabs
                        FROM 
                        (   
                            SELECT 
                            *
                            FROM content_type as ct
                            WHERE ct.id=content_type.id
                        ) dsfds
                    )ggg
                )
            ) as x(tabs json)
        )
    ) as y(name text, properties json),
    LATERAL (
        select json_agg(row) as extended_properties
        from(
            select name, "order", data_type, data_type.html as data_type_html, help_text, description
            from json_to_recordset(properties) 
            as k(name text, "order" int, data_type int, help_text text, description text)
            JOIN data_type
            ON data_type.id = k.data_type
            )row
    ) ss
    ))row1
) gf

which results in the following (zoom in in your browser to read the text in the image - hold ctrl + mwheel up og plus key on keyboard):

enter image description here

At least now I can put in the data_type.html in there, although I would have preferred "data_type": { "id": 1, "html": "[somehtmlhere]"}

For some reason it wont allow me to wrap json_agg around it and show you the output as 1 combined json document. Don't udnerstand why, but guess it has to do with LATERAL and it probably is a bug in PostgreSQL 9.4 beta3

I know there must be a way better approach and solution to this - I'm not very experienced with pgSQL or pg in general... yet.

Upvotes: 4

Views: 6808

Answers (1)

The basic idea is that your query should expect your json to be structured in a certain way, otherwise it gets really complex. Based on the expected structure, we are able to dissect the json structure into columns using json_to_recordset and to rebuild it with additional information using json_build_object and json_agg.

WITH tab_properties_with_expanded_data_type AS (
    SELECT
      content_type.id AS content_type_id,
      tab.name AS tab_name,
      json_agg(
          -- re-build the property object, but with additional data_type information
          json_build_object(
              'name', property.name,
              'order', property.order,
              'help_text', property.help_text,
              'description', property.description,
              'data_type', json_build_object('id', data_type.id, 'html', data_type.html)
          )
      ) AS tab_properties
    FROM content_type,
      json_to_recordset(content_type.tabs) AS tab(name TEXT, properties JSON),
      json_to_recordset(tab.properties) AS property(name TEXT, "order" INTEGER, help_text TEXT, description TEXT, data_type INTEGER)
      LEFT JOIN data_type ON data_type.id = property.data_type
    GROUP BY 
      content_type.id, 
      tab.name
)
SELECT
  tab_properties_with_expanded_data_type.content_type_id AS id, 
  json_agg(
      -- rebuild the tab object
      json_build_object(
          'name', tab_properties_with_expanded_data_type.tab_name,
          'properties', tab_properties_with_expanded_data_type.tab_properties
      )
  )
FROM tab_properties_with_expanded_data_type
GROUP BY 
  tab_properties_with_expanded_data_type.content_type_id

This works but is very limited in terms of flexibility: I have to explicitly list every field of your tabs and properties and I expect the document to have a specific structure. But it's a good start :)

Upvotes: 3

Related Questions