Reputation: 12900
I have the following tables in a PostgreSQL 9.5 server:
The notable structural thing is that location
is theoretically infinitely recursive. I need to generate a JSON message from a root location
, recursing into all sub-locations; every location
has some properties, an array of inventory
items and an array of child locations
.
How do I create a performant query for this? I am looking at various PostgreSQL JSON functions, the LATERAL keyword, CTEs, and getting a bit confused. I've done JSON output with non-recursive queries, but not sure how to handle recursion cleanly.
Here's an example output:
{
"id": 1000,
"name": "By Location",
"type": "SITE",
"locations": [
{
"id": 1005,
"name": "Storage A",
"type": "STOR",
"locations": [ ...(same schema as parent)... ],
"inventories": [ ...(see below for schema)... ]
},
{
"id": 1017,
"name": "Storage B",
"name": "COLD",
"locations": [ ...(same schema as parent)... ],
"inventories": [...(see below for schema)... ]
}
],
"inventories": [
{
"id": 5340,
"product_id": 9120,
"name": "Product X",
"thumb": "https://example.com/api/images/nnnn.jpg",
"sort_order": 1,
"par_level": 3.5,
"created": 1452898800,
"updated": 1453071600,
"measures": [
{"id": 3498, "quantity": 2.25, "created": 1453071600, "updated": 1453071600},
{"id": 3456, "quantity": 3.25, "created": 1452898800, "updated": 1452898800}
]
}
]
}
Upvotes: 4
Views: 1959
Reputation: 12900
I ended up creating three functions. Probably could have done less, but the functions would be reusable in other queries. Basically, any place in the JSON output that should have an array of values, that gets handled by a function that returns a recordset that gets json_agg()
'ed.
CREATE OR REPLACE FUNCTION get_measures_by_inventory_as_json(invid UUID, del TIMESTAMP WITH TIME ZONE DEFAULT now())
RETURNS TABLE(inventory_id UUID, measure_json JSON)
AS $$
-- returns a JSONified record per measure tied to an inventory record
SELECT m.inventory_id, json_build_object(
'id', m.id,
'quantity', m.quantity,
'read', TRUNC(EXTRACT(EPOCH FROM m.read_date)),
'created', TRUNC(EXTRACT(EPOCH FROM m.created)),
'updated', TRUNC(EXTRACT(EPOCH FROM m.updated)),
'deleted', TRUNC(EXTRACT(EPOCH FROM m.deleted))
)
FROM measure m
WHERE m.inventory_id = invid
AND (m.deleted >= del);
$$
LANGUAGE sql;
CREATE OR REPLACE FUNCTION get_inventories_by_location_as_json(locid UUID, del TIMESTAMP WITH TIME ZONE DEFAULT now())
RETURNS TABLE(location_id UUID, inventory_json JSON)
AS $$
-- returns a JSONified set of inventory items, with product info and measures, given a location
SELECT i.location_id, json_build_object(
'id', i.id,
'product_id', p.id,
'name', p.name,
'mass_quantity', p.mass_quantity,
'mass_unit', um.code,
'count_unit', uc.code,
'thumb', p.product_picture_uri,
'sort_order', i.sort_order,
'par_level', i.par_level,
'created', TRUNC(EXTRACT(EPOCH FROM i.created)),
'updated', TRUNC(EXTRACT(EPOCH FROM i.updated)),
'deleted', TRUNC(EXTRACT(EPOCH FROM i.deleted)),
'measures', COALESCE((SELECT json_agg(measure_json) FROM get_measures_by_inventory_as_json(i.id)), '[]')::json
)
FROM inventory i
INNER JOIN product p ON i.product_id = p.id
LEFT JOIN unit um ON p.mass_unit_id = um.id
LEFT JOIN unit uc ON p.count_unit_id = uc.id
WHERE i.location_id = locid
AND i.deleted >= del
AND p.deleted >= del;
$$
LANGUAGE sql;
CREATE OR REPLACE FUNCTION get_inventories_recursive_as_json(locid UUID[], del TIMESTAMP WITH TIME ZONE DEFAULT now())
RETURNS JSON
AS $$
-- returns JSONified location info and inventories in that location
-- and recurses into child locations, showing the same
SELECT json_agg(loc) FROM (
SELECT l.id, array_agg(c.id), json_build_object(
'id', l.id,
'name', l.name,
'type', t.code,
'locations', get_inventories_recursive_as_json(array_agg(c.id)),
'inventories', COALESCE((SELECT json_agg(inventory_json) FROM get_inventories_by_location_as_json(l.id)),'[]')::json
) AS loc
FROM location l
LEFT OUTER JOIN location c ON l.id = c.parent_id
INNER JOIN location_type t ON l.location_type_id = t.id
WHERE l.id = ANY(locid)
AND l.deleted >= del
GROUP BY l.id, l.name, t.code
) AS out;
$$
LANGUAGE sql;
Tried to do it via a CTE, which would have been so elegant, but was unable to figure out how to do it without running afoul the errors related to the inability to aggregate in the recurse.
WITH RECURSIVE locations AS (
WITH inventories AS (
WITH measures AS (
SELECT m.inventory_id, json_agg(json_build_object(
'id', m.id,
'quantity', m.quantity,
'read', TRUNC(EXTRACT(EPOCH FROM m.read_date)),
'created', TRUNC(EXTRACT(EPOCH FROM m.created)),
'updated', TRUNC(EXTRACT(EPOCH FROM m.updated)),
'deleted', TRUNC(EXTRACT(EPOCH FROM m.deleted))
)) as measures
FROM measure m
GROUP BY m.inventory_id
)
SELECT i.location_id, json_agg(json_build_object(
'id', i.id,
'product_id', p.id,
'name', p.name,
'mass_quantity', p.mass_quantity,
'mass_unit', um.code,
'count_unit', uc.code,
'thumb', p.product_picture_uri,
'sort_order', i.sort_order,
'par_level', i.par_level,
'created', TRUNC(EXTRACT(EPOCH FROM i.created)),
'updated', TRUNC(EXTRACT(EPOCH FROM i.updated)),
'deleted', TRUNC(EXTRACT(EPOCH FROM i.deleted)),
'measures', COALESCE(m.measures, '[]')
)) AS inventories
FROM inventory i
INNER JOIN product p ON i.product_id = p.id
LEFT JOIN unit um ON p.mass_unit_id = um.id
LEFT JOIN unit uc ON p.count_unit_id = uc.id
LEFT JOIN measures m ON i.id = m.inventory_id
GROUP BY i.location_id
)
SELECT null as id, null as name, null as type, null as inventories
FROM location l
INNER JOIN location_type t ON l.location_type_id = t.id
LEFT OUTER JOIN inventories i ON l.id = i.location_id
GROUP BY l.parent_id
/*UNION ALL
SELECT p.id, p.parent_id, p.name, t.code, COALESCE(i.inventories, '[]')::jsonb AS inventories, json_agg(row_to_json(c.*))
FROM location p
INNER JOIN location_type t ON p.location_type_id = t.id
LEFT OUTER JOIN inventories i ON p.id = i.location_id
INNER JOIN locations c ON p.id = c.parent_id
GROUP BY p.id, p.name, t.code, COALESCE(i.inventories, '[]')::jsonb*/
)
SELECT * FROM locations
Upvotes: 0
Reputation: 189
Let's break it into pieces. First, you will have nested subqueries to create the nested arrays. Common Table Expressions may help.
The other tricks are row_to_json and json_agg.
The first gotcha is that row_to_json needs the table as an argument to return the correct labels.
select json_agg(locations) from locations
Will return a json object for each row. To use only certain fields you will need to either create a type and cast to it or use a CTE and the syntax above. I'd use the CTE in most cases.
So you will end up with something like:
WITH lowlevel1 AS
(
SELECT a,
b,
c
FROM tab1) ,lowlevel2 AS
(
SELECT b,
c,
d
FROM tab2) ,midlevel1 AS
(
SELECT e,
f,
g,
json_agg(lowlevel1) AS lab1,
json_agg(lowlevel2) AS lab2
FROM tab3
LEFT OUTER JOIN lowlevel1
ON tab3.id = lowlevel1.parent
LEFT OUTER JOIN lowlevel2
ON tab3.id = lovlevel2.parent)
SELECT row_to_json(midlevel1) from midlevel1
or on the last line use json_agg(midlevel1)
instead row_to_json(midlevel1)
to return one array of all rows.
CTEs also support regression with the RECURSIVE
modifier. However, that returns one table with the results of the regression and not a nested JSON structure. As a result, you will probably need to explicitly code the desired levels of nesting.
If an element doesn't exist Postgres will return null. For example a list of sub-locations where there are none will return "locations":[null]
. To replace that with a more meaningful result the case when <> then <> else '[]' end
or if <> then <> else '[]' end
can be used. The first is the 'searched case' where each test is a Boolean expression.
Upvotes: 1