Reputation: 3028
I have a PostgreSQL
database from which I query data using node.js
with the pg
module. The resulting array in my Javascript code (on the server side) contains lots of duplicate values because that's the way the data is returned. I believe the query is ok, so the values are duplicated because of multiple INNER JOINS
with tables that are independent of each other, so that a result of A
and B
from one table (qa_layers
in this case) with a result of C
and D
from another table (qa_cases
in this case), together with a common value E
from a third table (qa_settings
in this case )will be represented as
| settings_name | layer_name | case_name |
| E | A | C |
| E | A | D |
| E | B | C |
| E | B | D |
As far as I know, this should be the expected behavior. What I need to do is to filter out the unique data for each column as arrays with unique values, e.g. [A, B]
and [C, D]
for layer_name
and case_name
. In my mind there could be several ways to achieve this, but each of them seems a little painful, unless there is some helper library or functionality I don't know about.
PostgreSQL
query so that no or less parsing is needed. I don't know if this is possible to do, but if the resulting array had easier parsable objects in them the parsing could be less painful. My query is pasted at the end.My PostgreSQL query:
SELECT run.id, settings.name AS settings_name, layer.name AS layer_name, qa_case.name AS case_name FROM qa_runs AS run
INNER JOIN qa_composites_in_run AS cr
ON cr.run_id = run.id
INNER JOIN qa_layers_in_composite AS lc
ON lc.composite_name = cr.composite_name
INNER JOIN qa_layers AS layer
ON layer.name = lc.layer_name
INNER JOIN qa_cases_in_run AS case_run
ON case_run.run_id = run.id
INNER JOIN qa_cases AS qa_case
ON qa_case.name = case_run.case_name
INNER JOIN qa_settings AS settings
ON settings.name = run.settings_name
WHERE run.id IN (27,28,29);
My actual resulting Javascript array:
[
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "OpenStreetMapService",
"case_name": "VisitLondon"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "OpenStreetMapService",
"case_name": "VisitRotterdam"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "OpenStreetMapService",
"case_name": "wtf"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "OpenStreetMapService",
"case_name": "VisitLondon"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "OpenStreetMapService",
"case_name": "VisitRotterdam"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "OpenStreetMapService",
"case_name": "wtf"
},
{
"id": 28,
"settings_name": "QA Default",
"layer_name": "OpenStreetMapService",
"case_name": "VisitLondon"
},
{
"id": 28,
"settings_name": "QA Default",
"layer_name": "OpenStreetMapService",
"case_name": "wtf"
},
{
"id": 27,
"settings_name": "QA Default",
"layer_name": "OpenStreetMapService",
"case_name": "VisitLondon"
},
{
"id": 27,
"settings_name": "QA Default",
"layer_name": "OpenStreetMapService",
"case_name": "VisitRotterdam"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "Map2D",
"case_name": "VisitLondon"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "Map2D",
"case_name": "VisitRotterdam"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "Map2D",
"case_name": "wtf"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "Map2D",
"case_name": "VisitLondon"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "Map2D",
"case_name": "VisitRotterdam"
},
{
"id": 29,
"settings_name": "Test Default",
"layer_name": "Map2D",
"case_name": "wtf"
},
{
"id": 28,
"settings_name": "QA Default",
"layer_name": "Map2D",
"case_name": "VisitLondon"
},
{
"id": 28,
"settings_name": "QA Default",
"layer_name": "Map2D",
"case_name": "wtf"
},
{
"id": 27,
"settings_name": "QA Default",
"layer_name": "Map2D",
"case_name": "VisitLondon"
},
{
"id": 27,
"settings_name": "QA Default",
"layer_name": "Map2D",
"case_name": "VisitRotterdam"
}
]
How I want the array/JSON to be in the end:
[
{
"id": 27,
"settings_name": "QA Default",
"layer_names": [
{
"layer_name": "OpenStreetMapService"
},
{
"layer_name": "Map2D"
}
],
"case_names": [
{
"case_name": "VisitLondon"
},
{
"case_name": "VisitRotterdam"
}
]
},
{
"id": 28,
"settings_name": "QA Default",
"layer_names": [
{
"layer_name": "OpenStreetMapService"
},
{
"layer_name": "Map2D"
}
],
"case_names": [
{
"case_name": "VisitLondon"
},
{
"case_name": "wtf"
}
]
},
{
"id": 29,
"settings_name": "Test Default",
"layer_names": [
{
"layer_name": "OpenStreetMapService"
},
{
"layer_name": "Map2D"
}
],
"case_names": [
{
"case_name": "VisitLondon"
},
{
"case_name": "VisitRotterdam"
},
{
"case_name": "wtf"
}
]
}
]
Upvotes: 0
Views: 694
Reputation: 324551
I suggest doing this on the PostgreSQL side, by constructing and returning a json object with your nested data instead of returning a join relation.
You can do this fairly easily with the json support in 9.3 and above. Without sample data and schema I can't really convert your query for you, but you'll want to start with json_agg
and row_to_json
. Have a look for other related questions with answers referencing those functions too.
Here's a simple example:
CREATE TABLE parent(
id integer primary key,
parentdata text
);
CREATE TABLE child(
id integer primary key,
parent_id integer not null references parent(id),
childdata text
);
INSERT INTO parent(id, parentdata) VALUES
(1, 'p1'), (2, 'p2'), (3, 'p3');
INSERT INTO child(id, parent_id, childdata)
VALUES
(10, 1, 'c1_10'),
(20, 2, 'c2_20'),
(21, 2, 'c2_21');
SELECT row_to_json(toplevel, true)
FROM (
SELECT p.id, p.parentdata, json_agg(c) AS children
FROM parent p
LEFT OUTER JOIN child c ON (p.id = c.parent_id) GROUP BY p.id
) toplevel;
which emits:
{"id":1,
"parentdata":"p1",
"child":[{"id":10,"parent_id":1,"childdata":"c1_10"}]}
{"id":2,
"parentdata":"p2",
"child":[{"id":20,"parent_id":2,"childdata":"c2_20"},
{"id":21,"parent_id":2,"childdata":"c2_21"}]}
{"id":3,
"parentdata":"p3",
"children":[null]}
If you had to de-duplicate, you'd probably want to do so by ensuring that your ORDER BY
clause in the SQL was fully specified, so that the rows were ordered from outer to inner object. Then you could scan the table linearly in js and ignore rows where the id for an object is the same as the id for the same entity in the previous row. It's a pretty simple single for loop with some state tracking variables. The downside is that PostgreSQL has to sort the result set.
Upvotes: 1