Reputation: 11
I have a postgres database of jobs, locations and countries, where jobs <=> locations
is many-to-many, jobs <=> countries
is many-to-many, and locations <= countries
is one-to-many. A job can belong to a country through 2 different paths:
jobs <=> jobs_locations <=> locations <=> countries
jobs <=> jobs_countries <=> countries
This is my current query:
select
jobs.*,
COALESCE(json_agg(locations) filter (where locations.id IS NOT NULL), '[]') as locations,
COALESCE(json_agg(countries) filter (where countries.id IS NOT NULL), '[]') as countries
from jobs
left join jobs_locations on jobs.id = jobs_locations.job_id
left join locations on jobs_locations.location_id = locations.id
left join countries loc_country on locations.country_id = loc_country.id
left join jobs_countries on jobs.id = jobs_countries.job_id
left join countries on jobs_countries.country_id = countries.id
where loc_country.id = 2 OR countries.id = 2
group by jobs.id;
I'm currently getting the data in this format:
{
id: 1,
...
locations: [
{
id: 10,
name: "Paris",
country_id: 100
}
]
}
I want to get a job in this format:
{
id: 1,
...
locations: [
{
id: 10,
name: "Paris",
country: {
id: 100,
name: "France"
}
}
]
}
How can I do this? It seems like a simple query but I've been trying for hours to get this data in the right format, what am I missing here?
Upvotes: 1
Views: 46
Reputation: 658947
Many ways lead to Rome.
This returns all jobs, and only appends locations and/or countries for country_id = 2
. Should be among the fastest ways:
SELECT j.*
, coalesce(l.locations, '[]') AS locations
, coalesce(c.countries, '[]') AS countries
FROM jobs j
LEFT JOIN (
SELECT jl.job_id AS id
, jsonb_agg(l) - 'country_id' AS locations -- !!!
FROM jobs_locations jl
JOIN (
SELECT l.*, to_jsonb(c) AS country -- subquery to prepare row !!!
FROM locations l
JOIN countries c ON c.id = l.country_id
WHERE l.country_id = 2
) l ON l.id = jl.location_id
GROUP BY 1
) l USING (id)
LEFT JOIN (
SELECT jc.job_id AS id
, jsonb_agg(c) AS countries
FROM jobs_countries jc
JOIN countries c ON c.id = jc.country_id
WHERE jc.country_id = 2
GROUP BY 1
) c USING (id);
First of, your original seems incorrect to begin with. Multiple uncorrelated LEFT JOIN
instances result in multiplication of rows. See:
And then you add a WHERE
condition on "right" tables, which counteracts all the LEFT JOIN
clauses ... See:
I fixed that before adding an answer to your question. Note how I use jsonb
instead of json
to allow convenient operators. Details for that solution:
One side-effect: keys appear in jsonb
-specific sort order.
At least one of the of the joins must find a row. (That's how I understand your comment.)
You could simply append your (adapted) original WHERE
clause in the outer SELECT
:
...
WHERE 2 IN (l.id, c.id);
But try this more elegant rewrite with a FULL JOIN
between locations and countries, followed by a JOIN
to jobs
. If the subqueries are selective (like it seems), that should be (much) faster because it can filter jobs right away, during the join on the PK column id
:
SELECT j.*
, coalesce(l.locations, '[]') AS locations
, coalesce(c.countries, '[]') AS countries
FROM (
SELECT jl.job_id AS id
, jsonb_agg(l) - 'country_id' AS locations
FROM jobs_locations jl
JOIN (
SELECT l.*, to_jsonb(c) AS country
FROM locations l
JOIN countries c ON c.id = l.country_id
WHERE l.country_id = 2
) l ON l.id = jl.location_id
GROUP BY 1
) l
FULL JOIN (
SELECT jc.job_id AS id
, jsonb_agg(c) AS countries
FROM jobs_countries jc
JOIN countries c ON c.id = jc.country_id
WHERE jc.country_id = 2
GROUP BY 1
) c USING (id)
JOIN jobs j USING (id);
You should be comfortable with all join clauses and NULL
handling to understand subtleties. In particular, matching column aliases (id
) and the USING
clause in the FULL [OUTER] JOIN
are essential to produce the single, combined id
column for the subsequent [INNER] JOIN
. This is one of the rare cases where the USING
clause is essential.
(Well, we could spell out JOIN jobs j ON j.id = COALESCE(l.id, c.id)
but that's far less elegant.)
Upvotes: 1