Reputation: 481
An earlier question covers how to use jOOQ's MULTISET_AGG
feature to order a query based on an aggregate function over a nested collection: use GROUP BY
to group the query by all the selected columns from the parent table, rather than trying to exploit implementation details of MULTISET
.
Is there an approach that works with multi-level collections? I'm using PostgreSQL but obviously if there's an engine-independent way to do it, even better.
For example, if I have a nested query using MULTISET
:
var results =
create
.select(
ORGANIZATIONS.ID,
ORGANIZATIONS.NAME,
multiset(
select(
DEPARTMENTS.ID,
DEPARTMENTS.NAME,
multiset(
select(EMPLOYEES.ID, EMPLOYEES.SALARY, EMPLOYEES.NAME)
.from(EMPLOYEES)
.where(EMPLOYEES.DEPARTMENT_ID.eq(DEPARTMENTS.ID)))))
.from(DEPARTMENTS)
.where(DEPARTMENTS.ORGANIZATION_ID.eq(ORGANIZATIONS.ID)))
.from(ORGANIZATIONS)
.fetch();
I can easily sort each department's employees by salary if I put an ORDER BY
in the innermost query.
multiset(
select(EMPLOYEES.ID, EMPLOYEES.SALARY, EMPLOYEES.NAME)
.from(EMPLOYEES)
.where(EMPLOYEES.DEPARTMENT_ID.eq(DEPARTMENTS.ID))
.orderBy(EMPLOYEES.SALARY.desc()))))
But how do I also sort the list of departments by the highest employee salary, and the list of organizations by the highest employee salary across its departments?
Changing the MULTISET
s into MULTISET_AGG
s and only doing the ORDER BY
on the top level of the query, as in the earlier question with just one level of nesting, seems like it can't work here because you'd need to group by two different sets of columns at the same time, and also because you can't nest aggregate functions (at least on PostgreSQL.)
Upvotes: 2
Views: 535
Reputation: 220987
I can't think of a simple way to aggregate these sums without accessing some tables twice. If I do, I'll update this answer. But here's a solution that answers your question:
var results = create
.select(
ORGANIZATIONS.ID,
ORGANIZATIONS.NAME,
multiset(
select(
DEPARTMENTS.ID,
DEPARTMENTS.NAME,
// Employees per department sorted by salary
multisetAgg(EMPLOYEES.ID, EMPLOYEES.SALARY, EMPLOYEES.NAME)
.orderBy(EMPLOYEES.SALARY.desc()))
.from(DEPARTMENTS)
.leftJoin(EMPLOYEES)
.on(EMPLOYEES.DEPARTMENT_ID.eq(DEPARTMENTS.ID))
.where(DEPARTMENTS.ORGANIZATION_ID.eq(ORGANIZATIONS.ID))
.groupBy(DEPARTMENTS.ID, DEPARTMENTS.NAME)
// Departments sorted by max employee salary
.orderBy(max(EMPLOYEES.SALARY).desc()))
.from(ORGANIZATIONS)
// Organisations sorted by max employee salary via extra join
.orderBy(field(
select(max(EMPLOYEES.SALARY))
.from(EMPLOYEES)
.where(EMPLOYEES.departments().ORGANIZATION_ID.eq(ORGANIZATION.ID))
).desc())
.fetch();
Another solution would be to sort the outer result only after querying:
var results = create
.select(
ORGANIZATIONS.ID,
ORGANIZATIONS.NAME,
multiset(
select(
DEPARTMENTS.ID,
DEPARTMENTS.NAME,
multisetAgg(EMPLOYEES.ID, EMPLOYEES.SALARY, EMPLOYEES.NAME)
.orderBy(EMPLOYEES.SALARY.desc()),
max(EMPLOYEES.SALARY))
.from(DEPARTMENTS)
.leftJoin(EMPLOYEES)
.on(EMPLOYEES.DEPARTMENT_ID.eq(DEPARTMENTS.ID))
.where(DEPARTMENTS.ORGANIZATION_ID.eq(ORGANIZATIONS.ID))
.groupBy(DEPARTMENTS.ID, DEPARTMENTS.NAME)
.orderBy(max(EMPLOYEES.SALARY).desc()))
.from(ORGANIZATIONS)
.fetch()
.sortDesc(Comparator.comparing(
// r.value3() is the outer multiset()
r -> r.value3().isEmpty()
? 0
// r.value3().get(0).value4() is the top max(EMPLOYEES.SALARY) value
: r.value3().get(0).value4()
));
It would be possible to define a new operator that extracts a value again from a MULTISET
, without depending on its SQL/XML or SQL/JSON serialisation implemetation. Or, you could roll your own using plain SQL templating, knowing what the serialisation implementation looks like (knowing that it might change subtly, in the future)
Upvotes: 1