Reputation: 3024
What's the preferred way of selecting an element from the "group" part of a groupBy? (groupBy resulting in a [key -> group] relation). If possible it should result in one db hit, or at least a constant amount of hits.
As an example, let's say we have a Locations(id) and a Companies(id, locationId, foundedDate) tables. How could we find the oldest company for each location?
I would have done it this way:
Locations.join(Companies).on(_.id === _.locationId).groupBy(_._1).flatMap {
case (location, companies) =>
companies.map(_._2).sortBy(_.foundedDate).take(1).map {
company => (location, company)
}
}
but this generates a runtime exception:
scala.slick.SlickException: Unsupported query shape containing .groupBy without subsequent .map
at scala.slick.compiler.FuseComprehensions.fuse(Relational.scala:200)
...
Upvotes: 1
Views: 2046
Reputation: 12563
Slick isn't producing the most optimal code for such queries, also if you write this query in e.g. MySQL it might be heavy (and potentially ambiguous if there are several companies at the same location which were founded at the same date). So the simpler solution might be selecting companies separately, grouping them by location and sorting by date outside of the database, and then matching them with the locations.
val locations = Locations.list.map(location => (l.id,location)).asMap
val lcPairs = Companies.list
.group(_.locationId)
.map{ case (locationId, companies) => (
locations(id),companies.sort(_.foundedDate).head
)}
Upvotes: 0
Reputation: 102
I'm not familiar with Scala or Slick, but if it is using SQL under the hood, this won't work. In most implementations, the elements returned by the query either need to be one of the category values (location and company in this case) or an aggregate function.
This might work...
SELECT location.id, company.id, MAX(foundedDate)
FROM companies INNER JOIN locations ON (locations.id = companies.locationId)
GROUP BY location.id, company.id
ORDER BY location.id, MAX(foundedDate)
... but that's functionally equivalent to doing a sort with no group by:
SELECT location.id, company.id, foundedDate
FROM companies INNER JOIN locations ON (locations.id = companies.locationId)
ORDER BY location.id, foundedDate
To find the oldest companies without any need to filter the results, we could use a correlated subquery:
SELECT location.id, company.id, foundedDate
FROM companies INNER JOIN locations ON (locations.id = companies.locationId)
WHERE foundedDate = (SELECT MAX(foundedDate from companies c2 where c2.locationId = location.id)
This could return multiple companies by location, if they have the same founding date.
How any of that maps back to Scala/Slick, I can't say, but hopefully it helps a bit.
Upvotes: 1