Reputation: 378
Short version: I'd like to query the result of another query in order to select a more limited result set. However, adding a where clause rewrites the first query rather than operating on its results, so I don't get the answers I need.
The detail: I have two models, checks and ticks. Checks has_many ticks.
The first query uses DISTINCT ON and gathers all of the 'checks' and all of the related ticks but only returns the most recent tick. I have that working as a scope in the model.
In my controller,
def checklist
  # Filter the results by scope, or return all checks with their latest tick
  case params[:filter]
  when "duebylastresult"
    @checks = Check.mostrecenttickonly.duebylastresult
  when "duebydate"
    @checks = Check.mostrecenttickonly.duebydate
  else
    @checks = Check.mostrecenttickonly
  end
end
In the model, the first scope (working):
scope :mostrecenttickonly, -> {
  includes(:ticks)
    .order("checks.id, ticks.created_at DESC")
    .select("DISTINCT ON (checks.id) *").references(:ticks)
}
Generates the following SQL:
Parameters: {"filter"=>""}
SQL (1.0ms) SELECT DISTINCT ON (checks.id) *,
"checks"."id" AS t0_r0,
"checks"."area" AS t0_r1, "checks"."frequency" AS t0_r2,
"checks"."showinadvance" AS t0_r3, "checks"."category" AS t0_r4,
"checks"."title" AS t0_r5, "checks"."description" AS t0_r6,
"checks"."created_at" AS t0_r7, "checks"."updated_at" AS t0_r8,
"ticks"."id" AS t1_r0, "ticks"."result" AS t1_r1,
"ticks"."comments" AS t1_r2, "ticks"."created_at" AS t1_r3,
"ticks"."updated_at" AS t1_r4, "ticks"."check_id" AS t1_r5
FROM "checks" LEFT OUTER JOIN "ticks"
ON "ticks"."check_id" = "checks"."id"
ORDER BY checks.id, ticks.created_at DESC
Having got that result, I want to show only the checks whose latest tick has a result greater than or equal to 3, so I added the scope:
scope :duebylastresult, -> { where("ticks.result >= 3") }
Generates the SQL
Parameters: {"filter"=>"duebylastresult"}
SQL (1.0ms) SELECT DISTINCT ON (checks.id) *,
"checks"."id" AS t0_r0,
"checks"."area" AS t0_r1, "checks"."frequency" AS t0_r2,
"checks"."showinadvance" AS t0_r3, "checks"."category" AS t0_r4,
"checks"."title" AS t0_r5, "checks"."description" AS t0_r6,
"checks"."created_at" AS t0_r7, "checks"."updated_at" AS t0_r8,
"ticks"."id" AS t1_r0, "ticks"."result" AS t1_r1,
"ticks"."comments" AS t1_r2, "ticks"."created_at" AS t1_r3,
"ticks"."updated_at" AS t1_r4, "ticks"."check_id" AS t1_r5
FROM "checks" LEFT OUTER JOIN "ticks"
ON "ticks"."check_id" = "checks"."id"
WHERE (ticks.result >= 3)
ORDER BY checks.id, ticks.created_at DESC
As best I can tell, the WHERE clause is applied before the DISTINCT ON clause, so I now get the 'latest tick where the result is >= 3', whilst I'm looking for the 'latest tick, THEN only where its result is >= 3'.
Hope that makes sense & Thanks in advance!
Edit - Example of what I get and what I need:
The Data:
Table Checks:
ID: 98 Title: Eire
ID: 99 Title: Land
Table Ticks:
ID: 1 CheckID: 98 Result:1 Date: Jan12
ID: 2 CheckID: 98 Result:5 Date: Feb12
ID: 3 CheckID: 98 Result:1 Date: Mar12
ID: 4 CheckID: 99 Result:4 Date: Apr12
First query returns the most recent result, like;
Check.ID: 98 Tick.ID: 3 Tick.Result: 1 Tick.Date: Mar12
Check.ID: 99 Tick.ID: 4 Tick.Result: 4 Tick.Date: Apr12
Second query currently returns the most recent tick where the result is >= 3, like:
Check.ID: 98 Tick.ID: 2 Tick.Result: 5 Tick.Date: Feb12
Check.ID: 99 Tick.ID: 4 Tick.Result: 4 Tick.Date: Apr12
When I really want:
Check.ID: 99 Tick.ID: 4 Tick.Result: 4 Tick.Date: Apr12
(ID 98 doesn't show as the last Tick.Result is 1).
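To make the difference concrete, here is a plain-Ruby sketch over the sample data above (no ActiveRecord or database involved). The `Tick` struct, the `month` field (1 = Jan12 through 4 = Apr12, standing in for `ticks.created_at`) and the method names are assumptions made up for illustration only.

```ruby
# Sample ticks from the tables above. `month` is a hypothetical stand-in
# for ticks.created_at (1 = Jan12 ... 4 = Apr12).
Tick = Struct.new(:id, :check_id, :result, :month)

TICKS = [
  Tick.new(1, 98, 1, 1),
  Tick.new(2, 98, 5, 2),
  Tick.new(3, 98, 1, 3),
  Tick.new(4, 99, 4, 4)
].freeze

# Latest tick per check, roughly what DISTINCT ON (checks.id) with
# ORDER BY ... created_at DESC gives.
def latest_per_check(ticks)
  ticks.group_by(&:check_id).map { |_check_id, group| group.max_by(&:month) }
end

# What the chained scopes currently do: filter rows first, then take the latest.
def filter_then_latest(ticks)
  latest_per_check(ticks.select { |t| t.result >= 3 })
end

# What is wanted: take the latest tick first, then filter on its result.
def latest_then_filter(ticks)
  latest_per_check(ticks).select { |t| t.result >= 3 }
end

p filter_then_latest(TICKS).map(&:id)  # tick ids for checks 98 and 99
p latest_then_filter(TICKS).map(&:id)  # tick id for check 99 only
```

The first method mirrors the current behaviour (tick 2 is returned for check 98), while the second mirrors the desired one (check 98 drops out because its latest tick has a result of 1).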
Upvotes: 4
Views: 192
Reputation: 5120
I'm not sure I really understand the point of the :mostrecenttickonly scope since you're just loading the checks.
That being said, if you want to get only those checks whose most recent tick has a result of 3 or more, I think the best way to do that would be a window function:
check.rb
...
scope :duebylastresult, -> {
  # ticks.created_at is the timestamp column shown in the question's generated SQL.
  # Note: find_by_sql returns an Array, so this scope cannot be chained further.
  find_by_sql(
    'SELECT *
     FROM (SELECT checks.*,
                  ticks.id AS tick_ids,
                  ticks.created_at AS tick_date,
                  ticks.result AS tick_result,
                  dense_rank() OVER (
                    PARTITION BY checks.id
                    ORDER BY ticks.created_at DESC
                  ) AS tick_rank
           FROM checks
           LEFT OUTER JOIN ticks ON checks.id = ticks.check_id) AS ranked_ticks
     WHERE tick_rank = 1 AND tick_result >= 3;'
  )
}
...
Basically, we're just joining everything in the checks and ticks tables, then adding another attribute called tick_rank that ranks each row in the result set according to its date versus the other rows with the same checks.id value.
The way SQL works is that the predicates (the conditions in the WHERE clause) are evaluated prior to the evaluation of the SELECT fields, meaning we can't just write tick_rank = 1 in this statement.
So we have to go the extra step of wrapping the results (which we alias as ranked_ticks) and then just select everything and apply the predicates we want to this outer select statement. The tick_rank has to be 1, meaning it's the most recent tick, and the result has to be >= 3.
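As a rough illustration of that two-step shape (rank in the inner query, filter in the outer one), here is a plain-Ruby emulation over the question's sample data. It does not touch the database; the row hashes, the `month` key (standing in for the tick date) and both method names are hypothetical.

```ruby
# Rows as the inner SELECT would see them after the join. `month` is a
# hypothetical stand-in for the tick date (1 = Jan12 ... 4 = Apr12).
ROWS = [
  { check_id: 98, tick_id: 1, tick_result: 1, month: 1 },
  { check_id: 98, tick_id: 2, tick_result: 5, month: 2 },
  { check_id: 98, tick_id: 3, tick_result: 1, month: 3 },
  { check_id: 99, tick_id: 4, tick_result: 4, month: 4 }
].freeze

# Inner query: dense_rank() OVER (PARTITION BY check_id ORDER BY date DESC).
def with_tick_rank(rows)
  rows.group_by { |r| r[:check_id] }.flat_map do |_check_id, partition|
    # Distinct dates, newest first; a row's rank is the 1-based position
    # of its date in that list, so tied dates share a rank.
    ordered = partition.map { |r| r[:month] }.uniq.sort.reverse
    partition.map { |r| r.merge(tick_rank: ordered.index(r[:month]) + 1) }
  end
end

# Outer query: WHERE tick_rank = 1 AND tick_result >= 3.
def due_by_last_result(rows)
  with_tick_rank(rows).select { |r| r[:tick_rank] == 1 && r[:tick_result] >= 3 }
end

with_tick_rank(ROWS).each { |r| puts r.inspect }
p due_by_last_result(ROWS).map { |r| r[:check_id] }
```

The rank only exists once the inner step has run, which is why the filter on it has to live in a separate outer step.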
Edit: I was using that article I linked as a refresher since I often forget SQL syntax, but after looking at it, I think this would be somewhat more performant (basically just wait to join checks until after the partitioning is done, that way I believe it will do fewer full scans):
scope :duebylastresult, -> {
  # Rank the ticks on their own first, then join the ranked rows to checks.
  # ticks.created_at is the timestamp column shown in the question's generated SQL.
  find_by_sql(
    'SELECT *
     FROM checks
     LEFT OUTER JOIN
       (SELECT id AS tick_id,
               check_id AS check_id,
               created_at AS tick_date,
               result AS tick_result,
               dense_rank() OVER (
                 PARTITION BY ticks.check_id
                 ORDER BY ticks.created_at DESC
               ) AS tick_rank
        FROM ticks) AS ranked_ticks ON checks.id = ranked_ticks.check_id
     WHERE tick_rank = 1 AND tick_result >= 3;'
  )
}
Upvotes: 0
Reputation: 2012
Could you try the following to see if it starts you in the right direction:
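scope :just_a_test, -> {
  # references(:ticks) makes Rails join ticks so the string conditions can use it.
  includes(:ticks)
    .references(:ticks)
    .order("checks.id")
    .where("ticks.created_at = (SELECT MAX(ticks.created_at) FROM ticks WHERE ticks.check_id = checks.id)")
    .where("ticks.result >= 3")
  # No GROUP BY here: the subquery already leaves one tick per check (barring
  # ties on created_at), and grouping by checks.id alone would be rejected by
  # PostgreSQL because the selected ticks columns are not grouped.
}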
Upvotes: 1