danihodovic
danihodovic

Reputation: 1245

Subqueries: What am I doing fundamentally wrong?

I thought that selecting values from a subquery in SQL would only yield values from that subset until I found a very nasty bug in code. Here is an example of my problem.

  1. I'm selecting the rows that contain the latest(max) function by date. This correctly returns 4 rows with the latest check in of each function.

    select *, max(date) from cm where file_id == 5933 group by function_id;
    
file_id   function_id   date             value  max(date)
5933      64807         1407941297       1      1407941297
5933      64808         1407941297       11     1407941297
5933      895175        1306072348              1306072348
5933      895178        1363182349              1363182349 
  1. When selecting only the value from the subset above, it returns function values from previous dates, i.e. rows that don't belong in the subset above. You can see the result below where the dates are older than in the first subset.

    select temp.function_id, temp.date, temp.value
    from (select *, max(date)
          from cm
          where file_id 5933
          group by function_id) as temp;
    
function_id   date          value

64807         1306072348    1   <-outdated row, not in first subset
64808         1306072348    17  <-outdated row, not in first subset
895175        1306072348        
895178        1363182349

What am I doing fundamentally wrong? Shouldn't selects performed on subqueries only return possible results from those subqueries?

Upvotes: 0

Views: 70

Answers (3)

danihodovic
danihodovic

Reputation: 1245

Perhaps my question was not formulated correctly, but this post had the solutions I was essentially looking for:

https://stackoverflow.com/a/123481/2966951

https://stackoverflow.com/a/121435/2966951

Filtering out the most recent row was my problem. I was surprised that selecting from a subquery with a max value could yield anything other than that value.

Upvotes: 0

CL.
CL.

Reputation: 180010

SQLite allows you to use MAX() to select the row to be returned by a GROUP BY, but this works only if the MAX() is actually computed. When you throw the max(date) column away, this no longer works.

In this case, you actually want to use the date value, so you can just keep the MAX():

SELECT function_id,
       max(date) AS date,
       value
FROM cm
WHERE file_id = 5933
GROUP BY function_id

Upvotes: 1

user804018
user804018

Reputation:

You seem to be missing the fact that your subquery is returning ALL rows for the given file_id. If you want to restrict your subquery to recs with the most recent date, then you need to restrict it with a WHERE NOT EXISTS clause to check that no more recent records exist for the given condition.

Upvotes: 0

Related Questions