Why no similar ids in the results set when query with a correlated query inside where clause

Question

I have a table with columns id, forename, surname, created (date).

I have a table such as the following:

ID | Forename | Surname | Created
---------------------------------
 1 | Tom      | Smith   | 2008-01-01
 1 | Tom      | Windsor | 2008-02-01
 2 | Anne     | Thorn   | 2008-01-05
 2 | Anne     | Baker   | 2008-03-01
 3 | Bill     | Sykes   | 2008-01-20

Basically, I want this to return the most recent name for each ID, so it would return:

ID | Forename | Surname | Created
---------------------------------
 1 | Tom      | Windsor | 2008-02-01
 2 | Anne     | Baker   | 2008-03-01
 3 | Bill     | Sykes   | 2008-01-20

I get the desired result with this query.

SELECT id, forename, surname, created
FROM name n
WHERE created = (SELECT MAX(created)
                              FROM name
                              GROUP BY id
                              HAVING id = n.id);

I am getting the result I want but I fail to understand WHY THE IDS ARE NOT BEING REPEATED in the result set. What I understand about correlated subquery is it takes one row from the outer query table and run the inner subquery. Shouldn't it repeat "id" when ids repeat in the outer query? Can someone explain to me what exactly is happening behind the scenes?

Gordon Linoff · Accepted Answer

First, your subquery does not need a GROUP BY. It is more commonly written as:

SELECT n.id, n.forename, n.surname, n.created
FROM name n
WHERE n.created = (SELECT MAX(n2.created)
                   FROM name n2
                   WHERE n2.id = n.id
                  );

You should get in the habit of qualifying all column references, especially when your query has multiple table references.

I think you are asking why this works. Well, each row in the outer query is tested for the condition. The condition is: "is my created the same as the maximum created for all rows in the name table with the same id". In your data, only one row per id matches that condition, so ids are not repeated.

Why no similar ids in the results set when query with a correlated query inside where clause

Answers (2)

Related Questions