bobcat
bobcat

Reputation: 165

Fastest way to select min row with join

In this example, I have a listing of users (main_data), a pass list (pass_list) and a corresponding priority to each pass code type (pass_code). The query I am constructing is looking for a list of users and the corresponding pass code type with the lowest priority. The query below works but it just seems like there may be a faster way to construct it I am missing. SQL Fiddle: http://sqlfiddle.com/#!2/2ec8d/2/0 or see below for table details.

SELECT md.first_name, md.last_name, pl.* 
FROM main_data md
JOIN pass_list pl on pl.main_data_id = md.id 
AND
pl.id = 
  (
    SELECT pl2.id 
    FROM pass_list pl2 
    JOIN pass_code pc2 on pl2.pass_code_type = pc2.type 
    WHERE pl2.main_data_id = md.id 
    ORDER BY pc2.priority 
    LIMIT 1
  )

Results:

+------------+-----------+----+--------------+----------------+
| first_name | last_name | id | main_data_id | pass_code_type |
+------------+-----------+----+--------------+----------------+
| Bob        | Smith     |  1 |            1 | S              |
| Mary       | Vance     |  8 |            2 | M              |
| Margret    | Cough     |  5 |            3 | H              |
| Mark       | Johnson   |  9 |            4 | H              |
| Tim        | Allen     | 13 |            5 | M              |
+------------+-----------+----+--------------+----------------+

users (main_data)

+----+------------+-----------+
| id | first_name | last_name |
+----+------------+-----------+
|  1 | Bob        | Smith     |
|  2 | Mary       | Vance     |
|  3 | Margret    | Cough     |
|  4 | Mark       | Johnson   |
|  5 | Tim        | Allen     |
+----+------------+-----------+

pass list (pass_list)

+----+--------------+----------------+
| id | main_data_id | pass_code_type |
+----+--------------+----------------+
|  1 |            1 | S              |
|  3 |            2 | E              |
|  4 |            2 | H              |
|  5 |            3 | H              |
|  7 |            4 | E              |
|  8 |            2 | M              |
|  9 |            4 | H              |
| 10 |            4 | H              |
| 11 |            5 | S              |
| 12 |            3 | S              |
| 13 |            5 | M              |
| 14 |            1 | E              |
+----+--------------+----------------+

Table which specifies priority (pass_code)

+----+------+----------+
| id | type | priority |
+----+------+----------+
|  1 | M    |        1 |
|  2 | H    |        2 |
|  3 | S    |        3 |
|  4 | E    |        4 |
+----+------+----------+

Upvotes: 2

Views: 147

Answers (3)

sceaj
sceaj

Reputation: 1643

I'm not familiar with the special behavior of MySQL's group by, but my solution for these types of problems is to simply express as where there doesn't exist a row with a lower priority. This is standard SQL so should work on any DB.

select distinct u.id, u.first_name, u.last_name, pl.pass_code_type, pc.id, pc.priority 
from main_data u 
  inner join pass_list pl on pl.main_data_id = u.id
  inner join pass_code pc on pc.type = pl.pass_code_type
where not exists (select 1 
                  from pass_list pl2 
                    inner join pass_code pc2 on pc2.type = pl2.pass_code_type 
                  where pl2.main_data_id = u.id and pc2.priority < pc.priority);

How well this performs is going to depend on having the proper indexes (assuming that main_data and pass_list are somewhat large). In this case indexes on the primary (should be automatically created) and foreign keys should be sufficient. There may be other queries that are faster, I would start by comparing this to your query.

Also, I had to add distinct because you have duplicate rows in pass_list (id 9 & 10), but if you ensure that duplicates can't exist (unique index on main_data_id, pass_code_type) then you will save some time by removing the distinct which forces a final sort of the result set. This savings would be more noticeable the larger the result set is.

Upvotes: 0

Kickstart
Kickstart

Reputation: 21513

A version that will get the details as required, and should also work across different flavours of SQL

SELECT md.first_name, md.last_name, MinId, pl.main_data_id, pl.pass_code_type
FROM main_data md
INNER JOIN pass_list pl
ON md.id = pl.main_data_id
INNER JOIN pass_code pc
ON pl.pass_code_type = pc.type
INNER JOIN
(
    SELECT pl.main_data_id, pl.pass_code_type, Sub0.MinPriority, MIN(pl.id) AS MinId
    FROM pass_list pl
    INNER JOIN pass_code pc
    ON pl.pass_code_type = pc.type
    INNER JOIN
    (
        SELECT main_data_id, MIN(priority) AS MinPriority
        FROM pass_list a
        INNER JOIN pass_code b
        ON a.pass_code_type = b.type
        GROUP BY main_data_id
    ) Sub0
    ON pl.main_data_id = Sub0.main_data_id
    AND pc.priority = Sub0.MinPriority
    GROUP BY pl.main_data_id, pl.pass_code_type, Sub0.MinPriority
) Sub1
ON pl.main_data_id = Sub1.main_data_id
AND pl.id = Sub1.MinId
AND pc.priority = Sub1.MinPriority
ORDER BY pl.main_data_id

This does not rely on the flexibility of MySQLs GROUP BY functionality.

Upvotes: 0

Bohemian
Bohemian

Reputation: 424993

Due to mysql's unique extension to its GROUP BY, it's simple:

SELECT * FROM 
(SELECT md.first_name, md.last_name, pl.* 
FROM main_data md
JOIN pass_list pl on pl.main_data_id = md.id
ORDER BY pc2.priority) x
GROUP BY md.id

This returns only the first row encountered for each unique value of md.id, so by using an inner query to order the rows before applying the group by you get only the rows you want.

Upvotes: 1

Related Questions