SQL query get lowest value from related record, subquery

Question

I have the following tables in a mysql database:

mechanics: id, name, distance
mechanic_zones: mechanic_id, zone, radius

The mechanics table records the mechanics, id, name, and their distance from a job (this is actually calculated using the postcode of the mechanic and the postcode of the job but i've simplified this for sake of clarity)

The mechanic_zones table allows the mechanics to define radii for their zones and is used to price a job

mechanics:
1, Jon, 5.4
2, Paul, 6.5
3, George, 20


mechanic_zones:
1, a, 5
1, b, 10
1, c, 20
2, a, 10
2, b, 20
2, c, 50
3, a, 5
3, b, 10
3, c, 15

Jon has his zones defined as: a - 5 miles, b - 10 miles, and c - 20 miles.

Paul has his zones defined as: a - 10 miles, b - 20 miles and c - 50 miles.

George has his zones defined as: a - 5 miles, b - 10 miles, and c - 15 miles.

I want to be able to find the lowest zone for a mechanic for a job. In the example Jon is 5.4 miles from the job, Paul is 6.5, and George 20.

So the query should return something like:

mechanic_id, name, zone, distance
2, Paul, A, 6.5
1, Jon, B, 5.4

The job is in Paul's A zones because at 6.5 miles its within his 10 miles defined as zone A.

Its in Jon's B zone because its greater than his 5 miles A zone but less than his 10 miles limit for his B zone.

It's out of zone for George as its greater than his 20 miles C zone.

This is as far as I've got:

SELECT id, name, (distance * 1) as distance_to_job, min(mz.`zone`) as min_zone, min(mz.radius) as min_radius, max(mz.`zone`) as max_zone, max(mz.radius) as max_radius
FROM mechanics m, mechanic_zones mz 
WHERE m.id = mz.mechanic_id 
GROUP BY m.id, postcode 
HAVING distance_to_job < max_radius 
ORDER BY distance_to_job ASC, radius ASC

Which (i think) gives me all the mechanics that are within zone but doesn't actually figure out which zone the distance is in.

Any help much appreciated

GarethD · Accepted Answer

You need to use an additional subquery to find out what the minimum radius is per mechanic (where the radius is greater than the distance), and then you can join this back to your two tables and get all the column information you need from the two tables:

SELECT  m.ID, mz.Zone, m.distance, mz.radius
FROM    Mechanics m
        INNER JOIN mechanic_zones mz
            ON mz.Mechanic_ID = m.ID
        INNER JOIN
        (   SELECT  m.ID, 
                    MIN(mz.radius) AS radius
            FROM    Mechanics m
                    INNER JOIN mechanic_zones mz
                        ON mz.Mechanic_ID = m.ID
            WHERE   mz.radius > M.distance
            GROUP BY m.ID
        ) MinZone
            ON MinZone.ID = m.ID
            AND MinZone.radius= mz.radius
ORDER BY mz.Zone;

Example on SQL Fiddle

If you don't actually want to know the radius of the selected zone, and the zone with the lowest radius will always have the lowest letter you can just use:

SELECT  m.ID, mz.MinZone, m.distance
FROM    Mechanics m
        INNER JOIN
        (   SELECT  m.ID, 
                    MIN(mz.Zone) AS Zone
            FROM    Mechanics m
                    INNER JOIN mechanic_zones mz
                        ON mz.Mechanic_ID = m.ID
            WHERE   mz.radius > M.distance
            GROUP BY m.ID
        ) MinZone
            ON MinZone.ID = m.ID
ORDER BY MinZone.Zone;

Example on SQL Fiddle

EDIT

Your fiddle is very close to what I would use, but I would use the following so that the calculation is only done once:

SELECT  m.id, m.name, m.distance, m.radius, m.zone
FROM    (   SELECT  m.ID, 
                    m.Name,
                    m.Distance,
                    MIN(mz.radius) AS radius
            FROM    (   SELECT  ID, Name, (1 * Distance) AS Distance
                        FROM    Mechanics 
                    ) m
                    INNER JOIN mechanic_zones mz
                        ON mz.Mechanic_ID = m.ID
            WHERE   mz.radius > M.distance
            GROUP BY m.ID, m.Name, m.Distance
        ) m
        INNER JOIN  mechanic_zones mz
            ON mz.Mechanic_ID = m.ID
            AND mz.radius = m.radius;

Example on SQL Fiddle

The reasoning behind this that your query has columns in the select list and not in a group by, so there is no guarantee that the radius returned will be lowest one. For example if you change the order in which the records are inserted to mechanic_zones (as in this fiddle) you results become:

ID  NAME    DTJ     RADIUS  ZONE
1   Jon     2       10      a
2   Paul    11      50      b
3   George  5       5       a

Instead of

ID  NAME    DTJ     RADIUS  ZONE
1   Jon     2       5       a
2   Paul    11      20      b
3   George  5       5       a

As you can see the radius for Jon is wrong. To explain this further below is an extract of an explanation I have written about the short comings of MySQL's implentation of implicit grouping.

I would advise to avoid the implicit grouping offered by MySQL where possible, by this i mean including columns in the select list, even though they are not contained in an aggregate function or the group by clause.

Imagine the following simple table (T):

ID  | Column1 | Column2  |
----|---------+----------|
1   |    A    |    X     |
2   |    A    |    Y     |

In MySQL you can write

SELECT  ID, Column1, Column2
FROM    T
GROUP BY Column1;

This actually breaks the SQL Standard, but it works in MySQL, however the trouble is it is non-deterministic, the result:

ID  | Column1 | Column2  |
----|---------+----------|
1   |    A    |    X     |

Is no more or less correct than

ID  | Column1 | Column2  |  
----|---------+----------|
2   |    A    |    Y     |

So what you are saying is give me one row for each distinct value of Column1, which both results sets satisfy, so how do you know which one you will get? Well you don't, it seems to be a fairly popular misconception that you can add and ORDER BY clause to influence the results, so for example the following query:

SELECT  ID, Column1, Column2
FROM    T
GROUP BY Column1
ORDER BY ID DESC;

Would ensure that you get the following result:

ID  | Column1 | Column2  |  
----|---------+----------|
2   |    A    |    Y     |

because of the ORDER BY ID DESC, however this is not true (as demonstrated here).

The MMySQL documents state:

The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.

So even though you have an order by this does not apply until after one row per group has been selected, and this one row is non-determistic.

The SQL-Standard does allow columns in the select list not contained in the GROUP BY or an aggregate function, however these columns must be functionally dependant on a column in the GROUP BY. For example, ID in the sample table is the PRIMARY KEY, so we know it is unique in the table, so the following query conforms to the SQL standard and would run in MySQL and fail in many DBMS currently (At the time of writing Postgresql is the closest DBMS I know of to correctly implementing the standard):

SELECT  ID, Column1, Column2
FROM    T
GROUP BY ID;

Since ID is unique for each row, there can only be one value of Column1 for each ID, one value of Column2 there is no ambiguity about what to return for each row.

SQL query get lowest value from related record, subquery

Answers (1)

Related Questions