Andrew Breunig
Andrew Breunig

Reputation: 155

MySQL LEAST() with arbitrary number of parameters; longest match in a table

I would like to create a MySQL query to find the longest match (of a given ip address in quad-dotted format) that is present in a table of subnets.

Ultimately, I'd like to create a LEFT JOIN that will display every quad-dotted ip address in one table joined with their longest matches in another table. I don't want to create any temporary tables or structure it as a nested query.

I'm somewhat of a MySQL newbie, but what I'm thinking is something like this:

SELECT `ip_address`
  LEFT JOIN ON 
    SELECT `subnet_id`
    FROM `subnets_table`
    WHERE (`maximum_ip_value` - `minimum_ip_value`) =
    LEAST(<list of subnet intervals>)
      WHERE INET_ATON(<given ip address>) > `minimum_ip_value`
      AND INET_ATON(<given ip address>) < `maximum_ip_value`;

Such that minimum_ip_value and maximum_ip_value are the lowest and highest decimal-formatted ip addresses possible in a given subnet-- e.g., for the subnet 172.16.0.0/16:

minimum_ip_value = 2886729728 (or 172.16.0.0)
maximum_ip_value = 2886795263 (or 172.16.255.255)

And <list of subnet intervals> contains all intervals in subnets_table where <given ip address> is between minimum_ip_value and maximum_ip_value

And if more than one interval contains <given ip address>, then the smallest interval (i.e., smallest subnet, or most specific and "longest" match) is joined.

Ultimately, all I really want is the subnet_id value that corresponds with that interval.

So my questions are:

1) Can I use the LEAST() function with an arbitrary number of parameters? I'd like to compare every row of subnets_table, or more specifically, every row's interval between minimum_ip_value and maximum_ip_value, and select the smallest interval.

2) Can I perform all of this computation within a LEFT JOIN query? I'm fine with any suggestions that will be fast, encapsulated, and avoid repetitive queries of the same data.

I'm wondering if this is even possible to perform in a single query (i.e., without querying the subnets table for each ip address), but I don't know enough to rule it out. Please advise if this looks like it won't work, so I can try another angle.

Thanks.

Upvotes: 0

Views: 86

Answers (1)

Andrew Breunig
Andrew Breunig

Reputation: 155

After some research and trial & error, I see that there are a few issues with the prototype query above:

The LEAST() function takes only a set number of arguments. As per my original question, I want a function that will work on an arbitrary number of arguments, or every row in a table. That is a different function in MySQL, MIN().

The function MIN() has a lower precedence than the JOIN functions in MySQL, and is evaluated after the JOIN functions in any given query. Therefore, I can't JOIN on the MIN() of a set of values, because the MIN() doesn't exist yet at the time the JOIN is performed.

The only way I could see to solve this issue was to perform two separate queries: one with the MIN(), performed first, and another with the JOIN, performed on the results of the first query. This meant that for a table with n rows, I'd perform n^n queries, instead of n queries. That wasn't acceptable.

To work around the issue, I wrote a new script that modifies the database before any of these queries are ever performed. Each subnet is given its own "bucket" of ip values, and all values in that range map to that subnet. If a more specific (i.e., smaller) subnet overlaps a less specific (i.e., larger) subnet, then the more specific range is mapped only to the smaller subnet, and the larger subnet retains only the values from the less specific range. Now any given ip address falls into only one "bucket", and maps to only one subnet, which is its most specific match. I can JOIN on this match and never have to worry about the MIN() function.

Upvotes: 0

Related Questions