Jacobian
Jacobian

Reputation: 10802

Optimizing a simple query with join and subquery

I have two tables - object_72194_ and object_72197_ respectively:

| attr_72195_ |     | attr_72198_ | attr_72199_  |
| 2013-07-31  |     |   a         | 2013-07-31   |
| 2013-07-30  |     |   b         | 2013-07-31   |
| 2013-07-29  |     |   c         | 2013-07-30   |
| 2013-07-28  |     |   d         | 2013-07-29   |

For each row in the first table I want to get the value of the field attr_72198_ from the second table where attr_72199_ less or equal to attr_72195_. So, in this case the result will look like this:

|attr_72195_  | attr_72196_  |
|2013-07-31   | a            |
|2013-07-30   | c            |
|2013-07-29   | d            |
|2013-07-28   | NULL         | 

I want to get value per row. Now my working query looks like this:

SELECT f1.attr_72195_, t.attr_72198_ AS attr_72196_ 
FROM object_72194_ f1 
LEFT OUTER JOIN ( 
 SELECT id, attr_72198_, attr_72199_ FROM object_72197_ t 
) AS t ON t.attr_72199_ <= f1.attr_72195_ 
WHERE ( f1.id_obj = 72194 ) AND (t.attr_72199_ = (
 SELECT MAX(attr_72199_) FROM object_72197_ t 
 WHERE attr_72199_ <= f1.attr_72195_
) OR t.attr_72199_ IS NULL

) ORDER BY f1.id_order DESC

It works as expected. But it does not seem to be quite optimal because of the subquery in the last WHERE block. One programmer advised me to use one more join with grouping instead, but I just do not know how. Thank you!

EDIT: Removed unnecessary ordering inside subqueries and checked both queries (mine and with joins instead of a subquery) and got interesting result. EXPLAIN returned five rows for the version with a subquery and four rows for the version with joins (id column). In "rows" column I got 13 rows for the subquery version and 18 rows in total for join version. So, have to check against large data to decide what version to use.

EDIT: Oh, this query with joins turned out to be incorrect, because it groups results by the column

EDIT: The question is still open. The problem appears when there are duplicates in the second table. As a result, both queries return as many rows as in the second table. But I need just a value per row. Just as I showed from the very first in the example with values "a", "b", "c" and "d".

EDIT: Finally, I did it. I added grouping by unique field in the first table and returned the previous grouping. So the query now looks like this:

SELECT f1.attr_72195_, f2.attr_72198_ AS attr_72196_ 
FROM object_72194_ f1
INNER JOIN 
(
 SELECT f1.attr_72195_, MAX(f2.attr_72199_) AS attr_72199_
 FROM object_72194_ f1 
 LEFT OUTER JOIN object_72197_ f2 ON f1.attr_72195_ >= f2.attr_72199_
 GROUP BY f1.attr_72195_
) o
ON f1.attr_72195_ = o.attr_72195_
LEFT OUTER JOIN object_72197_ f2 ON f2.attr_72199_ = o.attr_72199_
GROUP BY f1.id, attr_72195_ ORDER BY f1.id_order DESC

Simple and elegant.

Upvotes: 3

Views: 137

Answers (1)

Kickstart
Kickstart

Reputation: 21513

Avoiding the correlated sub queries (not tested):-

SELECT f1.attr_72195_, MIN(f2.attr_72198_)
FROM object_72194_ f1
INNER JOIN 
(
    SELECT f1.attr_72195_, MAX(f2.attr_72199_) As Max_attr_72199_
    FROM object_72194_ f1 
    LEFT OUTER JOIN object_72197_ f2
    ON f1.attr_72195_ >= f2.attr_72199_
    GROUP BY f1.attr_72195_
) Sub1
ON f1.attr_72195_ = Sub1.attr_72195_
LEFT OUTER JOIN object_72197_ f2
ON f2.attr_72199_ = Sub1.Max_attr_72199_
GROUP BY f1.attr_72195_

Do a LEFT JOIN between the 2 tables, and get the max date from the 2nd table which is less than or equal to the first table. Inner join the results of that back to the 1st table and left join to the 2nd table. Not sure which value of attr_72198_ you want when the dates are duplicates so I have just used the min function to get the smallest one.

EDIT

Try this which should cope with duplicates on the first table.

SELECT f1.attr_72195_, f2.attr_72198_
FROM object_72194_ f1
INNER JOIN 
(
    SELECT f1.attr_72195_, MAX(f2.attr_72199_) As Max_attr_72199_
    FROM object_72194_ f1 
    LEFT OUTER JOIN object_72197_ f2
    ON f1.attr_72195_ >= f2.attr_72199_
    GROUP BY f1.attr_72195_
) Sub1
ON f1.attr_72195_ = Sub1.attr_72195_
LEFT OUTER JOIN 
(
    SELECT attr_72199_, MIN(attr_72198_) AS attr_72198_
    FROM object_72197_ 
    GROUP BY attr_72199_
) f2
ON f2.attr_72199_ = Sub1.Max_attr_72199_

Upvotes: 2

Related Questions