Nate Cook
Nate Cook

Reputation: 8585

How to avoid repeating a lengthy CORRELATED subquery?

Here is the code to help you understand my question:

create table con ( content_id number);
create table mat ( material_id number, content_id number, resolution number, file_location varchar2(50), file_size number);
create table con_groups (content_group_id number, content_id number);

insert into con values (99);
insert into mat values (1, 99, 7, 'C:\foo.jpg', 1024);
insert into mat values (2, 99, 2, '\\server\xyz.mov', 350000);
insert into mat values (3, 99, 5, '\\server2\xyz.wav', 175000);
insert into con values (100);
insert into mat values (4, 100, 5, 'C:\bar.png', 2048);
insert into mat values (5, 100, 3, '\\server\xyz.mov', 27400);
insert into mat values (6, 100, 7, '\\server2\xyz.wav', 400);

insert into con_groups values (10, 99);
insert into con_groups values (10, 100);

SELECT m.material_id, 
       m.content_id,
       (SELECT max(file_location) keep (dense_rank first order by resolution desc)                  
          FROM mat
         WHERE mat.content_id = m.content_id
        /* AND ...
           AND ...
           AND ... */) special_mat_file_location,
       (SELECT max(file_size) keep (dense_rank first order by resolution desc)
          FROM mat
         WHERE mat.content_id = m.content_id
        /* AND ...
           AND ...
           AND ... */) special_mat_file_size
  FROM mat m
 WHERE m.material_id IN (select material_id
                           from mat
                     inner join con on con.content_id = mat.content_id
                     inner join con_groups on con_groups.content_id = con.content_id
                          where con_groups.content_group_id = 10);

I put the commented ANDs to highlight that this is a simplified example; the subquery in my real query is more complex with more criteria.

My problem is: I want to avoid repeating all of the criteria in the subquery for both columns (file_location and file_size) because the criteria is exactly the same. I would gladly use Common Table Expressions (i.e. subquery factoring using the WITH clause) but I can't because of the "WHERE mat.content_id = m.content_id" in the subquery, which makes it a correlated subquery. It is my understanding that one cannot factor correlated subqueries using the WITH clause. For that same reason I also cannot put this subquery as an inline view (aka derived table) in the FROM clause.

How can I include the criteria once and inject more than one column into the resultset with a correlated subquery?

Upvotes: 1

Views: 4498

Answers (2)

OMG Ponies
OMG Ponies

Reputation: 332601

Use Subquery Factoring (that's what Oracle calls it - it's called a Common Table Expression in SQL Server). Supported 9i+:

WITH file AS (
  SELECT t.content_id,
         MAX(t.file_location) keep (DENSE_RANK t.first ORDER BY t.resolution DESC) 'fileLocation',
         MAX(t.file_size) keep (DENSE_RANK t.first ORDER BY t.resolution DESC) 'fileSize'
    FROM mat t
GROUP BY t.content_id)
SELECT m.material_id, 
       m.content_id,
       f.fileLocation,
       f.fileSize
  FROM mat m
  JOIN file f ON f.content_id = m.content_id

It's intended for inline view re-use. You generalize the view, and define specific filteration for differing instances in the JOIN clause. You need to expose columns for joining in the SELECT clause - see the content_id as an example.

A correlated subquery means it can be re-written as a JOIN - the correlation is the JOIN criteria.

You can define multiple views in Subquery Factoring - if you provided more detail I could tailor the answer better.

Upvotes: 2

Nate Cook
Nate Cook

Reputation: 8585

I think I got it. I was able to just grab the primary key of the "special" material and wrap the main query in another layer, like so:

         SELECT x.material_id,
                x.content_id,
                sp_m.file_location,
                sp_m.file_size
           FROM (SELECT m.material_id, 
                        m.content_id,
                        (SELECT max(material_id) keep (dense_rank first order by resolution desc)                  
                           FROM mat
                          WHERE mat.content_id = m.content_id
                         /* AND ...
                            AND ...
                            AND ... */) special_mat_primary_key
                   FROM mat m
                  WHERE m.material_id IN (select material_id
                                            from mat
                                      inner join con on con.content_id = mat.content_id
                                      inner join con_groups on con_groups.content_id = con.content_id
                                           where con_groups.content_group_id = 10)) x
LEFT OUTER JOIN mat sp_m ON sp_m.material_id = x.special_mat_primary_key;

Let me know what you think.

Upvotes: 0

Related Questions