John
John

Reputation: 63

Subquery to improve performance and maintainability

I have a query like this:

select Table1.column1 AS CODE, COUNT(DINSTINCT(Table2.column1 || '|' || Table2.column2)) AS COUNT
FROM   Table2
INNER JOIN Table3 ON Table3.referenceColumn = Table2.referenceColumn
INNER JOIN Table1 ON Table1.referenceColumn = Table2.referenceColumn
WHERE
....
AND Table1.column1 <> ''

The output of the query will be something like this, basically a CODE and the respective COUNT, for example:

CODE    COUNT
ref002   3
ref003   1

After this first query, I have a foreach that will iterate the result of the query above. Inside the foreach, for each result of the query above, I want to get some information that is available in Table3, basically, I want all the values in the Table3.column1 for each CODE (Table1.Column1). So inside the foreach I have another query to get the Table3.column for each iterated result:

select Table3.column1
FROM   Table3
INNER JOIN Table3 ON Table3.referenceColumn = Table2.referenceColumn
INNER JOIN Table1 ON Table1.referenceColumn = Table2.referenceColumn
WHERE .... 
AND Table1.column1 = (equal to a parameter (Table1.column1) that is available in each foreach iteration)

And like this, I can get all the Table3.column1 values for each Table1.column1 (for each CODE) of the first query.

Doubt

Both queries are almost the same. The only difference between the query before the foreach and the query inside the foreach is in the SELECT part and in the WHERE, basically the where just have an additional condition. So it doesn't seem very good in terms of performance and maintainability has the 2 queries because both queries are almost the same. So it should be possible to get all the necessary information in the first query instead of having a second query inside the foreach.

Do you know what is necessary to change in the first query to, besides return the CODE and COUNT, also return all values in the Table3.column1 for each CODE? So that is possible to remove the query inside the foreach and get everything that is needed only one query (the 1st query)? The necessary output for the first query should be something like:

CODE     COUNT     IDNUMBERS
ref002    3         ab00, cd00
ref003    1         ef00

Maybe a subquery in the select clause can solve this, but I'm not understanding how to properly use a subquery for this. Is its necessary to use the first query and just put the full second query as a subquery inside the select clause? Like:

select 
Table1.column1 AS CODE, 
COUNT(DINSTINCT(Table2.column1 || '|' || Table2.column2)) AS COUNT, 
(select Table3.column1 
FROM Table3 INNER JOIN Table3 ON Table3.referenceColumn = Table2.referenceColumn 
INNER JOIN Table1 ON Table1.referenceColumn = Table2.referenceColumn 
WHERE .... 
AND Table3.column1 = Table1.column1) AS IDNUMBERS 

FROM Table2 
INNER JOIN Table3 ON Table3.referenceColumn = Table2.referenceColumn 
INNER JOIN Table1 ON Table1.referenceColumn = Table2.referenceColumn 
WHERE .... 

Upvotes: 0

Views: 58

Answers (1)

Alex Poole
Alex Poole

Reputation: 191265

You might be looking for the collect() function, which gives you a nested table as part of your main result set.

Here's an example using default HR schema data:

select d.department_id,
  count(e.employee_id) as cnt,
  cast(collect(e.first_name) as sys.odcivarchar2list) as names
from departments d
left join employees e on e.department_id = d.department_id
group by d.department_id;

DEPARTMENT_ID        CNT NAMES                                                                           
------------- ---------- --------------------------------------------------------------------------------
           10          1 ODCIVARCHAR2LIST('Jennifer')                                                    
           20          2 ODCIVARCHAR2LIST('Michael', 'Pat')                                              
           30          6 ODCIVARCHAR2LIST('Den', 'Karen', 'Guy', 'Sigal', 'Shelli', 'Alexander')         
           40          1 ODCIVARCHAR2LIST('Susan')                                                       
...

You could also use acursor() expression to include a subquery for each result row, which is closer to your original suggestion:

select d.department_id,
  count(e.employee_id) as cnt,
  cursor(
    select e2.first_name
    from employees e2
    where e2.department_id = d.department_id
  ) as names
from departments d
left join employees e on e.department_id = d.department_id
group by d.department_id;

SQL Developer presents the results of that (when run as a script) as:

DEPARTMENT_ID        CNT NAMES                                                                           
------------- ---------- --------------------------------------------------------------------------------
           10          1 CURSOR STATEMENT : 3                                                            

CURSOR STATEMENT : 3

FIRST_NAME          
--------------------
Jennifer
           20          2 CURSOR STATEMENT : 3                                                            

CURSOR STATEMENT : 3

FIRST_NAME          
--------------------
Michael
Pat
           30          6 CURSOR STATEMENT : 3                                                            

CURSOR STATEMENT : 3

FIRST_NAME          
--------------------
Den
Alexander
Shelli
Sigal
Guy
Karen

6 rows selected. 

...

In this case as you're accessing the same table(s) in the outer query and subquery that might be less efficient; on the other hand you're pulling the same data blocks so it might not matter. It could be simpler for you to handle the ref cursor than a collection though. It somewhat depends on how you're processing the results now.


Your original suggestion is basically that query without the cursor keyword; but that will error because a scalar subquery expression must return a single value:

select d.department_id,
  count(e.employee_id) as cnt,
  (
    select e2.first_name
    from employees e2
    where e2.department_id = d.department_id
  ) as names
from departments d
left join employees e on e.department_id = d.department_id
group by d.department_id;

ORA-01427: single-row subquery returns more than one row

... unless I (very artificially) restrict it to departments I know in advance will only return one row in the subquery:

select d.department_id,
  count(e.employee_id) as cnt,
  (
    select e2.first_name
    from employees e2
    where e2.department_id = d.department_id
  ) as names
from departments d
left join employees e on e.department_id = d.department_id
where d.department_id in (10, 40)
group by d.department_id;

DEPARTMENT_ID        CNT NAMES                                                                           
------------- ---------- --------------------------------------------------------------------------------
           10          1 Jennifer                                                                        
           40          1 Susan                                                                           

Upvotes: 1

Related Questions