Reputation: 3058

How to efficiently retrieve data in one to many relationships

I need to run a query that returns some rows from a main table, along with an indicator of whether each row's key exists in a subtable (a one-to-many relationship).

The query might be something like this:

select a.index, (select count(1) from second_table b where a.index = b.index) 
from first_table a;

This gives me the result I want (0 = no dependent records in second_table, anything else = there are some), but it runs a subquery for every record returned from the database. I need such an indicator for at least three similar tables, and the main query is already an inner join between at least two tables.

My question is whether there is a really efficient way to handle this. I have thought of keeping track of it in a new column on first_table, but the DBA doesn't allow triggers, and maintaining the column from application code is too risky.

What would be a nice approach to solve this?

The application of this query will be for two things:

1. Indicate that at least one row in second_table exists for a given row in first_table. The indicator is shown in a list; if no row exists in the second table, the indicator stays off.
2. Search for all rows in first_table that have at least one row in second_table, or that have none.

Another option I just found:

select a.index, b.index 
from first_table a 
left join (select distinct(index) as index from second_table) b on a.index = b.index

This way I get null for b.index if it doesn't exist (the display can be adapted later; I'm concerned about query performance here).

The final objective of this question is to find a proper design approach for this kind of case. It comes up often; a real application could be a POS system that lists all clients and shows an icon next to each one indicating whether the client has open orders.

Upvotes: 5

Answers (7)

MarmiK

Reputation: 5785

I am not an expert in using CASE, but I would recommend a join.

It works even if you are using three or more tables:

SELECT t1.ID,t2.name, t3.date
FROM  Table1 t1 
LEFT OUTER JOIN Table2 t2 ON t1.ID = t2.ID
LEFT OUTER JOIN Table3 t3 ON t2.ID = t3.ID
-- WHERE t1.ID = @ProductID -- optional condition, if you want the details for a specific ID

This will help you fetch data from normalized (BCNF) tables, since normalization splits the data into separate tables according to its nature.

I hope this will do...

Upvotes: 0

Robert Co

Reputation: 1715

Or you can avoid the join altogether.

WITH comb AS (
SELECT index
     , 'N' as exist_ind
  FROM first_table
UNION ALL
SELECT DISTINCT 
       index
     , 'Y' as exist_ind
  FROM second_table
)
SELECT index
     , MAX(exist_ind) exist_ind
  FROM comb
 GROUP BY index

Upvotes: 1

jph

Reputation: 2233

Two ideas: one that doesn't involve changing your tables and one that does. First the one that uses your existing tables:

SELECT
  a.index,
  b.index IS NOT NULL,
  c.index IS NOT NULL
FROM
  a_table a
LEFT JOIN
  b_table b ON b.index = a.index
LEFT JOIN
  c_table c ON c.index = a.index
GROUP BY
  a.index, b.index, c.index

Worth noting that this query (and likely any that resemble it) will be greatly helped if b_table.index and c_table.index are either primary keys or are otherwise indexed.
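To make the indexing point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (an assumption; the question does not name an engine). The column is called idx because index is a reserved word in most dialects, and ix_b_table_idx is a made-up index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a_table (idx INTEGER PRIMARY KEY);
    CREATE TABLE b_table (idx INTEGER);
    -- Hypothetical index on the join column of the child table.
    CREATE INDEX ix_b_table_idx ON b_table (idx);
    INSERT INTO a_table VALUES (1), (2);
    INSERT INTO b_table VALUES (1), (1);
""")

query = """
    SELECT a.idx, b.idx IS NOT NULL AS in_b
    FROM a_table a
    LEFT JOIN b_table b ON b.idx = a.idx
    GROUP BY a.idx, b.idx
    ORDER BY a.idx
"""
rows = conn.execute(query).fetchall()
print(rows)

# Print the plan to check whether the index is picked up.
# The wording of the plan differs between SQLite versions.
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step[-1])
```

Whether the optimizer actually uses the index depends on the engine and on the data, so it is worth looking at the plan on your own database rather than trusting this toy example.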

Now the other idea. If you can, instead of inserting a row into b_table or c_table to indicate something about the corresponding row in a_table, indicate it directly on the a_table row. Add exists_in_b_table and exists_in_c_table columns to a_table. Whenever you insert a row into b_table, set a_table.exists_in_b_table = true for the corresponding row in a_table. Deletes are more work since in order to update the a_table row you have to check if there are any rows in b_table other than the one you just deleted with the same index. If deletes are infrequent, though, this could be acceptable.
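A rough sketch of that bookkeeping, assuming it lives in application code and that each insert or delete runs in the same transaction as the flag update. It uses Python's sqlite3 module as a stand-in database; the helper names insert_b, delete_b and flag, the surrogate key b_table.id, and the column name idx are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a_table (
        idx INTEGER PRIMARY KEY,
        exists_in_b_table INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE b_table (
        id  INTEGER PRIMARY KEY,
        idx INTEGER NOT NULL REFERENCES a_table (idx)
    );
    INSERT INTO a_table (idx) VALUES (1), (2);
""")

def insert_b(conn, idx):
    """Insert a child row and switch the parent flag on, in one transaction."""
    with conn:
        cur = conn.execute("INSERT INTO b_table (idx) VALUES (?)", (idx,))
        conn.execute(
            "UPDATE a_table SET exists_in_b_table = 1 WHERE idx = ?", (idx,))
        return cur.lastrowid

def delete_b(conn, b_id):
    """Delete a child row, then recompute the flag from the remaining rows."""
    with conn:
        row = conn.execute(
            "SELECT idx FROM b_table WHERE id = ?", (b_id,)).fetchone()
        if row is None:
            return
        conn.execute("DELETE FROM b_table WHERE id = ?", (b_id,))
        conn.execute(
            """UPDATE a_table
               SET exists_in_b_table =
                   EXISTS (SELECT 1 FROM b_table WHERE b_table.idx = ?)
               WHERE idx = ?""",
            (row[0], row[0]))

def flag(conn, idx):
    return conn.execute(
        "SELECT exists_in_b_table FROM a_table WHERE idx = ?", (idx,)
    ).fetchone()[0]

history = []
first = insert_b(conn, 1)
history.append(flag(conn, 1))
second = insert_b(conn, 1)
history.append(flag(conn, 1))
delete_b(conn, first)    # one child row is still left
history.append(flag(conn, 1))
delete_b(conn, second)   # last child row removed
history.append(flag(conn, 1))
print(history)
```

The flag is only as reliable as the code paths that write to b_table: anything that bypasses these helpers leaves it stale.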

Upvotes: 1

cosmos

Reputation: 2303

I am assuming that you can't change the table definitions, e.g. partitioning the columns.

Now, to get good performance you need to take into account the other tables that are joined to your main table.

It all depends on data demographics.

If the other joins reduce the number of rows by a large factor, you should consider doing a join between your first table and second table. This allows the optimizer to pick the best join order, i.e. first joining with the other tables and then joining the resulting rows with your second table, which improves performance.
Otherwise, you can take the subquery approach (I suggest using EXISTS, as in Mikhail's solution).
Also, you may consider creating a temporary table if you need such queries more than once in the same session.
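A minimal sketch of the temporary-table idea, using Python's sqlite3 module as a stand-in database (an assumption, as are the names second_keys and idx). The distinct keys of second_table are materialized once and then reused by both the listing query and the search query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_table  (idx INTEGER PRIMARY KEY);
    CREATE TABLE second_table (idx INTEGER);
    INSERT INTO first_table  VALUES (1), (2), (3);
    INSERT INTO second_table VALUES (1), (1), (3);

    -- Session-local snapshot of the keys that have child rows.
    CREATE TEMP TABLE second_keys (idx INTEGER PRIMARY KEY);
    INSERT INTO second_keys
        SELECT DISTINCT idx FROM second_table WHERE idx IS NOT NULL;
""")

# Use 1: list every parent row with a 0/1 indicator.
listing = conn.execute("""
    SELECT a.idx, k.idx IS NOT NULL AS has_children
    FROM first_table a
    LEFT JOIN second_keys k ON k.idx = a.idx
    ORDER BY a.idx
""").fetchall()

# Use 2: search for the parent rows that have no child rows.
without_children = [row[0] for row in conn.execute("""
    SELECT a.idx
    FROM first_table a
    LEFT JOIN second_keys k ON k.idx = a.idx
    WHERE k.idx IS NULL
    ORDER BY a.idx
""")]

print(listing)
print(without_children)
```

Keep in mind that the temporary table is a snapshot: rows added to or removed from second_table afterwards are not reflected until it is rebuilt.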

Upvotes: 0

the_slk

Reputation: 2182

The application of this query will be for two things:

Indicate that at least one row in second_table exists for a given row in first_table. It is to indicate it in a list.

To search for all rows in first_table which have at least one row in second_table.

Here you go:

SELECT  a.index, 1 as c_check  -- 1: at least one row in second_table exists for a given row in first_table
FROM    first_table a
WHERE   EXISTS
        (
            SELECT  1
            FROM    second_table b
            WHERE   a.index = b.index
        );
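The question also asks for the opposite search, rows in first_table that have no row in second_table. A minimal sketch of both variants, run through Python's sqlite3 module as a stand-in database (an assumption; the column is named idx here because index is a reserved word):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_table  (idx INTEGER PRIMARY KEY);
    CREATE TABLE second_table (idx INTEGER);
    INSERT INTO first_table  VALUES (1), (2), (3);
    INSERT INTO second_table VALUES (1), (1), (3);
""")

# Parent rows with at least one child row.
with_children = [row[0] for row in conn.execute("""
    SELECT a.idx
    FROM   first_table a
    WHERE  EXISTS (SELECT 1 FROM second_table b WHERE b.idx = a.idx)
    ORDER  BY a.idx
""")]

# Parent rows with no child row at all.
without_children = [row[0] for row in conn.execute("""
    SELECT a.idx
    FROM   first_table a
    WHERE  NOT EXISTS (SELECT 1 FROM second_table b WHERE b.idx = a.idx)
    ORDER  BY a.idx
""")]

print(with_children, without_children)
```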

Upvotes: 0

Mikhail

Reputation: 1560

Try using EXISTS; I suppose that for such a case it might be better than joining tables. On my Oracle DB it gives slightly better execution time than the sample query, but this may be DB-specific.

SELECT first_table.ID,
       CASE WHEN EXISTS (SELECT * FROM second_table WHERE first_table.ID = second_table.ID)
            THEN 1 ELSE 0
       END
FROM first_table
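Since the question mentions several similar child tables, the same pattern can be repeated once per table. A minimal sketch, run through Python's sqlite3 module as a stand-in database; third_table and the column aliases in_second and in_third are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_table  (id INTEGER PRIMARY KEY);
    CREATE TABLE second_table (id INTEGER);
    CREATE TABLE third_table  (id INTEGER);  -- hypothetical extra child table
    INSERT INTO first_table  VALUES (1), (2), (3);
    INSERT INTO second_table VALUES (1), (1);
    INSERT INTO third_table  VALUES (1), (3);
""")

# One EXISTS indicator per child table; each parent row appears exactly once,
# no matter how many child rows it has.
rows = conn.execute("""
    SELECT f.id,
           CASE WHEN EXISTS (SELECT 1 FROM second_table s WHERE s.id = f.id)
                THEN 1 ELSE 0 END AS in_second,
           CASE WHEN EXISTS (SELECT 1 FROM third_table t WHERE t.id = f.id)
                THEN 1 ELSE 0 END AS in_third
    FROM first_table f
    ORDER BY f.id
""").fetchall()

print(rows)
```

Unlike a plain join against the child tables, this never multiplies parent rows, so no DISTINCT or GROUP BY is needed.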

Upvotes: 6

Nilesh Nikumbh

Reputation: 302

Why not try this one:

select a.index,count(b.[table id])  
from first_table a
left join second_table b
    on a.index = b.index
group by a.index
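If only a yes/no indicator is needed rather than the count itself, the count can be folded into a 0/1 flag. A minimal sketch, run through Python's sqlite3 module as a stand-in database (an assumption); idx replaces the reserved word index, and the aliases child_count and has_children are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_table  (idx INTEGER PRIMARY KEY);
    CREATE TABLE second_table (idx INTEGER);
    INSERT INTO first_table  VALUES (1), (2), (3);
    INSERT INTO second_table VALUES (1), (1), (3);
""")

# COUNT(b.idx) skips the NULLs produced by the LEFT JOIN,
# so parent rows without children get a count of 0.
rows = conn.execute("""
    SELECT a.idx,
           COUNT(b.idx) AS child_count,
           CASE WHEN COUNT(b.idx) > 0 THEN 1 ELSE 0 END AS has_children
    FROM first_table a
    LEFT JOIN second_table b ON b.idx = a.idx
    GROUP BY a.idx
    ORDER BY a.idx
""").fetchall()

print(rows)
```

Note that this still reads every matching child row in order to count it, whereas EXISTS can stop at the first match.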

Upvotes: 2
