Reputation: 2171
New to SQL and I understand joins tend to be faster than subqueries. I have the following table, and my current query gives me the results I require, but I can’t wrap my head around a similar query that uses self join instead, assuming it’s possible.
Table
id scheduled_id action_id
------------ ------------ ------------
1 1 1
2 1 2
3 1 3
4 2 1
5 2 2
6 3 1
Schema
create table ma (
id integer primary key,
scheduled_id integer,
action_id integer
);
insert into ma (
id,
scheduled_id,
action_id
)
values
(1, 1, 1),
(2, 1, 2),
(3, 1, 3),
(4, 2, 1),
(5, 2, 2),
(6, 3, 1);
Query
select * from ma where action_id = 3
union all
select * from ma where scheduled_id not in (
select scheduled_id from ma
where action_id = 3)
Result
id scheduled_id action_id
------------ ------------ ------------
3 1 3
4 2 1
5 2 2
6 3 1
My result should be all the rows that have the action_id value of 3 plus all the rows for those scheduled_ids that do not have an action_id value of 3.
The sqlfiddle may be found at http://sqlfiddle.com/#!5/0ba51/3.
Thank you.
Upvotes: 2
Views: 678
Reputation: 1026
How about this? Though not self join but faster than union
select * from ma
where action_id = 3 or scheduled_id not in (
select scheduled_id from ma
where action_id = 3
)
Upvotes: 1
Reputation: 477
The result you are looking for using a self join is:
SELECT DISTINCT t1.*
FROM ma t1
JOIN ma t2
ON t1.SCHEDULED_ID <> t2.SCHEDULED_ID --Satisfies 2nd query
WHERE t2.ACTION_ID = 3 --Satisfies 2nd query
OR t1.ACTION_ID = 3 --Satisfies 1st query
ORDER BY t1.ID
Upvotes: 1
Reputation: 447
SELECT m1.*
FROM ma m1
INNER JOIN
(
SELECT *
FROM ma m2
WHERE m2.action_id = 3
) AS matbl
WHERE m1.action_id = 3
OR matbl.scheduled_id<>m1.scheduled_id
Hope it will help.
Upvotes: 1
Reputation: 2403
This code only works if your action_id's always go 1,2,3,4,etc and never skip over 3. I just wanted to provide an alternate answer in case adding in the concept of max(action_id) could be useful to you.
select ma.id
, ma.scheduled_id
, ma.action_id
, ma_max.max_action_id
from (
select scheduled_id
, max(action_id) as max_action_id
from ma
group by scheduled_id
) ma_max
join ma
on ma_max.scheduled_id = ma.scheduled_id
where (action_id = 3 or max_action_id < 3)
It's almost certainly not going to perform as well as the other answers that are using "EXISTS". I just like how the complexity of the logic gets reduced down to essentially one easy to read line in where (action_id = 3 or max_action_id < 3)
.
Upvotes: 0
Reputation: 1269583
My result should be all the scheduled_ids that have the value of 3 plus all the scheduled_ids and action_ids for those scheduled_ids that do not have a value of 3.
This is not what your query does. The query that does that is:
select ma.*
from ma
where exists (select 1
from ma ma2
where ma2.scheduled_id = ma.scheduled_id and
ma2.action_id = 3
);
Although you could do this with a self-join, it is tricky, because the query can result in duplicates. I recommend exists
or in
for the logic.
Upvotes: 0
Reputation: 222432
I don’t think a JOIN is really what you need here. I would use the following query, which avoids UNION :
SELECT m.*
FROM ma m
WHERE
m.action_id = 3
OR NOT EXISTS (
SELECT 1
FROM ma m1
WHERE
m1.scheduled_id = m.scheduled_id
AND m1.action_id = 3
)
When it comes to checking for the existence (or absence) of something, NOT EXISTS with a correlated subquery is usually the most relevant and efficient approach.
Upvotes: 1