Reputation: 141
I have this sql query and this takes very long time to finish. How can i speed this up?
t_inter_specises_interaction has 60k rows and t_pathway has 100k. uniprot_id_1,uniprot_id_2,uniprot_id are varchar types.
In this query, i want to select uniprot_id_1 and uniprot_id_2 which are both present in t_pathway :
select distinct uniprot_id_1,uniprot_id_2 from t_intra_species_interaction
where uniprot_id_1 in (select uniprot_id from t_pathway) and
uniprot_id_2 in (select uniprot_id from t_pathway)
In this one, i want to select uniprot_id which is present in the set uniprot_ids returned by the first query above.
select distinct uniprot_id,id from t_pathway as t
where uniprot_id in
(
select distinct uniprot_id_2 from t_intra_species_interaction
where uniprot_id_1 in (select uniprot_id from t_pathway) and
uniprot_id_2 in (select uniprot_id from t_pathway)
union
select distinct uniprot_id_1 from t_intra_species_interaction
where uniprot_id_1 in (select uniprot_id from t_pathway) and
uniprot_id_2 in (select uniprot_id from t_pathway)
)
Thanks.
Upvotes: 1
Views: 361
Reputation: 3684
The subqueries are identical, therefore they can be merged into one, and moved to a join
SELECT DISTINCT i.uniprot_id_1, i.uniprot_id_2
FROM t_intra_species_interaction i
INNER JOIN t_pathway p ON p.uniprot_id IN (i.uniprot_id_1, i.uniprot_id_2)
Second query
It would have be better to open a new question referring to this, but looking at my previous query it should be easy to see that to get your second answer, you only need to get the column from t_pathway instead of t_intra_species_interaction
SELECT DISTINCT p.uniprot_id, p.id
FROM t_intra_species_interaction i
INNER JOIN t_pathway p ON p.uniprot_id IN (i.uniprot_id_1, i.uniprot_id_2)
Upvotes: 2
Reputation: 10618
How about this:
select distinct uniprot_id_1, uniprot_id_2
from t_intra_species_interaction
where exists (select uniprot_id from t_pathway
where uniprot_id_1 = uniprot_id) and
exists (select uniprot_id from t_pathway
where uniprot_id_2 = uniprot_id)
Upvotes: 1
Reputation: 33381
Try this:
select distinct uniprot_id_1,uniprot_id_2
from t_intra_species_interaction I
join t_pathway P1
on I.uniprot_id_1 = P1.uniprot_id
join t_pathway P2
on I.uniprot_id_2 = P2.uniprot_id
or
select distinct uniprot_id_1,uniprot_id_2
from t_intra_species_interaction I
where exists (select 1 from t_pathway where uniprot_id = I.uniprot_id_1)
and exists (select 1 from t_pathway where uniprot_id = I.uniprot_id_2)
Upvotes: 2
Reputation: 19843
There are a general guildline:
Create three indexes, one on t_pathway.uniport_id
and one on t_intra_species_interaction.uniport_id1
and another on t_intra_species_interaction.uniport_id2
By this way all the data that you need is in your indexes, and it should be fast
Also converting your in clauses to Left join as Tomas mentioned in his answer
Upvotes: 1
Reputation: 2857
You may want to use INNER JOIN:
select distinct uniprot_id_1,uniprot_id_2 from t_intra_species_interaction i
inner join t_pathway p1
on p1.uniprod_id = t.uniprot_id_1
inner join t_pathway p2
on p2.uniprod_id = t_uniprot_id_2
Upvotes: 2