Reputation: 3
I have two tables (They are bigger with more of 200,000 rows, this is just an example)
Table1:temptable
id |Chromosome | Start | End | Reference | Alternative | QUAL | MQ
........................................................................
NULL chr1 12334 12335 A TT 10 20
NULL chr1 12334 12335 C TT 5 3
NULL chr2 123 123 A T 1 2
NULL chr2 34 34 A T 11 60
NULL chr3 12 12 A T 6 NULL
Table2:TableVariants2
id | Chromosome | Start | End | Reference | Alternative
.............................................................
1 chr1 12334 12335 A TT
2 chr1 12334 12335 C TT
3 chr2 123 123 A T
4 chr2 34 34 A T
5 chr3 12 12 A T
I want to merge them based on equal Chromosome, Start, End, Reference and Alternative and obtain:
id |Chromosome | Start | End | Reference | Alternative | QUAL | MQ
........................................................................
1 chr1 12334 12335 A TT 10 20
2 chr1 12334 12335 C TT 5 3
3 chr2 123 123 A T 1 2
4 chr2 34 34 A T 11 60
5 chr3 12 12 A T 6 NULL
I have tried the following but it is too slow for big tables and I need something faster
SELECT B.Maxid, A.Chromosome, A.Start, A.End, A.Reference, A.Alternative , A.QUAL,A.MQ
FROM temptable A
INNER JOIN ( SELECT `Chromosome`, `Start`, `End`, `Reference`,`Alternative`, MAX(id) AS Maxid FROM TableVariants2 GROUP BY `Chromosome`, `Start`, `End`, `Reference`,`Alternative`) B
ON A.Chromosome=B.Chromosome AND A.Start=B.Start AND A.End=B.End AND A.Reference=B.Reference AND A.Alternative=B.Alternative
There is any alternative to the INNER JOIN in order to speed up the query?
Upvotes: 0
Views: 1053
Reputation: 142298
INDEX(Chromosome, Start, End, Reference, Alternative, id)
-- in the same order as the GROUP BY
.
Upvotes: 1
Reputation: 393
You can follow a few things while writing a query:
1. Analyze your data.
2. Make sure you have created necessary indexes.
3. In Select statement write only required column, avoid unnecessary columns.
4. Do not write subquery or co-related query until and unless is required.
Let's see if you follow these steps could help. But Inner Join is the basic join which cannot be replaced if you have to join two tables.
Upvotes: 2