Reputation: 780
Supposing I have a table, mytable, as follows:
sampleID rs A1 A2
--------------------------------
1001 rs123 A C
1001 rs124 T C
1001 rs125 A T
1001 rs126 A C
1002 rs122 A C
1002 rs123 T C
1002 rs124 T C
1002 rs125 A C
And I would like to compare any 2 sampleID's where they have a common rs value to see if their A1 and/or A2 values match.
For example, take
SELECT sampleID as Sample1, rs as rs1, A1 as A1_1, A2 as A2_1 FROM mytable where sampleID = "1001"
SELECT sampleID as Sample2, rs as rs2, A1 as A1_2, A2 as A2_2 FROM mytable where sampleID = "1002"
How can I write a SELECT statement that takes the result of each SELECT above, joins at rs1 = rs2 and compares A1_1 with A1_2 and A2_1 with A2_2?
Upvotes: 1
Views: 53
Reputation: 780
For completeness I would like to include a slightly modified version of an answer that was posted before. In this version only the non matching A1/A2 columns are shown as per the WHERE clause
SELECT
A.rs1 as rs, A.Sample1_A1, A.Sample1_A2, B.Sample2_A1, B.Sample2_A2
from
(
SELECT sampleID as Sample1, rs as rs1, A1 as Sample1_A1, A2 as Sample1_A2
FROM mytable where sampleID = "1001"
)A left join
(
SELECT sampleID as Sample2, rs as rs2, A1 as Sample2_A1, A2 as Sample2_A2
FROM mytable where sampleID = "1002"
)B on A.rs1=B.rs2
where A.Sample1_A1 != B.Sample2_A1 or A.Sample1_A2 != B.Sample2_A2
Upvotes: 0
Reputation: 521399
I would use as self join here, to handle the comparisons:
SELECT
t1.sampleID,
t2.sampleID,
t1.rs,
t1.A1,
t2.A1,
(t1.A1 = t2.A1) AS A1_comp,
t1.A2,
t2.A2,
(t1.A2 = t2.A2) AS A2_comp
FROM mytable t1
INNER JOIN mytable t2
ON t1.sampleID < t2.sampleID
WHERE
t1.rs = t2.rs
ORDER BY
t1.sampleID,
t2.sampleID,
t1.rs;
The join condition requires that the sampleID
on the left side of the join is strictly less than the one of the right side. This ensures that we do not duplicate comparisons, or compare the same sample against itself. We take advantage of selecting boolean equalities for the A1
and A2
values, which MySQL's syntax allows. The aliases A1_comp
and A1_comp
will be 0 for a non match, and 1 for a match.
Upvotes: 1