Reputation: 517
I have a dataset that has a structure similar to this (greatly simplified for expository purposes):
CREATE TABLE FOO (
CUI CHAR(8),
SAB VARCHAR(40),
CODE VARCHAR(50),
KEY X_CUI (CUI)
);
There is no primary key: there are multiple rows with the same CUI value:
C0000039 MSH D015060
C0000039 NDFRT N0000007747
C0000039 LNC LP15542-1
C0074393 RCD da5..
C0074393 RXNORM 36437
C0074393 SNOMEDCT_US 96211006
I want to find all unique CUI values that do not have a SNOMEDCT_US SAB. In the above example C0000039 would be in the result set, but C0074393 would not.
Unfortunately this exceeds my rudimentary SQL skills: could someone suggest an approach in SQL?
The full dataset contains 11,633,065 rows with 2,973,458 unique CUIs.
I am using MySQL 5.6.19.
Upvotes: 0
Views: 105
Reputation: 366
Although the solution with IN subquery works, from performance perspective it may not be the best option. Esp. if there is a lot of data to scan.
I believe that in MySQL 5.6 it is the same as in previous versions. The optimiser translates this IN subquery to a correlated subquery. And the drawback of correlated subquery - it is executed for every row. This may results in slowdown.
Depending on your needs and situation (no of rows, type of data, no of reads etc), this could be the better alternative in terms of query execution time:
SELECT CUI
FROM FOO
GROUP BY CUI
HAVING SUM(1) = SUM(IF(('SNOMEDCT_US') IN (SAB), 0, 1))
Also good to know more solutions to the same problem to have a choice :-)
Upvotes: 0
Reputation: 25753
Try this way:
select distinct cui
from foo
where CUI not in (select CUI
from foo
where SAB = 'SNOMEDCT_US')
Upvotes: 3
Reputation: 1605
Try this query :
SELECT DISTINCT CUI
FROM FOO
WHERE SAB != 'SNOMEDCT_US'
AND CUI NOT IN (SELECT DISTINCT CUI FROM FOO WHERE SAB = 'SNOMEDCT_US')
Upvotes: 3
Reputation: 21757
Try this:
select distinct cui
from tbl
where cui not in
(
select cui from tbl
where sab = 'SNOMEDCT_US'
)
The subquery returns all values of cui
where sab
= 'SNOMEDCT_US'. Since these values should not be part of the final answer, we filter them out using NOT IN
.
Upvotes: 3
Reputation: 172448
Try this:
select distinct cui
from foo where CUI not in (select distinct CUI from foo
where SAB = 'SNOMEDCT_US')
Upvotes: 2