user10096621
user10096621

Reputation: 235

Finding all occurences of a column value that do not exist in other table in SQL

My problem is as follows:

I have two tables: the first table has two columns (person_id and account_ids columns). Account_ids column has comma separated account IDs present:

Person_ID    Account_ids
-----------------------------------------------------------------
  123        000000grVA6AM,000016ILMhAO, 000000ihi4MAQ, 000000TF5MAQQ,000000TF5MAHZ
  124        000000frVA6AM,000016ILMhAO
  125        000000frVA6BC,000024ILMhJZ, 000000frXC6A,000024YTMhA

Second table has details about the person_id

Person_ID   Account_ID      Account_Name
----------------------------------------
  123       000000grVA6AM     Name1
  123       000016ILMhAO      Name2
  123       000000ihi4MAQ     Name3
  123       000000TF5MAQQ     Name4
  123       000000TF5MAHZ     Name5
  124       000016ILMhAO      Name2
  124       000000TF5MAHZ     Name5
  124       000000frVA6AM     Name6
  124       000024ILMhAO      Name7
  124       000000frVA7XZ     Name8
  125       000000frVA6BC     Name9 
  125       000024ILMhJZ      Name10
  125       000000frXC6A      Name11
  125       000024YTMhA       Name12
  125       000024IXThJY      Name13

I want to find all account_ids for each person_id that do not occur in Table 1 for that Person_ID. For example, from the above example the answer should be:

Person_ID   Account_ID      Account_Name
-----------------------------------------

   124      000000TF5MAHZ     Name5
   124      000024ILMhAO      Name7
   124      000000frVA7XZ     Name8
   125      000024IXThJY      Name13

How do I do that?

I was trying to apply the not like clause but then I did not understand how to split comma-separated values in Table 1. Please help!

Thanks!

UPDATE:

Giving Temp Table creation query so that it's easier to test:

CREATE TEMP table persons
( person_id int, account_id varchar(20), account_name varchar(10))

insert into persons values
 (123,       '000000grVA6AM',     'Name1');
 insert into persons values
  (123,       '000016ILMhAO',      'Name2');
  insert into persons values
  (123,       '000000ihi4MAQ',     'Name3');
  insert into persons values
  (123,       '000000TF5MAQQ',     'Name4');
  insert into persons values
  (123,       '000000TF5MAHZ',     'Name5');
  insert into persons values
  (124,      '000016ILMhAO',      'Name2');
  insert into persons values
  (124,       '000000TF5MAHZ',     'Name5');
  insert into persons values
  (124,      '000000frVA6AM',     'Name6');
  insert into persons values
  (124,       '000024ILMhAO',      'Name7');
  insert into persons values
  (124,       '000000frVA7XZ',     'Name8');
  insert into persons values
  (125,       '000000frVA6BC',     'Name9');
  insert into persons values 
  (125,       '000024ILMhJZ',      'Name10');
  insert into persons values
  (125,       '000000frXC6A',      'Name11');
  insert into persons values
  (125,       '000024YTMhA',       'Name12';
  insert into persons values
  (125,       '000024IXThJY'      'Name13');

For Accounts table:

CREATE TEMP table accounts
(person_id int, account_ids varchar(100))

insert into accounts values
(123,        '000000grVA6AM,000016ILMhAO, 000000ihi4MAQ, 000000TF5MAQQ,000000TF5MAHZ');
insert into accounts values
  (124,        '000000frVA6AM,000016ILMhAO');
insert into accounts values
  (125,        '000000frVA6BC,000024ILMhJZ, 000000frXC6A,000024YTMhA');

Upvotes: 0

Views: 465

Answers (2)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522386

We can try:

SELECT p.Person_ID, p.Account_ID, p.Account_Name
FROM persons p
WHERE NOT EXISTS (SELECT 1 FROM accounts a
                  WHERE ',' || a.Account_ids || ',' LIKE '%,' || p.AccountID || ',%');

Please note that storing your accounts as CSV data in a single column is not optimal database design, because such data is not normalized. In order to write the query you wanted, we had to do string olympics to check for the presence of each account in the CSV string. Had those accounts all been on separate records, our task would have been easier.

Edit:

In case your CSV account data really has spaces in it, then compare using a replace which removes them first:

WHERE NOT EXISTS (SELECT 1 FROM accounts a
                  WHERE ',' || REPLACE(a.Account_ids, ' ', '')  || ','
                        LIKE '%,' || p.AccountID || ',%');

Upvotes: 3

AlwaysLearning
AlwaysLearning

Reputation: 8829

If you're on SQL Server 2016 or better you can use the STRING_SPLIT() function to separate the CSV values for independent tests, e.g.:

select b.*
from (
  select Person_ID, Account_ID
  from dbo.Table2
  except
  select Person_ID, trim(Value)
  from dbo.Table1
  cross apply string_split(Account_ids, N',')
) a
join dbo.Table2 b on b.Person_ID=a.Person_ID and b.Account_ID=a.Account_ID

Note the trim(Value) which discards any space characters around the Account IDs in your Account_ids column, which requires SQL Server 2017 or better, you can use ltrim(rtrim(Value)) on earlier versions.

Upvotes: 0

Related Questions