Mustafa Alqanbar
Mustafa Alqanbar

Reputation: 47

Joining tables containing comma delimited values

I have three excel sheet I push them into tables in SQL server and I need to join these table. However, I believe - as I have tried already - normal join wouldn't work. I have programming background but not that much with SQL.

Table1

ID  Data_column reference_number
1   some data   1528,ss-456
2   some data   9523
3   some data   ss-952
4   some data   null

Table2 

ID      Data_column
ss-456  some data
ss-952  some data


Table3 

ID      Data_column
1528    some data
9523    some data

In the case below How I will be able to join this raw on both table.

Table1

ID  Data_column reference_number
1   some data   1528,ss-456

Upvotes: 0

Views: 1756

Answers (6)

Paul Maxwell
Paul Maxwell

Reputation: 35563

You will require a function to divide the comma separated sting into rows. If you don't have access to thr inbuilt string_split() function (as of mssql 2017 with compatibility of 130) there are several to choose from here

 CREATE TABLE table1(
    ID               INTEGER  NOT NULL PRIMARY KEY 
   ,Data_column      VARCHAR(10) NOT NULL
   ,reference_number VARCHAR(11)
 );


 INSERT INTO table1(ID,Data_column,reference_number) VALUES
   (1,'t1somedata','1528,ss-456')
 , (2,'t1somedata','9523')
 , (3,'t1somedata','ss-952')
 , (4,'t1somedata',NULL);


 CREATE TABLE table2(
    ID           VARCHAR(6) NOT NULL PRIMARY KEY
   ,Data_column VARCHAR(10) NOT NULL
 );


 INSERT INTO table2(ID,Data_column) VALUES
  ('ss-456','t2somedata'),
  ('ss-952','t2somedata');



 CREATE TABLE table3(
    ID           VARCHAR(6) NOT NULL PRIMARY KEY
   ,Data_column VARCHAR(10) NOT NULL
 );


 INSERT INTO table3(ID,Data_column) VALUES 
 ('1528','t3somedata'),
 ('9523','t3somedata');

I have used this splitstring function, but you can use almost any of the many freely available, such as DelimitedSplit8k by Jeff Moden:

CREATE FUNCTION dbo.SplitStrings_Moden
(
   @List NVARCHAR(MAX),
   @Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
  WITH E1(N)        AS ( SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 
                         UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 
                         UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
       E2(N)        AS (SELECT 1 FROM E1 a, E1 b),
       E4(N)        AS (SELECT 1 FROM E2 a, E2 b),
       E42(N)       AS (SELECT 1 FROM E4 a, E2 b),
       cteTally(N)  AS (SELECT 0 UNION ALL SELECT TOP (DATALENGTH(ISNULL(@List,1))) 
                         ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E42),
       cteStart(N1) AS (SELECT t.N+1 FROM cteTally t
                         WHERE (SUBSTRING(@List,t.N,1) = @Delimiter OR t.N = 0))
  SELECT Item = SUBSTRING(@List, s.N1, ISNULL(NULLIF(CHARINDEX(@Delimiter,@List,s.N1),0)-s.N1,8000))
    FROM cteStart s;

This is what the data looks like using the splitstring function:

 select *
 from table1
 cross apply SplitStrings_Moden(reference_number,',')
ID Data_column reference_number Item
1 t1somedata 1528,ss-456 1528
1 t1somedata 1528,ss-456 ss-456
2 t1somedata 9523 9523
3 t1somedata ss-952 ss-952
4 t1somedata null null

and now joining to the other tables:

select
*
from (
select *
from table1
cross apply SplitStrings_Moden(reference_number,',')
) t1
left join table2 on t1.item = table2.id
left join table3 on t1.item = table3.id
where t1.item is not null
GO
ID Data_column reference_number Item ID Data_column ID Data_column
1 t1somedata 1528,ss-456 1528 null null 1528 t3somedata
1 t1somedata 1528,ss-456 ss-456 ss-456 t2somedata null null
2 t1somedata 9523 9523 null null 9523 t3somedata
3 t1somedata ss-952 ss-952 ss-952 t2somedata null null

db<>fiddle here

Upvotes: 0

is_oz
is_oz

Reputation: 973

declare @t1 as table(
     id int
    ,data_column varchar(20)
    ,reference_number varchar(20)
)

declare @t2 as table(
     id varchar(20)
    ,data_column varchar(20)
)

declare @t3 as table(
     id varchar(20)
    ,data_column varchar(20)
)

insert into @t1 values(1,'some data','1528,ss-456'),(2,'some data','9523'),(3,'some data','ss-952'),(4,'some data',null);

insert into @t2 values('ss-456','some data'),('ss-952','some data');

insert into @t3 values(1528,'some data'),(9523,'some data');

Quick solution

select * from @t1 t1
left outer join @t2 t2 on t1.reference_number like '%'+t2.id or t1.reference_number like t2.id+'%'
left outer join @t3 t3 on t1.reference_number like '%'+t3.id or t1.reference_number like t3.id+'%'

Result (left join):

id  data_column reference_number    id      data_column id  data_column
1   some data   1528,ss-456         ss-456  some data   1528    some data
2   some data   9523                NULL    NULL        9523    some data
3   some data   ss-952              ss-952  some data   NULL    NULL
4   some data   NULL                NULL    NULL        NULL    NULL

You can change 'left outer join' to 'inner join' for exact match.

Upvotes: 2

Salman Arshad
Salman Arshad

Reputation: 272006

Clumsy design, clumsy solution:

SELECT *
FROM Table1
INNER JOIN Table2 ON ',' + Table1.reference_number + ',' LIKE '%,' + Table2.ID + ',%'
INNER JOIN Table3 ON ',' + Table1.reference_number + ',' LIKE '%,' + Table3.ID + ',%' 

You must append leading and trailing commas to make sure that, for example, 1528,ss-456asdf does not match %ss-456%.

Upvotes: 1

ARr0w
ARr0w

Reputation: 1731

You can Implement and get desired result using Substring and charIndex functions on the reference_number.

I upvoted an answer of 'is_oz' since i used his ready made schema to test and build a query for you.

below is the final query i build after several tries i made here:

select * from abc
left join abc2 on abc2.id = case when charindex(',',abc.reference_number) > 0
                             then substring(abc.reference_number
                                       ,charindex(',',abc.reference_number)+1
                                       ,len(abc.reference_number)-(charindex(',',abc.reference_number)-1)
                                      ) 
                             else abc.reference_number
                             end
left join abc3 on abc3.id = case when charindex(',',abc.reference_number) > 0
                             then substring(abc.reference_number
                                       ,0
                                       ,(charindex(',',abc.reference_number))
                                      ) 
                             else abc.reference_number
                             end

As per your requirement as much as i understand, it is returning all the matched rows from 2 other tables but still i hope this fulfills all the requirements you seek in your question. :)

Upvotes: 0

Shushil Bohara
Shushil Bohara

Reputation: 5656

TRY THIS: If your reference_number is fixed and always stored IDs upto 2 only then you can go with the below approach

SELECT *
FROM(
    SELECT ID, 
        data_column, 
        CASE WHEN PATINDEX ( '%,%', reference_number) > 0 THEN 
            SUBSTRING(reference_number, PATINDEX ( '%,%', reference_number)+1, LEN(reference_number)) 
        ELSE reference_number END AS ref_col
    FROM @table1
    UNION
    SELECT ID, 
        data_column, 
        CASE WHEN PATINDEX ( '%,%', reference_number) > 0 THEN 
            SUBSTRING(reference_number, 0, PATINDEX ( '%,%', reference_number)) 
        END
    FROM @table1) t1
LEFT JOIN @table2 t2 ON t2.id = t1.ref_col
LEFT JOIN @table3 t3 ON t3.id = t1.ref_col
WHERE t1.ref_col IS NOT NULL

OUTPUT:

ID  data_column ref_col ID      Data_column ID      Data_column
1   some data   1528    NULL    NULL        1528    some data
1   some data   ss-456  ss-456  some data   NULL    NULL
2   some data   9523    NULL    NULL        9523    some data
3   some data   ss-952  ss-952  some data   NULL    NULL
4   some data   null    NULL    NULL        NULL    NULL    

Upvotes: 0

Jacek Wr&#243;bel
Jacek Wr&#243;bel

Reputation: 1222

I see two problems here. First is the inconsistent type of ID in table 2 and 3 and aggregation of referenced keys in table 1. Here is an example how to solve both problems. To split REFERENCE_NUMBER column I used STRING_SPLIT function.

Update: I added the solution which should work with SQL Server 2012.

I assumed that you wish to join data from table 1 with 2 or 3 depending in existence of this data. This is just my idea what you wanted to achive.

-- data preparing
declare @t1 as table(
     id int
    ,data_column varchar(20)
    ,reference_number varchar(20)
)

declare @t2 as table(
     id varchar(20)
    ,data_column varchar(20)
)

declare @t3 as table(
     id int
    ,data_column varchar(20)
)

insert into @t1 values(1,'some data','1528,ss-456'),(2,'some data','9523'),(3,'some data','ss-952'),(4,'some data',null);

insert into @t2 values('ss-456','some data'),('ss-952','some data');

insert into @t3 values(1528,'some data'),(9523,'some data');

-- Solution example version >= 2016
with base as (
select t1.id,t1.data_column,f1.value from @t1 t1 outer apply string_split(t1.reference_number,',') f1)
select b.id,b.data_column,b.value,t2.data_column from base b join @t2 t2 on b.value = t2.id
union all
select b.id,b.data_column,b.value,t3.data_column from base b join @t3 t3 on try_cast(b.value as int ) = t3.id
union all
select b.id,b.data_column,b.value,null from base b where b.value is null;

-- Solution for SQL Version < 2016
with base as (
select t1.id,t1.data_column,f1.value from @t1 t1 outer apply( 
    SELECT Split.a.value('.', 'NVARCHAR(MAX)') value
FROM
(
    SELECT CAST('<X>'+REPLACE(t1.reference_number, ',', '</X><X>')+'</X>' AS XML) AS String
) AS A
CROSS APPLY String.nodes('/X') AS Split(a)
) f1)
select b.id,b.data_column,b.value,t2.data_column from base b join @t2 t2 on b.value = t2.id
union all
select b.id,b.data_column,b.value,t3.data_column from base b join @t3 t3 on try_cast(b.value as int ) = t3.id
union all
select b.id,b.data_column,b.value,null from base b where b.value is null;

Upvotes: 0

Related Questions