rickyProgrammer
rickyProgrammer

Reputation: 1167

Split String Value in SQL for Flexible Filtering

I have a function where in user can assign multiple categories (food, non food etc) to a certain Tenant. See sample Data Table

Table: tblSales

    date        tenant    sales   category
    1/1/2015    tenant1   1000    Food,Non-Food,Kiosk 
    1/1/2015    tenant2   2000    Food
    1/1/2015    tenant3   1000    Non-Food,Kiosk

The system should be able to load record when the user selected any of the categories listed in Category Column.

For example, User selected categories: Non-Food,Kiosk. Expected result should be:

    date        tenant    sales   category
    1/1/2015    tenant1   1000    Food,Non-Food,Kiosk 
    1/1/2015    tenant3   1000    Non-Food,Kiosk

Since, Non-Food and Kiosk is seen in Tenants 1 and 3.

So, what I think, the process should be a string manipulation first on the value of Category column, splitting each word delimited by comma. I have code which does not work correctly

@Category nvarchar(500)  = 'Non-Food,Kiosk' --User selected

SELECT  date,tenant,sales,category
FROM  tblSales
WHERE (category  in (SELECT val FROM dbo.split (@Category, @delimeter)))

That does not seem to work because the one it is splitting is the User Selected Categories and not the value of the data itself. I tried this

 @Category nvarchar(500)  = 'Non-Food,Kiosk' --User selected

    SELECT  date,tenant,sales,category
    FROM  tblSales
    WHERE ((SELECT val FROM dbo.split (category, @delimeter)) in (SELECT val FROM dbo.split (@Category, @delimeter)))

But it resulted to this error

Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

Upvotes: 1

Views: 3803

Answers (4)

Adriano Repetti
Adriano Repetti

Reputation: 67090

In addition to Tim's answer (he is absolutely right about CSV fields in databases!) please note that SQL Server 2016 introduced STRING_SPLIT function. For a single category it's as simple as:

SELECT
  date
 ,tenant
 ,sales
 ,category
FROM tblSales
WHERE @Category IN (SELECT value FROM STRING_SPLIT(category, ','))

For a comma delimited list of categories you have to use it twice together with EXISTS:

WHERE EXISTS
(
 SELECT *  
 FROM STRING_SPLIT(category, ',')  
 WHERE value IN (SELECT value FROM STRING_SPLIT(@Category, ','))
)

If you're using an older SQL Server version you may write your own STRING_SPLIT function, take a look to T-SQL split string. You can use that function with the same syntax as above (please note I wrote code here and it's untested so you may need some fixes).

Note about performance: from QP you can check how sub-queries will be executed, from a naive point of view I'd say CTE, temp-tables and sub-queries have roughly same performance (in this simple case) but if this code is performance critical you'd better perform some benchmark (with real data and a real-world access scenario).

Upvotes: 1

Eralper
Eralper

Reputation: 6612

You can try following SQL Select statement where I used my user defined SQL function for split string task

declare @Category nvarchar(500)  = 'Non-Food,Kiosk'
declare @cnt int = (select COUNT(*) from dbo.SPLIT(@Category,','))
;with cte as (
    select 
        t.*, COUNT(*) over (partition by tenant) cnt
    from dbo.SPLIT(@Category,',') u
    inner join (
        select
            tblSales.*, c.val
        from tblSales
        cross apply dbo.SPLIT(tblSales.category,',') c
    ) t on u.val = t.val
)
select distinct tenant from cte where cnt = @cnt

enter image description here

Upvotes: 0

Unnikrishnan R
Unnikrishnan R

Reputation: 5031

Try with the below code .

  1. Create a function to split delemited strings.

    CREATE FUNCTION SplitWords
    (    
      @Input NVARCHAR(MAX),
      @Character CHAR(1)
    )
      RETURNS @Output TABLE (
      Item NVARCHAR(1000)
     )
     AS
     BEGIN
      DECLARE @StartIndex INT, @EndIndex INT
    
      SET @StartIndex = 1
      IF SUBSTRING(@Input, LEN(@Input) - 1, LEN(@Input)) <> @Character
      BEGIN
            SET @Input = @Input + @Character
      END
    
      WHILE CHARINDEX(@Character, @Input) > 0
      BEGIN
            SET @EndIndex = CHARINDEX(@Character, @Input)
    
            INSERT INTO @Output(Item)
            SELECT SUBSTRING(@Input, @StartIndex, @EndIndex - 1)
    
            SET @Input = SUBSTRING(@Input, @EndIndex + 1, LEN(@Input))
      END
    
      RETURN
     END
     GO
    
  2. create a input tabl inside your prcedure /script and keep the split data in that. here your input is @Category

    DECLARE @input TABLE (item VARCHAR(50))
    INSERT INTO @input
    SELECT Item
    FROM  [dbo].SplitWords (@Category, ',')  
    
  3. make a join using like operator with your actual table

    SELECT DISTINCT  a.date,
         a.tenant,
         a.sales,
         a.category
     FROM tblSales s
       JOIN @input a
         ON category LIKE '%'+item+'%'
    

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521409

In general, it is bad practice to store CSV data into a database column, because, as you are currently seeing, it renders many of the advantages a database has not usable.

However, I think you might be able to get away with just using LIKE. Assuming the user selected the categories Non-Food and Kiosk, you could try the following query:

SELECT date,
       tenant,
       sales,
       category
FROM tblSales
WHERE category LIKE 'Non-Food' OR
      category LIKE 'Kiosk'

Upvotes: 1

Related Questions