David Austin
David Austin

Reputation: 95

How could I create combinations of removed dashes in SQL Server string?

I've been looking for a table-valued function for SQL Server that can do the following:

Input string: A-B-C-D (or any length string with dash-separated characters, like XX-W2-ZZZ-AAA-777-888)

Output strings (all combinations of removed dashes):

ABCD, A-BCD, AB-CD, ABC-D, A-B-CD, AB-C-D, A-B-C-D

I've noticed the pattern follows a binary counter with respect to which dash should be removed to generate the combinations. In the example above, you could remove the dashes associated with the 0 positions of 000, 001, 010, 011, 100, 101, 110, and 111. However I don't see how to do this in SQL Server. Have any of you tackled this challenge before? Thank you!

Upvotes: 0

Views: 223

Answers (3)

Solomon Rutzky
Solomon Rutzky

Reputation: 48826

I was trying to find something that did not involve multiple loops and would be more set-based. A CROSS JOIN is good for combinations as that is what a Cross Join / Cartesian-product is. But I still had to resort to Dynamic SQL due to the variable nature of how many dashes there might be. "But you can't do Dynamic SQL in a function!" I keep hearing that, yet I am not entirely convinced due to the following SQLCLR TVF. It constructs a query that, for the example input of A-B-C-D, looks like:

SELECT CONCAT(tab1.part, tab2.part, tab3.part, tab4.part) AS [Combinations]
FROM (SELECT N'A') tab1(part)
CROSS JOIN (SELECT N'B' UNION ALL SELECT N'-B') tab2(part)
CROSS JOIN (SELECT N'C' UNION ALL SELECT N'-C') tab3(part)
CROSS JOIN (SELECT N'D' UNION ALL SELECT N'-D') tab4(part)

This dynamic construction makes use of the pattern of combinations being:

FirstElement + {Cartesian Product of no-dash and preceding-dash versions of remaining elements}

The .Net / C# code:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Text;
using Microsoft.SqlServer.Server;

public class TVF
{
    public static void ReturnCombo(object Combination, out SqlString Combo)
    {
        Combo = (string)Combination;
    }

    [Microsoft.SqlServer.Server.SqlFunction(FillRowMethodName = "ReturnCombo",
          TableDefinition = "Combo NVARCHAR(500)", DataAccess = DataAccessKind.Read)]
    public static IEnumerable GetCombinations([SqlFacet(MaxSize = -1)] SqlString DashedThingy)
    {
        List<string> _Combinations = new List<string>();
        int _PartNum = 0;
        StringBuilder _FirstPart = new StringBuilder("SELECT CONCAT(tab1.part");
        StringBuilder _SecondPart = new StringBuilder(") AS [Combinations]\n");

        foreach (string _Part in DashedThingy.Value.Split(new char[1] { '-' }))
        {
            _PartNum++;

            if (_PartNum == 1)
            {
                _SecondPart.Append("FROM (SELECT N'").Append(_Part).AppendLine("') tab1(part)");
            }
            else
            {
                _FirstPart.Append(", tab").Append(_PartNum).Append(".part");
                _SecondPart.Append("CROSS JOIN (SELECT N'").Append(_Part);
                _SecondPart.Append("' UNION ALL SELECT N'-").Append(_Part);
                _SecondPart.Append("') tab").Append(_PartNum).AppendLine("(part)");
            }
        }

        SqlConnection _Connection = new SqlConnection("Context Connection = true;");
        SqlCommand _Command = new SqlCommand();
        _Command.Connection = _Connection;
        _Command.CommandType = CommandType.Text;
        _Command.CommandText = _FirstPart.ToString() + _SecondPart.ToString();
        SqlDataReader _Reader = null;

        try
        {
            _Connection.Open();
            _Reader = _Command.ExecuteReader();

            while (_Reader.Read())
            {
                _Combinations.Add(_Reader.GetString(0));
            }
        }
        catch
        {
            throw;
        }
        finally
        {
            if (_Reader != null && !_Reader.IsClosed)
            {
                _Reader.Close();
            }
            if (_Connection != null && _Connection.State != ConnectionState.Closed)
            {
                _Connection.Close();
            }
        }

        return _Combinations;
    }
}

And the SQL to create it is:

CREATE ASSEMBLY [GetCombinations]
    AUTHORIZATION [dbo]
    FROM 'C:\path\to\DLL'
GO

CREATE FUNCTION [dbo].[GetCombinations]
(@DashedThingy NVARCHAR (MAX))
RETURNS TABLE ([Combo] NVARCHAR (500) NULL)
AS EXTERNAL NAME [GetCombinations].[TVF].[GetCombinations];

This is fully dynamic so it can handle any number of dashes. It finds the missing combination that is not in the example output (A-BC-D) and handles the other example easily:

SELECT * FROM dbo.GetCombinations('XX-W2-ZZZ-AAA-777-888');

Returns:

Combo
---------
XXW2ZZZAAA777888
XXW2-ZZZAAA777888
XXW2ZZZ-AAA777888
XXW2-ZZZ-AAA777888
XXW2ZZZAAA-777888
XXW2-ZZZAAA-777888
XXW2ZZZ-AAA-777888
XXW2-ZZZ-AAA-777888
XX-W2ZZZAAA777888
XX-W2-ZZZAAA777888
XX-W2ZZZ-AAA777888
XX-W2-ZZZ-AAA777888
XX-W2ZZZAAA-777888
XX-W2-ZZZAAA-777888
XX-W2ZZZ-AAA-777888
XX-W2-ZZZ-AAA-777888
XXW2ZZZAAA777-888
XXW2-ZZZAAA777-888
XXW2ZZZ-AAA777-888
XXW2-ZZZ-AAA777-888
XXW2ZZZAAA-777-888
XXW2-ZZZAAA-777-888
XXW2ZZZ-AAA-777-888
XXW2-ZZZ-AAA-777-888
XX-W2ZZZAAA777-888
XX-W2-ZZZAAA777-888
XX-W2ZZZ-AAA777-888
XX-W2-ZZZ-AAA777-888
XX-W2ZZZAAA-777-888
XX-W2-ZZZAAA-777-888
XX-W2ZZZ-AAA-777-888
XX-W2-ZZZ-AAA-777-888

Upvotes: 1

David Austin
David Austin

Reputation: 95

This is a method I found to work inside MSSQL and accept any count of dashes. Main table-value function:

-- =============================================
-- Description: Outputs all possible combinations of dash and no dash in a string
-- Test:  Select * From dbo.[fnDashCombinations]('A')
-- Test:  Select * From dbo.[fnDashCombinations]('A-B')
-- Test:  Select * From dbo.[fnDashCombinations]('AB-CD-EF-11-22')
-- =============================================
ALTER FUNCTION [dbo].[fnDashCombinations] 
(   
    @InputText VARCHAR(50)
)
RETURNS @output TABLE(ModelName VARCHAR(50)) 
BEGIN 
    --Get the count of dashes
    DECLARE @DashCount INT
    SET @DashCount = (select len(@InputText) - len(replace(@InputText, '-', '')))

    --Count through the dashes 
    DECLARE @OuterIterator INT
    DECLARE @InnerIterator INT
    DECLARE @OuterIteratorLimit INT
    DECLARE @DashesAsBinary VARCHAR(50)
    DECLARE @DashLocation INT
    DECLARE @TextTemp VARCHAR(50)

    SET @OuterIteratorLimit = POWER(2, @DashCount) - 1

    SET @OuterIterator = 0
    WHILE @OuterIterator < @OuterIteratorLimit
    BEGIN
        --Convert  the dash count into the equivalent binary string
        SET @TextTemp = @InputText
        SET @DashesAsBinary = dbo.fnBinaryString(@OuterIterator, @DashCount)
        SET @DashLocation = 0
        SET @InnerIterator = 0

        --Loop thru @DashesAsBinary and remove the dash if there's a zero 
        WHILE @InnerIterator < @DashCount
        BEGIN
            SET @DashLocation = CHARINDEX('-', @TextTemp, @DashLocation + 1)
            IF SUBSTRING(@DashesAsBinary, @InnerIterator + 1, 1) = '0'
            BEGIN   --Replace with underscore for now to keep string length constant            
                SET @TextTemp = dbo.fnReplaceCharAtPos(@TextTemp, @DashLocation, '_')
            END

            SET @InnerIterator = @InnerIterator + 1
        END

        INSERT INTO @output SELECT REPLACE(@TextTemp, '_', '') --Finally remove extra chars
        SET @OuterIterator = @OuterIterator + 1
    END
    RETURN
END

Additional scalar function that is called to convert the iterator into a binary string, like 3 = '011' (found most of this code elsewhere):

ALTER FUNCTION [dbo].[fnBinaryString] (@IncomingNumber int, @MinChars int)
RETURNS varchar(200)
as
BEGIN

    DECLARE @BinNumber  VARCHAR(200)
    SET @BinNumber = ''

    WHILE @IncomingNumber <> 0
    BEGIN
        SET @BinNumber = SUBSTRING('0123456789', (@IncomingNumber % 2) + 1, 1) + @BinNumber
        SET @IncomingNumber = @IncomingNumber / 2
    END

    if (LEN(@BinNumber) < @MinChars)
        SET @BinNumber = REPLICATE('0', @MinChars - LEN(@BinNumber)) + @BinNumber

    RETURN @BinNumber

END

Additional scalar function that is called to replace a character at a position, used in this case to replace a dash with an underscore (found this code elsewhere):

ALTER FUNCTION [dbo].[fnReplaceCharAtPos](@Str varchar(8000),@Pos int, @Chr char(1))
RETURNS varchar(8000) AS  
BEGIN 
    declare @Res varchar(8000)
    set @Res=left(@Str,@Pos-1) + @Chr  + right(@Str,len(@Str)-@Pos)
    return @Res
END

Upvotes: 0

Kevin Cook
Kevin Cook

Reputation: 1932

For this first section, I'm going to split the string into table / columns using XML.

DECLARE @Test TABLE
( ID INT,
  NAME VARCHAR(MAX)
 )

 INSERT INTO @Test 
 VALUES( 1, 'XX-W2-ZZZ-AAA-777-888' )

 DECLARE @ColSplit TABLE
 (
    Id int,
    Col1 VARCHAR(MAX),
    Col2 VARCHAR(MAX),
    Col3 VARCHAR(MAX),
    Col4 VARCHAR(MAX),
    Col5 VARCHAR(MAX),
    Col6 VARCHAR(MAX),
    Col7 VARCHAR(MAX),
    Col8 VARCHAR(MAX)
 )

;WITH FormSplitXML
AS
(
    Select Id, Name,
    CONVERT(XML,'<r><n>' + REPLACE(Name, '-', '</n><n>') + '</n></r>') AS X
    FROM @Test
)

INSERT INTO @ColSplit
SELECT  Id,  
    i.value('n[1]','varchar(100)') AS Col1,
    i.value('n[2]','varchar(100)') AS Col2,
    i.value('n[3]','varchar(100)') AS Col3,
    i.value('n[4]','varchar(100)') AS Col4,
    i.value('n[5]','varchar(100)') AS Col5,
    i.value('n[6]','varchar(100)') AS Col6,
    i.value('n[7]','varchar(100)') AS Col7,
    i.value('n[8]','varchar(100)') AS Col8
FROM FormSplitXML Spt 
CROSS APPLY Spt.X.nodes('/r') x(i) 

This forms a table with this output:

SELECT * FROM @ColSplit

Id  Col1    Col2    Col3    Col4    Col5    Col6    Col7    Col8
1   XX  W2  ZZZ AAA 777 888 NULL    NULL

Next I am going to make a bit array table of all of the possible bit combinations:

DECLARE @BitTable TABLE
(
    v int,
    V1 BIT,
    V2 BIT,
    V3 BIT,
    V4 BIT,
    V5 BIT,
    V6 BIT,
    V7 BIT,
    V8 BIT
)

Declare @t table (v integer not null primary key, check(v >= 0));

;WITH
a AS (SELECT 1 AS i UNION ALL SELECT 1),
b AS (SELECT 1 AS i FROM a AS x, a AS y),
c AS (SELECT 1 AS i FROM b AS x, b AS y),
d AS (SELECT 1 AS i FROM c AS x, c AS y),
e AS (SELECT 1 AS i FROM d AS x, d AS y),
f AS (SELECT 1 AS i FROM e AS x, e AS y),
numbers AS 
(
    SELECT TOP(255)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS number
    FROM f
)


Insert into @t 
SELECT number FROM numbers


INSERT INTO @BitTable
( v, V1, V2, V3, V4, V5, V6, V7, V8 )
SELECT
    v,
    CONVERT(BIT, v & 1) AS V1,
    CONVERT(BIT, v & 2) AS V2,
    CONVERT(BIT, v & 4) AS V3,
    CONVERT(BIT, v & 8) AS V4,
    CONVERT(BIT, v & 16) AS V5,
    CONVERT(BIT, v & 32) AS V6,
    CONVERT(BIT, v & 64) AS V7,
    CONVERT(BIT, v & 128) AS V8
FROM
@t

Here is the output of the bittable (only the first 10 rows, it keeps going to 255)

SELECT * FROM @BitTable
v   V1  V2  V3  V4  V5  V6  V7  V8
1   1   0   0   0   0   0   0   0
2   0   1   0   0   0   0   0   0
3   1   1   0   0   0   0   0   0
4   0   0   1   0   0   0   0   0
5   1   0   1   0   0   0   0   0
6   0   1   1   0   0   0   0   0
7   1   1   1   0   0   0   0   0
8   0   0   0   1   0   0   0   0
9   1   0   0   1   0   0   0   0
10  0   1   0   1   0   0   0   0

Now using the bit table and the split columns, I am going to put a string together of all of the possible combinations:

SELECT
    bt.*,
    t.*,
    CASE WHEN bt.V1 = 1 THEN ISNULL(t.Col1,'') + '-' ELSE t.Col1 END +
    CASE WHEN bt.V2 = 1 THEN ISNULL(t.Col2,'') + '-' ELSE ISNULL(t.Col2,'') END +
    CASE WHEN bt.V3 = 1 THEN ISNULL(t.Col3,'') + '-' ELSE ISNULL(t.Col3,'') END +
    CASE WHEN bt.V4 = 1 THEN ISNULL(t.Col4,'') + '-' ELSE ISNULL(t.Col4,'') END +
    CASE WHEN bt.V5 = 1 THEN ISNULL(t.Col5,'') + '-' ELSE ISNULL(t.Col5,'') END +
    CASE WHEN bt.V6 = 1 THEN ISNULL(t.Col6,'') + '-' ELSE ISNULL(t.Col6,'') END +
    CASE WHEN bt.V7 = 1 THEN ISNULL(t.Col7,'') + '-' ELSE ISNULL(t.Col7,'') END +
    CASE WHEN bt.V8 = 1 THEN ISNULL(t.Col8,'') + '-' ELSE ISNULL(t.Col8,'') END
FROM @BitTable bt
CROSS JOIN @ColSplit t

Here is the output (snipped to 10 rows, it goes to 255):

v   V1  V2  V3  V4  V5  V6  V7  V8  Id  Col1    Col2    Col3    Col4    Col5    Col6    Col7    Col8    (No column name)
1   1   0   0   0   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XX-W2ZZZAAA777888
2   0   1   0   0   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XXW2-ZZZAAA777888
3   1   1   0   0   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XX-W2-ZZZAAA777888
4   0   0   1   0   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XXW2ZZZ-AAA777888
5   1   0   1   0   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XX-W2ZZZ-AAA777888
6   0   1   1   0   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XXW2-ZZZ-AAA777888
7   1   1   1   0   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XX-W2-ZZZ-AAA777888
8   0   0   0   1   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XXW2ZZZAAA-777888
9   1   0   0   1   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XX-W2ZZZAAA-777888
10  0   1   0   1   0   0   0   0   1   XX  W2  ZZZ AAA 777 888 NULL    NULL    XXW2-ZZZAAA-777888

Here you go, hope it helps.

Upvotes: 1

Related Questions