Reputation: 95
I've been looking for a table-valued function for SQL Server that can do the following:
Input string: A-B-C-D
(or any length string with dash-separated characters, like XX-W2-ZZZ-AAA-777-888
)
Output strings (all combinations of removed dashes):
ABCD, A-BCD, AB-CD, ABC-D, A-B-CD, AB-C-D, A-B-C-D
I've noticed the pattern follows a binary counter with respect to which dash should be removed to generate the combinations. In the example above, you could remove the dashes associated with the 0 positions of 000, 001, 010, 011, 100, 101, 110, and 111. However I don't see how to do this in SQL Server. Have any of you tackled this challenge before? Thank you!
Upvotes: 0
Views: 223
Reputation: 48826
I was trying to find something that did not involve multiple loops and would be more set-based. A CROSS JOIN
is good for combinations as that is what a Cross Join / Cartesian-product is. But I still had to resort to Dynamic SQL due to the variable nature of how many dashes there might be. "But you can't do Dynamic SQL in a function!" I keep hearing that, yet I am not entirely convinced due to the following SQLCLR TVF. It constructs a query that, for the example input of A-B-C-D
, looks like:
SELECT CONCAT(tab1.part, tab2.part, tab3.part, tab4.part) AS [Combinations]
FROM (SELECT N'A') tab1(part)
CROSS JOIN (SELECT N'B' UNION ALL SELECT N'-B') tab2(part)
CROSS JOIN (SELECT N'C' UNION ALL SELECT N'-C') tab3(part)
CROSS JOIN (SELECT N'D' UNION ALL SELECT N'-D') tab4(part)
This dynamic construction makes use of the pattern of combinations being:
FirstElement + {Cartesian Product of no-dash and preceding-dash versions of remaining elements}
The .Net / C# code:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Text;
using Microsoft.SqlServer.Server;
public class TVF
{
public static void ReturnCombo(object Combination, out SqlString Combo)
{
Combo = (string)Combination;
}
[Microsoft.SqlServer.Server.SqlFunction(FillRowMethodName = "ReturnCombo",
TableDefinition = "Combo NVARCHAR(500)", DataAccess = DataAccessKind.Read)]
public static IEnumerable GetCombinations([SqlFacet(MaxSize = -1)] SqlString DashedThingy)
{
List<string> _Combinations = new List<string>();
int _PartNum = 0;
StringBuilder _FirstPart = new StringBuilder("SELECT CONCAT(tab1.part");
StringBuilder _SecondPart = new StringBuilder(") AS [Combinations]\n");
foreach (string _Part in DashedThingy.Value.Split(new char[1] { '-' }))
{
_PartNum++;
if (_PartNum == 1)
{
_SecondPart.Append("FROM (SELECT N'").Append(_Part).AppendLine("') tab1(part)");
}
else
{
_FirstPart.Append(", tab").Append(_PartNum).Append(".part");
_SecondPart.Append("CROSS JOIN (SELECT N'").Append(_Part);
_SecondPart.Append("' UNION ALL SELECT N'-").Append(_Part);
_SecondPart.Append("') tab").Append(_PartNum).AppendLine("(part)");
}
}
SqlConnection _Connection = new SqlConnection("Context Connection = true;");
SqlCommand _Command = new SqlCommand();
_Command.Connection = _Connection;
_Command.CommandType = CommandType.Text;
_Command.CommandText = _FirstPart.ToString() + _SecondPart.ToString();
SqlDataReader _Reader = null;
try
{
_Connection.Open();
_Reader = _Command.ExecuteReader();
while (_Reader.Read())
{
_Combinations.Add(_Reader.GetString(0));
}
}
catch
{
throw;
}
finally
{
if (_Reader != null && !_Reader.IsClosed)
{
_Reader.Close();
}
if (_Connection != null && _Connection.State != ConnectionState.Closed)
{
_Connection.Close();
}
}
return _Combinations;
}
}
And the SQL to create it is:
CREATE ASSEMBLY [GetCombinations]
AUTHORIZATION [dbo]
FROM 'C:\path\to\DLL'
GO
CREATE FUNCTION [dbo].[GetCombinations]
(@DashedThingy NVARCHAR (MAX))
RETURNS TABLE ([Combo] NVARCHAR (500) NULL)
AS EXTERNAL NAME [GetCombinations].[TVF].[GetCombinations];
This is fully dynamic so it can handle any number of dashes. It finds the missing combination that is not in the example output (A-BC-D) and handles the other example easily:
SELECT * FROM dbo.GetCombinations('XX-W2-ZZZ-AAA-777-888');
Returns:
Combo
---------
XXW2ZZZAAA777888
XXW2-ZZZAAA777888
XXW2ZZZ-AAA777888
XXW2-ZZZ-AAA777888
XXW2ZZZAAA-777888
XXW2-ZZZAAA-777888
XXW2ZZZ-AAA-777888
XXW2-ZZZ-AAA-777888
XX-W2ZZZAAA777888
XX-W2-ZZZAAA777888
XX-W2ZZZ-AAA777888
XX-W2-ZZZ-AAA777888
XX-W2ZZZAAA-777888
XX-W2-ZZZAAA-777888
XX-W2ZZZ-AAA-777888
XX-W2-ZZZ-AAA-777888
XXW2ZZZAAA777-888
XXW2-ZZZAAA777-888
XXW2ZZZ-AAA777-888
XXW2-ZZZ-AAA777-888
XXW2ZZZAAA-777-888
XXW2-ZZZAAA-777-888
XXW2ZZZ-AAA-777-888
XXW2-ZZZ-AAA-777-888
XX-W2ZZZAAA777-888
XX-W2-ZZZAAA777-888
XX-W2ZZZ-AAA777-888
XX-W2-ZZZ-AAA777-888
XX-W2ZZZAAA-777-888
XX-W2-ZZZAAA-777-888
XX-W2ZZZ-AAA-777-888
XX-W2-ZZZ-AAA-777-888
Upvotes: 1
Reputation: 95
This is a method I found to work inside MSSQL and accept any count of dashes. Main table-value function:
-- =============================================
-- Description: Outputs all possible combinations of dash and no dash in a string
-- Test: Select * From dbo.[fnDashCombinations]('A')
-- Test: Select * From dbo.[fnDashCombinations]('A-B')
-- Test: Select * From dbo.[fnDashCombinations]('AB-CD-EF-11-22')
-- =============================================
ALTER FUNCTION [dbo].[fnDashCombinations]
(
@InputText VARCHAR(50)
)
RETURNS @output TABLE(ModelName VARCHAR(50))
BEGIN
--Get the count of dashes
DECLARE @DashCount INT
SET @DashCount = (select len(@InputText) - len(replace(@InputText, '-', '')))
--Count through the dashes
DECLARE @OuterIterator INT
DECLARE @InnerIterator INT
DECLARE @OuterIteratorLimit INT
DECLARE @DashesAsBinary VARCHAR(50)
DECLARE @DashLocation INT
DECLARE @TextTemp VARCHAR(50)
SET @OuterIteratorLimit = POWER(2, @DashCount) - 1
SET @OuterIterator = 0
WHILE @OuterIterator < @OuterIteratorLimit
BEGIN
--Convert the dash count into the equivalent binary string
SET @TextTemp = @InputText
SET @DashesAsBinary = dbo.fnBinaryString(@OuterIterator, @DashCount)
SET @DashLocation = 0
SET @InnerIterator = 0
--Loop thru @DashesAsBinary and remove the dash if there's a zero
WHILE @InnerIterator < @DashCount
BEGIN
SET @DashLocation = CHARINDEX('-', @TextTemp, @DashLocation + 1)
IF SUBSTRING(@DashesAsBinary, @InnerIterator + 1, 1) = '0'
BEGIN --Replace with underscore for now to keep string length constant
SET @TextTemp = dbo.fnReplaceCharAtPos(@TextTemp, @DashLocation, '_')
END
SET @InnerIterator = @InnerIterator + 1
END
INSERT INTO @output SELECT REPLACE(@TextTemp, '_', '') --Finally remove extra chars
SET @OuterIterator = @OuterIterator + 1
END
RETURN
END
Additional scalar function that is called to convert the iterator into a binary string, like 3 = '011' (found most of this code elsewhere):
ALTER FUNCTION [dbo].[fnBinaryString] (@IncomingNumber int, @MinChars int)
RETURNS varchar(200)
as
BEGIN
DECLARE @BinNumber VARCHAR(200)
SET @BinNumber = ''
WHILE @IncomingNumber <> 0
BEGIN
SET @BinNumber = SUBSTRING('0123456789', (@IncomingNumber % 2) + 1, 1) + @BinNumber
SET @IncomingNumber = @IncomingNumber / 2
END
if (LEN(@BinNumber) < @MinChars)
SET @BinNumber = REPLICATE('0', @MinChars - LEN(@BinNumber)) + @BinNumber
RETURN @BinNumber
END
Additional scalar function that is called to replace a character at a position, used in this case to replace a dash with an underscore (found this code elsewhere):
ALTER FUNCTION [dbo].[fnReplaceCharAtPos](@Str varchar(8000),@Pos int, @Chr char(1))
RETURNS varchar(8000) AS
BEGIN
declare @Res varchar(8000)
set @Res=left(@Str,@Pos-1) + @Chr + right(@Str,len(@Str)-@Pos)
return @Res
END
Upvotes: 0
Reputation: 1932
For this first section, I'm going to split the string into table / columns using XML.
DECLARE @Test TABLE
( ID INT,
NAME VARCHAR(MAX)
)
INSERT INTO @Test
VALUES( 1, 'XX-W2-ZZZ-AAA-777-888' )
DECLARE @ColSplit TABLE
(
Id int,
Col1 VARCHAR(MAX),
Col2 VARCHAR(MAX),
Col3 VARCHAR(MAX),
Col4 VARCHAR(MAX),
Col5 VARCHAR(MAX),
Col6 VARCHAR(MAX),
Col7 VARCHAR(MAX),
Col8 VARCHAR(MAX)
)
;WITH FormSplitXML
AS
(
Select Id, Name,
CONVERT(XML,'<r><n>' + REPLACE(Name, '-', '</n><n>') + '</n></r>') AS X
FROM @Test
)
INSERT INTO @ColSplit
SELECT Id,
i.value('n[1]','varchar(100)') AS Col1,
i.value('n[2]','varchar(100)') AS Col2,
i.value('n[3]','varchar(100)') AS Col3,
i.value('n[4]','varchar(100)') AS Col4,
i.value('n[5]','varchar(100)') AS Col5,
i.value('n[6]','varchar(100)') AS Col6,
i.value('n[7]','varchar(100)') AS Col7,
i.value('n[8]','varchar(100)') AS Col8
FROM FormSplitXML Spt
CROSS APPLY Spt.X.nodes('/r') x(i)
This forms a table with this output:
SELECT * FROM @ColSplit
Id Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8
1 XX W2 ZZZ AAA 777 888 NULL NULL
Next I am going to make a bit array table of all of the possible bit combinations:
DECLARE @BitTable TABLE
(
v int,
V1 BIT,
V2 BIT,
V3 BIT,
V4 BIT,
V5 BIT,
V6 BIT,
V7 BIT,
V8 BIT
)
Declare @t table (v integer not null primary key, check(v >= 0));
;WITH
a AS (SELECT 1 AS i UNION ALL SELECT 1),
b AS (SELECT 1 AS i FROM a AS x, a AS y),
c AS (SELECT 1 AS i FROM b AS x, b AS y),
d AS (SELECT 1 AS i FROM c AS x, c AS y),
e AS (SELECT 1 AS i FROM d AS x, d AS y),
f AS (SELECT 1 AS i FROM e AS x, e AS y),
numbers AS
(
SELECT TOP(255)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS number
FROM f
)
Insert into @t
SELECT number FROM numbers
INSERT INTO @BitTable
( v, V1, V2, V3, V4, V5, V6, V7, V8 )
SELECT
v,
CONVERT(BIT, v & 1) AS V1,
CONVERT(BIT, v & 2) AS V2,
CONVERT(BIT, v & 4) AS V3,
CONVERT(BIT, v & 8) AS V4,
CONVERT(BIT, v & 16) AS V5,
CONVERT(BIT, v & 32) AS V6,
CONVERT(BIT, v & 64) AS V7,
CONVERT(BIT, v & 128) AS V8
FROM
@t
Here is the output of the bittable (only the first 10 rows, it keeps going to 255)
SELECT * FROM @BitTable
v V1 V2 V3 V4 V5 V6 V7 V8
1 1 0 0 0 0 0 0 0
2 0 1 0 0 0 0 0 0
3 1 1 0 0 0 0 0 0
4 0 0 1 0 0 0 0 0
5 1 0 1 0 0 0 0 0
6 0 1 1 0 0 0 0 0
7 1 1 1 0 0 0 0 0
8 0 0 0 1 0 0 0 0
9 1 0 0 1 0 0 0 0
10 0 1 0 1 0 0 0 0
Now using the bit table and the split columns, I am going to put a string together of all of the possible combinations:
SELECT
bt.*,
t.*,
CASE WHEN bt.V1 = 1 THEN ISNULL(t.Col1,'') + '-' ELSE t.Col1 END +
CASE WHEN bt.V2 = 1 THEN ISNULL(t.Col2,'') + '-' ELSE ISNULL(t.Col2,'') END +
CASE WHEN bt.V3 = 1 THEN ISNULL(t.Col3,'') + '-' ELSE ISNULL(t.Col3,'') END +
CASE WHEN bt.V4 = 1 THEN ISNULL(t.Col4,'') + '-' ELSE ISNULL(t.Col4,'') END +
CASE WHEN bt.V5 = 1 THEN ISNULL(t.Col5,'') + '-' ELSE ISNULL(t.Col5,'') END +
CASE WHEN bt.V6 = 1 THEN ISNULL(t.Col6,'') + '-' ELSE ISNULL(t.Col6,'') END +
CASE WHEN bt.V7 = 1 THEN ISNULL(t.Col7,'') + '-' ELSE ISNULL(t.Col7,'') END +
CASE WHEN bt.V8 = 1 THEN ISNULL(t.Col8,'') + '-' ELSE ISNULL(t.Col8,'') END
FROM @BitTable bt
CROSS JOIN @ColSplit t
Here is the output (snipped to 10 rows, it goes to 255):
v V1 V2 V3 V4 V5 V6 V7 V8 Id Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 (No column name)
1 1 0 0 0 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XX-W2ZZZAAA777888
2 0 1 0 0 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XXW2-ZZZAAA777888
3 1 1 0 0 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XX-W2-ZZZAAA777888
4 0 0 1 0 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XXW2ZZZ-AAA777888
5 1 0 1 0 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XX-W2ZZZ-AAA777888
6 0 1 1 0 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XXW2-ZZZ-AAA777888
7 1 1 1 0 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XX-W2-ZZZ-AAA777888
8 0 0 0 1 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XXW2ZZZAAA-777888
9 1 0 0 1 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XX-W2ZZZAAA-777888
10 0 1 0 1 0 0 0 0 1 XX W2 ZZZ AAA 777 888 NULL NULL XXW2-ZZZAAA-777888
Here you go, hope it helps.
Upvotes: 1