Michael Co
Michael Co

Reputation: 1

How to generate random alphanumeric unique characters with specified length

Problem is described below:

  1. Generate a unique alphanumeric characters.
  2. Length of characters should be 32.
  3. Unique numbers may be seeded in the current time to help in the uniqueness of the generated numbers.
  4. Alphabet characters must come from this pool: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

Sample Output: 445rpxlKYPkj1pg4q8nAy7Ab91zxZ8v1

I can do this using Java, but will greatly appreciate if you could help me do this on MS SQL or T-SQL.

Upvotes: 0

Views: 5562

Answers (2)

shawnt00
shawnt00

Reputation: 17915

I made this generic enough to handle any pool of characters and any output length. The core idea is to take a random sequence of bytes and use a base conversion algorithm to convert a long number into a new representation then translated to a string using your desired characters as its "digits".

For your specific scenario we need about 183 bits, or log2(52) x 32, to get to your desired length. Using newid() will generate the unique bit sequence but it will only do so 128 bits at a time and a series of values is simply concatenated until there are enough. Then having a value to operate on, the main loop is essentially the same long division we learned from elementary school. The intermediate calculations are kept in place in the varbinary array and the loop continues only until enough output characters are obtained. Each iteration determines another low order digit in the new base and this can terminate early since they won't change. The algorithm can't guarantee any global uniqueness if the output doesn't consume at least all of one newid(), so make sure log2(len(pool)) x output length is at least 128.

The target base, which is ultimately the length of the character pool, can't be more than 256. I hard-coded a limitation by setting the 128-byte maximum length of @e. For the question @e only needs to be 32 bytes long and it could be adjusted upward or downward as necessary or just defined as varbinary(max). If you need something more truly random you could find another source for the entropy bits like crypt_gen_random(). Since uniqueness appears to be the primary concern this answer fits that requirement. And by the way, repeating characters in the pool will naturally open the door for collisions.

This is fast and generic and it can be easily wrapped up in a function. And a more robust implementation would handle these extra checks.

declare @characterPool varchar(256) =
    'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
declare @outputLength int = 32;

declare @n int = 0; /* counter */
declare @numLoops int = ceiling(log(len(@characterPool)) / log(2) * @outputLength / 128)
declare @e varbinary(128) = 0x; /* entropy */

while @n < @numLoops
begin
    set @e = cast(newid() as binary(16)); set @n += 1;
end

declare @b int; /* byte */
declare @d int; /* dividend */
declare @out varchar(128) = '';

declare @outputBase int = len(@characterPool);
declare @entropyBytes int = len(@e);

declare @m int = 0;
while @m < @outputLength
begin
    set @b = 0; set @d = 0; set @n = 0;
    while @n < @entropyBytes /* big-endian */
    begin
        set @b = (@b - @d * @outputBase) * 256 + cast(substring(@e, @n + 1, 1) as int);
        set @d = @b / @outputBase;
        set @e = cast(stuff(@e, @n + 1, 1, cast(@d as binary(1))) as varbinary(128));
        set @n += 1;
    end
    set @out = substring(@characterPool, @b - @d * @outputBase + 1, 1) + @out;
    set @m += 1;
end

select @out as "UniqueString"

http://rextester.com/EYAK79470

As one simple test of the algorithm you could just assign a known value in hexadecimal format and confirm that the output (using 012345678ABCDEF as the character pool) is the same hexadecimal value. In the same way this obviously works with base64, binary and octal.

Update: The main loop can be made faster by not having to iterate over more bytes than necessary. I don't know how crypt_gen_random() compares to newid() in terms of speed or CPU usage so this change might not even be a net positive so I'll just note it as an alternative to explore. You will want to keep the bytes from newid on the little end and attach the rest to the front.

declare @e varbinary(1024) = cast(newid() as binary(16));
declare @padBytes int = ceiling(log(len(@characterPool)) / log(2) * @outputLength) - 128;
if @padBytes > 0 set @e = crypt_gen_random(@padBytes) + @e; /* big end plus little end */

Upvotes: 1

Felix Pamittan
Felix Pamittan

Reputation: 31879

First, you need to split the string into separate rows. Then, do a SELECT with ORDER BY NEWID() for the random sort. Finally, use FOR XML PATH('') to concatenate them back:

DECLARE @str VARCHAR(100) = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

;WITH E1(N) AS( -- 10 ^ 1 = 10 rows
    SELECT 1 FROM(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(N)
),
E2(N) AS(SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 ^ 2 = 100 rows
E4(N) AS(SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10 ^ 4 = 10,000 rows
CteTally(N) AS(
    SELECT TOP(LEN(@str)) ROW_NUMBER() OVER(ORDER BY(SELECT NULL))
    FROM E4
)
SELECT  (
    SELECT TOP(32)
        SUBSTRING(@str, N, 1)
    FROM CteTally t
    ORDER BY NEWID()
    FOR XML PATH('')
) AS Result

ONLINE DEMO

The above is more of a generic random string generator. You can modify it to suit your need. If the requirement will not change, you can simply use this:

DECLARE @str VARCHAR(100) = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
;WITH E1(N) AS( -- 52 Rows
    SELECT 1 FROM( VALUES
        (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
        (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
        (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
        (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
        (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),
        (1),(1)
    )t(N)
),
CteTally(N) AS(
    SELECT ROW_NUMBER() OVER(ORDER BY(SELECT NULL))
    FROM E1
)
SELECT  (
    SELECT TOP(32)
        SUBSTRING(@str, N, 1)
    FROM CteTally t
    ORDER BY NEWID()
    FOR XML PATH('')
) AS Result

Upvotes: 4

Related Questions