Reputation: 75
I would like to generate some UNIQUE random numbers in Snowflake with a specific starting/ending point. I would like for the numbers to start at 1,000 and end at 1,000,000.
Another requirement is joining a string at the beginning of these numbers.
So far I have been using this statement:
SELECT CONCAT('TEST-' , uniform(10000, 99000, RANDOM()));
Which works as expected and gives me the output of e.g. 'TEST-31633'.
However the problem is I am generating these for a large amount of rows, and I need for them to be completely unique.
I have heard of the 'SEQ1' functions however not sure how I could specify a starting point as well as adding a 'TEST-' with the CONCAT function at the beginning. Ideally they won't be in a strict sequence but differ from each other.
Thank You
Upvotes: 0
Views: 5955
Reputation: 10134
It's not possible to guarantee uniqueness from a random number generator function unless you use some kind of calculated part. For example:
create or replace table testX as
select CONCAT('TEST-' , to_char(seq4(),'0000'),
uniform(100000, 990000, random())) c
from table(generator(rowcount => 1000));
Upvotes: 2
Reputation: 26043
Another shuffle answer:
some setup:
set range_start = 1;
set range_end = 10;
set range_len = $range_end - $range_start;
set name_width = 6;
select
'row_' ||lpad(row_number()over(order by true),$name_width,0) as row_num
from table(generator(ROWCOUNT => $range_len));
gives:
ROW_NUM |
---|
row_000001 |
row_000002 |
row_000003 |
row_000004 |
row_000005 |
row_000006 |
row_000007 |
row_000008 |
row_000009 |
so no repeats, now to shuffle them, just give them and random value and order by that:
select
'row_' ||lpad(row_number()over(order by true),$name_width,0) as row_num
,random() as rnd_order
from table(generator(ROWCOUNT => $range_len))
order by rnd_order;
ROW_NUM | RND_ORDER |
---|---|
row_000002 | -7,719,769,195,714,581,512 |
row_000007 | -7,227,137,977,152,948,936 |
row_000004 | -4,651,924,808,142,841,706 |
row_000005 | -4,571,360,566,799,746,506 |
row_000003 | -1,059,648,113,216,115,246 |
row_000008 | 1,056,363,703,661,911,623 |
row_000001 | 2,281,704,740,829,606,855 |
row_000006 | 5,648,204,845,936,012,521 |
row_000009 | 8,998,464,501,571,068,967 |
change the setup to be in the relm of what you want:
set range_start = 1000;
set range_end = 1000000;
set range_len = $range_end - $range_start;
Now if you only wanted 500K answers instead of 9,999K, slap a limit 500000 on the end..
Upvotes: 1
Reputation: 11066
Any time it's required to assign unique random numbers inside a range, that's best handled through a shuffle rather than generation of new random numbers. Simply generate the sequence, and then shuffle its order.
Depending on the use case, you could simply generate the rows and use order by random()
with a limit on the number selected:
select * from (
select 'TEST-' || (seq4() + 10000)::string as MY_TEST_ID
from table(generator(rowcount => 89000))
) order by random() limit 10000
;
A general-purpose solution to this problem is more complex but certainly possible. A JavaScript UDTF can populate an array with all values in the range and shuffle it. Using a window function on the JavaScript UDTF to ensure that the rows are distributed in a single block, it will allow creation of unique random integers in a range in any SQL statement.
First, create the table functions:
create or replace function UNIQUE_RANDOM_INTEGERS(LBOUND float, UBOUND float)
returns table (UNIQUE_RAND_INT float)
language javascript
strict volatile
as
$$
{
initialize: function (argumentInfo, context) {
this.lBound = argumentInfo.LBOUND.constValue;
this.uBound = argumentInfo.UBOUND.constValue;
this.rSpace = this.uBound - this.lBound + 1;
if (this.lBound >= this.uBound) throw new Error(">>> LBOUND and UBOUND must be constants and UBOUND must be greater than LBOUND <<<");
if (this.rSpace > 25000000) throw new Error (">>> The difference between LBOUND and UBOUND must be 25,000,000 or less.");
this.rands = new Array(this.rSpace);
this.currentRow = 0;
for (let i = 0; i < this.rands.length; i++) {
this.rands[i] = this.lBound + i;
}
this.rands = shuffle(this.rands);
function shuffle(array) {
let currentIndex = array.length, randomIndex;
while (currentIndex != 0) {
randomIndex = Math.floor(Math.random() * currentIndex);
currentIndex--;
[array[currentIndex], array[randomIndex]] = [array[randomIndex], array[currentIndex]];
}
return array;
}
},
processRow: function (row, rowWriter, context) {
//
},
finalize: function (rowWriter, context) {
for (let i = 0; i < this.rSpace; i++) {
rowWriter.writeRow({UNIQUE_RAND_INT:this.rands[i]});
}
},
}
$$;
create or replace function UNIQUE_RANDOM_INTEGERS(LBOUND int, UBOUND int)
returns table(UNIQUE_RAND_INT int)
language sql
as
$$
select UNIQUE_RAND_INT::int from table(UNIQUE_RANDOM_INTEGERS(LBOUND::float, UBOUND::float) over (partition by 1))
$$;
You can then generate unique random numbers in a range using this SQL:
select 'TEST-' || UNIQUE_RAND_INT as MY_RAND_VAL
from table(unique_random_integers(10000, 99000));
Upvotes: 1
Reputation: 1606
did you try with UNIFORM, may be this will work for you. https://docs.snowflake.com/en/sql-reference/functions/uniform.html
select CONCAT('TEST-' , uniform(10000, 99000, random())) from table(generator(rowcount => 100));
Upvotes: 0