Mithun Manohar
Mithun Manohar

Reputation: 586

SQL random sampling into equal groups

I need to randomly sample users in a table into 4 equal groups using SQL from a table. For that I did the below:

WITH randomised_users AS (
       SELECT * 
         FROM users_table
      ORDER BY RANDOM()
) SELECT *, 
         ntile(4) OVER(ORDER BY (SELECT 1)) AS tile_nr
    FROM randomised_users 

Is this approach of sampling correct or is there a chance for bias in the 4 groups created from this?

Upvotes: 0

Views: 1290

Answers (1)

Rajat
Rajat

Reputation: 5803

What you have looks fine to me. You don't need a subquery BTW. This will do just fine

select *, ntile(4) over (order by random())

Snowflake doesn't guarantee the query will reproduce the same result set even if you provide a random seed so make sure to dump any intermediate result set into a temp table if you plan on re-using it.

Upvotes: 1

Related Questions