Reputation: 528
I have two tables: users and results. A user has many results.
I need to generate 10 million records in our users table and I was able to do this using the generate_series
function in postgres.
Now, I want to generate millions of rows in the results table. But I want there to be a certain distribution of these rows. For example, I want 50% of the users to only have 1 result. I want 40% to have 2 results. And I want 10% to have 5 results.
Is there a way to generate this random data in the results table in postgres?
Upvotes: 1
Views: 364
Reputation: 1269953
Yes:
select u.user_id, gs.result
from (select u.*,
ntile(10) over (order by rand()) as decile
from users u
) u cross join lateral
generate_series(1, (case when u.decile <= 5 then 1 when u.decile <= 9 then 2 else 5 end)) gs(result);
This generates the rows. You can fill in with the data you want.
Upvotes: 1