Reputation: 1316
I have the following table called original_table
:
name height age
personA 180 21
personB 190 37
personC 168 27
personD 182 56
...
My goal is to create two random samples of size 100 from the original_table
such that the mean of age is normally distributed around 25, and height average is close to 175. Basically, a person with age 25 and height 175 has the highest chance of being in the tables, but not guaranteed.
Upvotes: 0
Views: 49
Reputation: 565
The trick is to find the normal distribution function from other languages, and then convert it to mysql syntax. For example, convert from the following java code to mysql
//java.util.Random
synchronized public double nextGaussian() {
// See Knuth, ACP, Section 3.4.1 Algorithm C.
if (haveNextNextGaussian) {
haveNextNextGaussian = false;
return nextNextGaussian;
} else {
double v1, v2, s;
do {
v1 = 2 * nextDouble() - 1; // between -1 and 1
v2 = 2 * nextDouble() - 1; // between -1 and 1
s = v1 * v1 + v2 * v2;
} while (s >= 1 || s == 0);
double multiplier = StrictMath.sqrt(-2 * StrictMath.log(s)/s);
nextNextGaussian = v2 * multiplier;
haveNextNextGaussian = true;
return v1 * multiplier;
}
}
with recursive t0(v1, v2, s, num1, num2, rn) as (
select @v1:= rand() * 2 - 1,
@v2:= rand() * 2 - 1,
@s:= @v1*@v1 + @v2*@v2,
sqrt(-2 * log(@s) / (@s)) * @v1,
sqrt(-2 * log(@s) / (@s)) * @v2,
@rn:=case when @s < 1 and @s > 0 then 1 else 0 end
union all
select @v1:= rand() * 2 - 1,
@v2:= rand() * 2 - 1,
@s:= @v1*@v1 + @v2*@v2,
sqrt(-2 * log(@s) / (@s)) * @v1,
sqrt(-2 * log(@s) / (@s)) * @v2,
@rn:=case when @s < 1 and @s > 0 then 1 else 0 end + @rn
from t0
where @rn < 100
)
select 175 + t1.num1 * 10,
175 + t1.num2 * 10,
25 + t1.num1 * 8,
25 + t1.num2 * 8
from t0 t1
where t1.num1 is not null
order by t1.num1
;
Upvotes: 1