Reputation: 2299
I have in mind the following experiment to run in Matlab and I am asking for an help to implement step (3). Any suggestion would be very appreciated.
(1) Consider the random variables X
and Y
both uniformly distributed on [0,1]
(2) Draw N
realisation from the joint distribution of X
and Y
assuming that X
and Y
are independent (meaning that X
and Y
are uniformly jointly distributed on [0,1]x[0,1]
). Each draw will be in [0,1]x[0,1]
.
(3) Transform each draw in [0,1]x[0,1]
in a draw in [0,1]
using the Hilbert space filling curve: under the Hilbert curve mapping, the draw in [0,1]x[0,1]
should be the image of one (or more because of surjectivity) point(s) in [0,1]
. I want pick one of these points. Is there any pre-built package in Matlab doing this?
I found this answer which I don't think does what I want as it explains how to obtain the Hilbert value of the draw (curve length from the start of curve to the picked point)
On wikipedia I found this code in C language (from (x,y)
to d
) which, again, does not fulfil my question.
Upvotes: 0
Views: 321
Reputation: 486
I will focus only on your last point
(3) Transform each draw in
[0,1]x[0,1]
in a draw in[0,1]
using the Hilbert space filling curve: under the Hilbert curve mapping, the draw in[0,1]x[0,1]
should be the image of one (or more because of surjectivity) point(s) in[0,1]
. I want pick one of these points. Is there any pre-built package in Matlab doing this?
As far as I know, there aren't pre-built packages in Matlab doing this, but the good news is that the code on wikipedia can be called from MATLAB, and it is as simple as putting together the conversion routine with a gateway function in a xy2d.c
file:
#include "mex.h"
// source: https://en.wikipedia.org/wiki/Hilbert_curve
// rotate/flip a quadrant appropriately
void rot(int n, int *x, int *y, int rx, int ry) {
if (ry == 0) {
if (rx == 1) {
*x = n-1 - *x;
*y = n-1 - *y;
}
//Swap x and y
int t = *x;
*x = *y;
*y = t;
}
}
// convert (x,y) to d
int xy2d (int n, int x, int y) {
int rx, ry, s, d=0;
for (s=n/2; s>0; s/=2) {
rx = (x & s) > 0;
ry = (y & s) > 0;
d += s * s * ((3 * rx) ^ ry);
rot(s, &x, &y, rx, ry);
}
return d;
}
/* The gateway function */
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
int n; /* input scalar */
int x; /* input scalar */
int y; /* input scalar */
int *d; /* output scalar */
/* check for proper number of arguments */
if(nrhs!=3) {
mexErrMsgIdAndTxt("MyToolbox:arrayProduct:nrhs","Three inputs required.");
}
if(nlhs!=1) {
mexErrMsgIdAndTxt("MyToolbox:arrayProduct:nlhs","One output required.");
}
/* get the value of the scalar inputs */
n = mxGetScalar(prhs[0]);
x = mxGetScalar(prhs[1]);
y = mxGetScalar(prhs[2]);
/* create the output */
plhs[0] = mxCreateDoubleScalar(xy2d(n,x,y));
/* get a pointer to the output scalar */
d = mxGetPr(plhs[0]);
}
and compile it with mex('xy2d.c')
.
The above implementation
[...] assumes a square divided into n by n cells, for n a power of 2, with integer coordinates, with (0,0) in the lower left corner, (n-1,n-1) in the upper right corner.
In practice, a discretization step is required before applying the mapping. As in every discretization problem, it is crucial to choose the precision wisely. The snippet below puts everything together.
close all; clear; clc;
% number of random samples
NSAMPL = 100;
% unit square divided into n-by-n cells
% has to be a power of 2
n = 2^2;
% quantum
d = 1/n;
N = 0:d:1;
% generate random samples
x = rand(1,NSAMPL);
y = rand(1,NSAMPL);
% discretization
bX = floor(x/d);
bY = floor(y/d);
% 2d to 1d mapping
dd = zeros(1,NSAMPL);
for iid = 1:length(dd)
dd(iid) = xy2d(n, bX(iid), bY(iid));
end
figure;
hold on;
axis equal;
plot(x, y, '.');
plot(repmat([0;1], 1, length(N)), repmat(N, 2, 1), '-r');
plot(repmat(N, 2, 1), repmat([0;1], 1, length(N)), '-r');
figure;
plot(1:NSAMPL, dd);
xlabel('# of sample')
Upvotes: 0
Reputation: 12592
You could compute the hilbert curve from f(x,y)=z. Basically it's a hamiltonian path traversal. You can find a good description at Nick's spatial index hilbert curve quadtree blog. Or take a look at monotonic n-ary gray code. I've written an implementation based on Nick's blog in php:http://monstercurves.codeplex.com.
Upvotes: 1
Reputation: 2153
Edit This answers the original request for a transformation f(x,y) -> t ~ U[0,1] given x,y ~ U[0,1], and additionally for x and y correlated. The updated question asks specifically for a Hilbert curve, H(x,y) -> t ~ U[0,1] and only for x,y ~ U[0,1] so this answer is no longer relevant.
Consider a random uniform sequence in [0,1] r1, r2, r3, .... You are assigning this sequence to pairs of numbers (x1,y1), (x2,y2), .... What you are asking for is a transformation on pairs (x,y) which yield a uniform random number in [0,1].
Consider the random subsequence r1, r3, ... corresponding to x1, x2, .... If you trust that your number generator is random and uncorrelated in [0,1], then the subsequence x1, x2, ... should also be random and uncorrelated in [0,1]. So the rather simple answer to the first part of your question is a projection onto either the x or y axis. That is, just pick x.
Next consider correlations between x and y. Since you haven't specified the nature of the correlation, let's assume a simple scaling of the axes, such as x' => [0, 0.5], y' => [0, 3.0], followed by a rotation. The scaling doesn't introduce any correlation since x' and y' are still independent. You can generate it easily enough with a matrix multiply:
M1*p = [x_scale, 0; 0, y_scale] * [x; y]
for matrix M1 and point p. You can introduce a correlation by taking this stretched form and rotating it by theta:
M2*M1*p = [cos(theta), sin(theta); -sin(theta), cos(theta)]*M1*p
Putting it all together with theta = pi/4, or 45 degrees, you can see that larger values of y are correlated with larger values of x:
cos_t = sin_t = cos(pi/4); % at 45 degrees, sin(t) = cos(t) = 1/sqrt(2)
M2 = [cos_t, sin_t; -sin_t, cos_t];
M1 = [0.5, 0.0; 0.0, 3.0];
p = random(2,1000);
p_prime = M2*M1*p;
plot(p_prime(1)', p_prime(2)', '.');
axis('equal');
The resulting plot* shows a band of uniformly distributed numbers at a 45 degree angle:
Further transformations are possible with shear, and if you are clever about it, translation (OpenGL uses 4x4 transformation matrices so that translation can be represented as a linear transform matrix, with an extra dimension added before the transformation steps and removed before they are done).
Given a known affine correlation structure, you can transform back from random points (x',y') to points (x,y) where x and y are independent in [0,1] by solving Mk*...*M1 p = p_prime
for p, or equivalently, by setting p = inv(Mk*...*M1) * p_prime
, where p=[x;y]
. Again, just pick x, which will be uniform in [0,1]. This doesn't work if the transformation matrix is singular, e.g., if you introduce a projection matrix Mj into the mix (though if the projection is the first step you can still recover).
* You may notice that the plot is from python rather than matlab. I don't have matlab or octave sitting in front of me right now, so I hope I got the syntax details right.
Upvotes: 1
Reputation: 4077
EDIT This answer does not address updated version of the question, which explicitly asks about constructing Hilbert curve. Instead, this answer addresses a related question on construction of bijective mapping, and the relation to uniform distribution.
Your problem in not really well defined. If you only need the resulting distribution to be uniform, nothing is stopping you from simply picking f:(X,Y)->X
. Result would be uniform regardless of whether X
and Y
are correlated. From your post I can only presume that what you want, in fact, is for the resulting transformation to be bijective, or as close to it as possible given machine precision limitations.
Worth noting that unless you need the algorithm that is best in preserving locality (which is clearly not required for resulting distribution to be bijective, not to mention uniform), there's no need to bother constructing Hilbert curves that you mention in your question. They have just as much to do with the solution as any other space-filling curve, and are incredibly computationally intensive.
So assuming you're looking for a bijective mapping, your question is equivalent to asking whether the set of points in a [unit] square has the same cardinality as the set of points in a [unit] line segment, and if it is, how to construct that bijection, i.e. 1-to-1 correspondence. The intuition says the square should have a higher cardinality, and Cantor spent 3 years trying to prove that, eventually proving quite the opposite - that these sets are in fact equinumerous. He was so surprised at his discovery that he wrote:
I see it, but I don't believe it!
The most commonly referred to bijection, fulfilling** this criteria, is the following. Represent x
and y
in their decimal form, i.e. x=0. x1 x2 x3 x4 x5..., and y=0. y1 y2 y3 y4 y5..., and let f:(X,Y)->Z
be z=0. x1 y1 x2 y2 x3 y3 x4 y4 x5 y5..., i.e. alternating the decimals of the two numbers. The idea behind the bijection is trivial, though a rigorous proof requires quite a bit of prior knowledge.
** The caveat is that if we take e.g. x = 1/3 = 0.33333...
and y = 1/5 = 0.199999... = 0.200000...
, we can see there are two sequences corresponding to them: z = 0.313939393939...
and z = 0.323030303030...
. To overcome this obstacle we have to prove that adding a countable set to an uncountable one does not change the cardinality of the latter.
In reality we have to deal with machine precision and not pure math, which strictly speaking means both sets are actually finite and hence not equinumerous (assuming you store result with the same precision as original numbers). Which means we're simply forced to do some assumptions and loose some information, such as, in this case, the last half of significant digits of x
and y
. That is, unless we use a different data type that allows to store result with a double precision, compared to that of original variables.
Finally, sample implementation in Matlab:
x = rand();
y = rand();
chars = [num2str(x, '%.17f'); num2str(y, '%.17f')];
z = str2double(['0.' reshape(chars(:,3:end), 1, [])]);
>> cellstr(['x=' num2str(x, '%.17f'); 'y=' num2str(y, '%.17f'); 'z=' num2str(z, '%.17f')])
ans =
'x=0.65549803980353738'
'y=0.10975505072305158'
'z=0.61505947958500362'
Upvotes: 2