pythonnumpygeometryrandomuniform-distribution

Reputation: 381

Generate a random sample of points distributed on the surface of a unit sphere

I am trying to generate random points on the surface of the sphere using numpy. I have reviewed the post that explains uniform distribution here. However, need ideas on how to generate the points only on the surface of the sphere. I have coordinates (x, y, z) and the radius of each of these spheres.

I am not very well-versed with Mathematics at this level and trying to make sense of the Monte Carlo simulation.

Any help will be much appreciated.

Thanks, Parin

Upvotes: 37

Answers (10)

mikm

Reputation: 160

I would like to classify the solutions here first.

The answer by Soonts are incorrect in that more points are concentrated in the direction of the cube's corners because they took points taken at random from [-1,1]^d and just divided by their length. This will be explained in detail in the next section.
orion elenzil's answer is incorrect in that the points are concentrated around the north and south poles, because θ and φ are taken from a uniform distribution.
user22248229's answer uses rejection sampling which is often considered inefficient.
The answers by tmlen and funny_little_frog use inverse transform sampling for polar coordinates.
The answers by ali_m and thehappycheese use gaussian distribution without polar coordinates.

First, the following code, which utilizes the same method as thehappycheese, is available.

from collections.abc import Sequence

import numpy as np
from numpy.typing import NDArray


def random_hypersphere(
    shape: Sequence[int], dim: int, *, surface: bool = False, rng: np.random.Generator | None = None
) -> NDArray[np.float64]:
    r"""
    Generate random points on / in a unit hypersphere.

    Parameters
    ----------
    shape : Sequence[int]
        The shape of the output array.
    dim : int
        The dimension of the hypersphere.
    surface : bool, optional
        Whether to generate points on the surface of the hypersphere,
        by default False.

    Returns
    -------
    NDArray[np.float64]
        The generated points.

    References
    ----------
        Barthe, F., Guedon, O., Mendelson, S., & Naor, A. (2005).
        A probabilistic approach to the geometry of the \ell_p^n-ball.
        arXiv, math/0503650. Retrieved from https://arxiv.org/abs/math/0503650v1

    """
    rng = np.random.default_rng() if rng is None else rng
    g = rng.normal(loc=0, scale=np.sqrt(1 / 2), size=(dim, *shape))
    if surface:
        result = g / np.linalg.vector_norm(g, axis=0)
    else:
        z = rng.exponential(scale=1, size=shape)[None, ...]
        result = g / np.sqrt(np.sum(g**2, axis=0, keepdims=True) + z)
    return np.nan_to_num(result, nan=0.0)  # not sure if nan_to_num is necessary

Testing

Next, we show that the distribution of the chosen points, when using the incorrect implementation, varies with the direction. Where (1,0,0,...) corresponds to the “direction of the face of the cube” and (1,1,1,...) corresponds to the “direction of the corner of the cube"

def random_hypersphere_wrong(shape: Sequence[int], dim: int, *, rng: np.random.Generator | None = None) -> NDArray[np.float64]:
    rng = np.random.default_rng() if rng is None else rng
    g = np.random.uniform(low=-1, high=1, size=(dim, *shape))
    return g / np.linalg.vector_norm(g, axis=0)


# kde plot for the inner product of the points with (1,0,0,...) and (1,1,1,...)/sqrt(dim)
for dim in [2, 3, 4, 100]:
    random_surface_points = random_hypersphere((100000,), dim, surface=True)
    random_surface_points_wrong = random_hypersphere_wrong((100000,), dim)
    # pick 10 random points
    choice = np.random.choice(1000, 50)
    direction = np.stack([np.ones(dim) / np.sqrt(dim), [1] + [0] * (dim - 1)], axis=1)
    random_surface_points_ip = np.tensordot(
        random_surface_points, direction, axes=(0, 0)
    ).T
    random_hypersphere_wrong_ip = np.tensordot(
        random_surface_points_wrong, direction, axes=(0, 0)
    ).T

    import matplotlib.pyplot as plt
    import seaborn as sns
    from pandas import DataFrame

    fig = plt.figure()
    sns.set()
    df = DataFrame(
        {
            "(1,0,0,...) correct": random_surface_points_ip[0],
            "(1,1,1,...)/sqrt(dim) correct": random_surface_points_ip[1],
            "(1,0,0,...) wrong": random_hypersphere_wrong_ip[0],
            "(1,1,1,...)/sqrt(dim) wrong": random_hypersphere_wrong_ip[1],
        }
    )
    sns.kdeplot(data=df, fill=True, legend=True)
    plt.title(
        f"kde plot for inner product of (1,0,0,...) or (1,1,1,...)/sqrt(dim) \n"
        f" and random points in {dim}D (hyper)sphere"
    )
    plt.savefig(f"comparison_{dim}d.png")

It can be seen that there is no difference between directions when the correct implementation is used, but when the incorrect implementation is used, the distribution is skewed to both ends for the "direction of the corner of the cube".

Upvotes: 0

user22248229

Reputation: 49

In spherical coordinate system, generate random φ uniformly in (0,2π), also generate random θ uniformly in (0, π) for a histography sinθ function. The generated points will be uniformly distributed on the sphere surface.

import numpy as np
import random
import struct

def GenerateRandom():
  data =random.randbytes(2)
  t1=struct.unpack('>1h', data)
  t2=t1[0]
  Randomdata=(t2 + 32768.0) / 65536.0
  

def GetTheta():
    y = -1
    y2=2
    while y2 >= y:
        x = GenerateRandom() * np.pi
        y2 = GenerateRandom()
        y = np.sin(x)
    return x

from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import axes3d
r=50
pointsNumber=1000
phi = list()
theta=list()
for i in range(pointsNumber):
    phit=GenerateRandom() * 2 * np.pi
    phi.append(phit)
    thetat=GetTheta()
    theta.append(thetat)
x =r*np.sin(theta)*np.cos(phi)
y = r*np.sin(theta)*np.sin(phi)
z = r*np.cos(theta)
fig, ax = plt.subplots(1, 1, subplot_kw={'projection':'3d', 'aspect':'equal'})
#ax.plot_wireframe(x, y, z, color='k', rstride=1, cstride=1)
ax.scatter(x, y, z, s=1, c='r')
plt.show()

Upvotes: 0

Artashes

Reputation: 128

This is the FASTEST and mathematically the most BEAUTIFUL way of generating points on a shpere in ANY DIMENSION, just choose dim

dim = 3
radius = 1

x = np.random.normal(0,1,(100,dim)) 

z = np.linalg.norm(x, axis=1) 
z = z.reshape(-1,1).repeat(x.shape[1], axis=1) 

Points = x/z * radius * np.sqrt(dim)

Upvotes: 0

funny_little_frog

Reputation: 21

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

def isotropic_unit_vectors():

    # Note: we must use arccos in the definition of theta to prevent bunching of points toward the poles

    phi = np.random.uniform(0, 2*np.pi)
    theta = np.arccos(1-2*np.random.uniform(0, 1))

    x = np.sin(theta) * np.cos(phi)
    y = np.sin(theta) * np.sin(phi)
    z = np.cos(theta)

    return [x, y, z]

# Now I shall call this function 1500 times in a while loop to plot these points

empty_array = np.empty((1500,3))

i=0
while i<1500:
    empty_array[i] = isotropic_unit_vectors()
    i+=1

x_array = empty_array[:, 0]
y_array = empty_array[:, 1]
z_array = empty_array[:, 2]

fig = plt.figure()
ax = Axes3D(fig)
ax.scatter(x_array, y_array, z_array, s=7)
plt.show()

image of these randomly distributed points on the surface of a sphere.

Upvotes: 2

thehappycheese

Reputation: 353

Google brought me to this ancient question, but I found a better method elsewhere (See here)

So for future travellers; The easiest method appears to be to normalise three gauss/normal distributions as follows:

coords = np.random.normal(size=[3, 1000])
distance_from_origin = np.linalg.norm(coords, ord=2, axis=0)
coords /= distance_from_origin.reshape(1,-1)
coords

Here we use numpy to initialise an array of 1000 coordinates [[x * 1000],[y * 1000],[z * 1000]], each sampled from the default gaussian distribution which is centred on zero. We then use the norm() function with ord=2 (which is just sqrt(x*x+y*y+z*z)) to compute and divide each coordinates by its distance from the origin, producing a unit sphere.

Note: the norm may be zero for some rows! Numpy will not create an error message, some values will just end up being nan. Add the following check to prevent issues downstream;

if (distance_from_origin==0).any():
    raise ValueError("Zero magnitude coordinates. Try again")

The results look very uniform to me; try

import plotly.express as plt
x,y,z = coords
fig = plt.scatter_3d(x=x, y=y, z=z)
fig.update_traces(marker={'size': 2})

Upvotes: 2

Andras Deak -- Слава Україні

Reputation: 35175

Following some discussion with @Soonts I got curious about the performance of the three approaches used in the answers: one with generating random angles, one using normally distributed coordinates, and one rejecting uniformly distributed points.

Here's my attempted comparison:

import numpy as np

def sample_trig(npoints):
    theta = 2*np.pi*np.random.rand(npoints)
    phi = np.arccos(2*np.random.rand(npoints)-1)
    x = np.cos(theta) * np.sin(phi)
    y = np.sin(theta) * np.sin(phi)
    z = np.cos(phi)
    return np.array([x,y,z])

def sample_normals(npoints):
    vec = np.random.randn(3, npoints)
    vec /= np.linalg.norm(vec, axis=0)
    return vec

def sample_reject(npoints):
    vec = np.zeros((3,npoints))
    abc = 2*np.random.rand(3,npoints)-1
    norms = np.linalg.norm(abc,axis=0) 
    mymask = norms<=1
    abc = abc[:,mymask]/norms[mymask]
    k = abc.shape[1]
    vec[:,0:k] = abc
    while k<npoints:
       abc = 2*np.random.rand(3)-1
       norm = np.linalg.norm(abc)
       if 1e-5 <= norm <= 1:  
           vec[:,k] = abc/norm
           k = k+1
    return vec

Then for 1000 points

In [449]: timeit sample_trig(1000)
1000 loops, best of 3: 236 µs per loop

In [450]: timeit sample_normals(1000)
10000 loops, best of 3: 172 µs per loop

In [451]: timeit sample_reject(1000)
100 loops, best of 3: 13.7 ms per loop

Note that in the rejection-based implementation I first generated npoints samples and threw away the bad ones, and I only used a loop to generate the rest of the points. It seemed to be the case that the direct step-by-step rejection takes a longer amount of time. I also removed the check for division-by-zero to have a cleaner comparison with the sample_normals case.

Removing vectorization from the two direct methods puts them into the same ballpark:

def sample_trig_loop(npoints):
    x = np.zeros(npoints)
    y = np.zeros(npoints)
    z = np.zeros(npoints)
    for k in range(npoints):
        theta = 2*np.pi*np.random.rand()
        phi = np.arccos(2*np.random.rand()-1)
        x[k] = np.cos(theta) * np.sin(phi)
        y[k] = np.sin(theta) * np.sin(phi)
        z[k] = np.cos(phi)
    return np.array([x,y,z])

def sample_normals_loop(npoints):
    vec = np.zeros((3,npoints))
    for k in range(npoints):
      tvec = np.random.randn(3)
      vec[:,k] = tvec/np.linalg.norm(tvec)
    return vec

In [464]: timeit sample_trig(1000)
1000 loops, best of 3: 236 µs per loop

In [465]: timeit sample_normals(1000)
10000 loops, best of 3: 173 µs per loop

In [466]: timeit sample_reject(1000)
100 loops, best of 3: 14 ms per loop

In [467]: timeit sample_trig_loop(1000)
100 loops, best of 3: 7.92 ms per loop

In [468]: timeit sample_normals_loop(1000)
100 loops, best of 3: 10.9 ms per loop

Upvotes: 3

ali_m

Reputation: 74262

Based on the last approach on this page, you can simply generate a vector consisting of independent samples from three standard normal distributions, then normalize the vector such that its magnitude is 1:

import numpy as np

def sample_spherical(npoints, ndim=3):
    vec = np.random.randn(ndim, npoints)
    vec /= np.linalg.norm(vec, axis=0)
    return vec

For example:

from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import axes3d

phi = np.linspace(0, np.pi, 20)
theta = np.linspace(0, 2 * np.pi, 40)
x = np.outer(np.sin(theta), np.cos(phi))
y = np.outer(np.sin(theta), np.sin(phi))
z = np.outer(np.cos(theta), np.ones_like(phi))

xi, yi, zi = sample_spherical(100)

fig, ax = plt.subplots(1, 1, subplot_kw={'projection':'3d', 'aspect':'equal'})
ax.plot_wireframe(x, y, z, color='k', rstride=1, cstride=1)
ax.scatter(xi, yi, zi, s=100, c='r', zorder=10)

The same method also generalizes to picking uniformly distributed points on the unit circle (ndim=2) or on the surfaces of higher-dimensional unit hyperspheres.

Upvotes: 57

orion elenzil

Reputation: 5473

(edited to reflect corrections from comments)

i investigated a few constant time approaches to this problem in 2004.

assuming you're working in spherical coordinates where theta is the angle around the vertical axis (eg longitude) and phi is the angle raised up from the equator (eg latitude), then to obtain a uniform distribution of random points on the hemisphere north of the equator you do this:

choose theta = rand(0, 360).
choose phi = 90 * (1 - sqrt(rand(0, 1))).

to get points on a sphere instead of a hemisphere, then simply negate phi 50% of the time.

for the curious, a similar approach holds for generating uniformly-distributed points on a unit-disk:

choose theta = rand(0, 360).
choose radius = sqrt(rand(0, 1)).

i do not have proofs for the correctness of these approaches, but i've used them with lots of success over the past decade or so, and am convinced of their correctness.

some illustration (from 2004) of the various approaches is here, including a visualization of the approach of choosing points on the surface of a cube and normalizing them onto the sphere.

Upvotes: 1

Soonts

Reputation: 21956

Another way that depending on the hardware could be much faster.

Choose a, b, c to be three random numbers each between -1 and 1

Calculate r2 = a^2 + b^2 + c^2

If r2 > 1.0 (=the point isn't in the sphere) or r2 < 0.00001 (=the point is too close to the center, we'll have division by zero while projecting to the surface of the sphere) you discard the values, and pick another set of random a, b, c

Otherwise, you’ve got your random point (relative to center of the sphere):

ir = R / sqrt(r2)
x = a * ir
y = b * ir
z = c * ir

Upvotes: 3

tmlen

Reputation: 9090

Points on the surface of a sphere can be expressed using two spherical coordinates, theta and phi, with 0 < theta < 2pi and 0 < phi < pi.

Conversion formula into cartesian x, y, z coordinates:

x = r * cos(theta) * sin(phi)
y = r * sin(theta) * sin(phi)
z = r * cos(phi)

where r is the radius of the sphere.

So the program could randomly sample theta and phi in their ranges, at uniform distribution, and generate the cartesian coordinates from it.

But then the points get distributed more densley on the poles of the sphere. In order for points to get uniformly distributed on the sphere surface, phi needs to be chosen as phi = acos(a) where -1 < a < 1 is chosen on an uniform distribution.

For the Numpy code it would be the same as in Sampling uniformly distributed random points inside a spherical volume , except that the variable radius has a fixed value.

Upvotes: 10

Generate a random sample of points distributed on the surface of a unit sphere

Answers (10)

Testing

Related Questions