henry
henry

Reputation: 965

Python function to find the numeric volume integral?

Goal

I would like to compute the 3D volume integral of a numeric scalar field.

Code

For this post, I will use an example of which the integral can be exactly computed. I have therefore chosen the following function:

Integral function

In Python, I define the function, and a set of points in 3D, and then generate the discrete values at these points:

import numpy as np


# Make data.
def function(x, y, z):
    return x**y**z

N = 5
grid = np.meshgrid(
    np.linspace(0, 1, N),
    np.linspace(0, 1, N),
    np.linspace(0, 1, N)
)

points = np.vstack(list(map(np.ravel, grid))).T

x = points[:, 0]
y = points[:, 1]
z = points[:, 2]

values = [function(points[i, 0], points[i, 1], points[i, 2])
          for i in range(len(points))]

Question

How can I find the integral, if I don't know the underlying function, i.e. if I only have the coordinates (x, y, z) and the values?

Upvotes: 15

Views: 1887

Answers (4)

Mike Holcomb
Mike Holcomb

Reputation: 413

Based on your requirements, it sounds like the most appropriate technique would be Monte Carlo integration:

# Step 0 start with some empirical data
observed_points = np.random.uniform(0,1,size=(10000,3))

unknown_fn = lambda x: np.prod(x) # just used to generate fake values

observed_values = np.apply_along_axis(unknown_fn, 1, observed_points) 

K = 1000000

# Step 1 - assume that f(x,y,z) can be approximated by an interpolation
# of the data we have (you could get really fancy with the 
# selection of interpolation method - we'll stick with straight lines here)

from scipy.interpolate import LinearNDInterpolator
f_interpolate = LinearNDInterpolator(observed_points, observed_values)

# Step 2 randomly sample from within convex hull of observed data
# Step 2a - Uniformly sample from bounding 3D-box of data
lower_bounds = observed_points.min(axis=0)
upper_bounds = observed_points.max(axis=0)

sampled_points = np.random.uniform(lower_bounds, upper_bounds,size=(K, 3))
# Step 2b - Reject points outside of convex hull...
# Luckily, we get a np.nan from LinearNDInterpolator in this case

sampled_values = f_interpolate(sampled_points)
rejected_idxs = np.argwhere(np.isnan(sampled_values))

# Step 2c - Remember accepted values of estimated f(x_i, y_i, z_i)
final_sampled_values = np.delete(sampled_values, rejected_idxs, axis=0)

# Step 3 - Calculate estimate of volume of observed data domain
#  Since we sampled uniformly from the convex hull of data domain,
#  each point was selected with P(x,y,z)= 1 / Volume of convex hull
volume = scipy.spatial.ConvexHull(observed_points).volume

# Step 4 - Multiply estimated volume of domain by average sampled value
I_hat = volume * final_sampled_values.mean()
print(I_hat)

For a derivation of why this works see this: https://cs.dartmouth.edu/wjarosz/publications/dissertation/appendixA.pdf

Upvotes: 0

Damir Devetak
Damir Devetak

Reputation: 762

The first answer explains nicely the principal approach to handle this. Just wanted to illustrate an alternative way by showing the power of sklearn package and machine learning regression.

Doing the meshgrid in 3D gives a very large numpy array,

import numpy as np

N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1

x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)

grid = np.array(np.meshgrid(x,y,z, indexing='ij'))
grid.shape = (3, 5, 5, 5) # 2*5*5*5 = 250 numbers

Which is visually not very intuitive with 250 numbers. With different possible indexing ('ij' or 'xy'). Using regression we can get the same result with few input points (15-20).

# building random combinations from (x,y,z)

X = np.random.choice(x, 20)[:,None]
Y = np.random.choice(y, 20)[:,None]
Z = np.random.choice(z, 20)[:,None]

xyz = np.concatenate((X,Y,Z), axis = 1)
data = np.multiply.reduce(xyz, axis = 1)

So the input (grid) is just a 2D numpy array,

xyz.shape
(20, 3)

With the corresponding data,

data.shape = (20,)

Now the regression function and integration,

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from scipy import integrate

pipe=Pipeline([('polynomial',PolynomialFeatures(degree=3)),('modal',LinearRegression())])
pipe.fit(xyz, data)

def func(x,y,z):
    return pipe.predict([[x, y, z]])

ranges = [[0,1], [0,1], [0,1]]

result, error = integrate.nquad(func, ranges)
print(result)
0.1257

This approach is useful with limited number of points.

Upvotes: 2

Seon
Seon

Reputation: 3975

A nice way to go about this would be using scipy's tplquad integration. However, to use that, we need a function and not a cloud point.

An easy way around that is to use an interpolator, to get a function approximating our cloud point - we can for example use scipy's RegularGridInterpolator if the data is on a regular grid:

import numpy as np
from scipy import integrate
from scipy.interpolate import RegularGridInterpolator

# Make data.
def function(x,y,z):
    return x*y*z

N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1
x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)

values = function(*np.meshgrid(x,y,z, indexing='ij'))

# Interpolate:
function_interpolated = RegularGridInterpolator((x, y, z), values)

# tplquad integrates func(z,y,x)
f = lambda z,y,x : my_interpolating_function([z,y,x])

result, error = integrate.tplquad(f, xmin, xmax, lambda _: ymin, lambda _:ymax,lambda *_: zmin, lambda *_: zmax)

In the example above, we get result = 0.12499999999999999 - close enough!

Upvotes: 9

C Hecht
C Hecht

Reputation: 1016

The easiest way to achieve what you are looking for is probably scipy's integration function. Here your example:

from scipy import integrate

# Make data.
def func(x,y,z):
    return x**y**z

ranges = [[0,1], [0,1], [0,1]]

result, error = integrate.nquad(func, ranges)

Are you aware that the function that you created is different from the one that you show in the image. The one you created is an exponential (x^y^z) while the one that you are showing is just multiplications. If you want to represent the function in the image, use

def func(x,y,z):
    return x*y*z

Hope this answers your question, otherwise just write a comment!

Edit:

Misread your post. If you only have the results, and they are not regularly spaced, you would have to figure out some form of interpolation (i.e. linear) and a lookup-table. If you do not know how to create that, let me know. The rest of the stated answer could still be used if you define func to return interpolated values from your original data

Upvotes: 4

Related Questions