Reputation: 533
Does anyone know how to interpolate a 3D data set with Python? I would like to interpolate in the x, y, and z dimensions to obtain the correct value of the 4th column. Thanks a lot!
The data looks like the following:
x y z value
75 1E+00 3.7594E-10 1.0199E-08
75 3E+00 1.1278E-09 3.0379E-08
75 1E+01 3.7593E-09 1.0077E-07
75 3E+01 1.1278E-08 3.0152E-07
75 1E+02 3.7593E-08 1.0032E-06
75 3E+02 1.1278E-07 3.0063E-06
100 1E+00 2.8216E-10 2.0714E-08
100 3E+00 8.4641E-10 6.1573E-08
100 1E+01 2.8214E-09 2.0468E-07
100 3E+01 8.4604E-09 5.4807E-07
100 1E+02 2.8197E-08 1.6292E-06
100 3E+02 8.4587E-08 4.4588E-06
Upvotes: 0
Views: 608
Reputation: 3204
OK, what you need is regression (see: Wolfram, Wiki) or approximation (see: Wiki). The general idea of both is exactly what you need: finding a function that matches, as closely as possible, a function for which you only have samples.
There are several methods; you can google them now that you know the relevant terms.
Here are a couple of simple examples. Remember that the choice of approximation method is important and problem-dependent; there is no single method that is always correct.
Method 1
You've got a point P and want to find the value of a function f at it.
If you already know the value of f at this point, just return it.
Otherwise, find the 2^d known points closest to P, where d is the number of dimensions (the number of function arguments). For example, for 2 dimensions ((x, y) points) you'd find the 4 points closest to P.
You calculate the distance between each of those points and P, which gives you 2^d distances (one per point), and take the inverse of each distance as that point's weight.
You calculate f(P) = (f(point0)*w0 + f(point1)*w1 + ... + f(point(2^d - 1))*w(2^d - 1)) / (w0 + w1 + ... + w(2^d - 1)), where wi = 1/distance(pointi, P). As the result you get a weighted average of the function values around this point, with closer points contributing more.
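For illustration, here is a minimal Python sketch of this scheme, under the assumption that the samples sit in a NumPy array with columns x, y, z, value; the function name idw_interpolate and the query point are made up for the example:

```python
import numpy as np

def idw_interpolate(points, values, p, k=None):
    """Weighted average of the k known points closest to p,
    with weights 1/distance (inverse distance weighting)."""
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    p = np.asarray(p, dtype=float)

    # Default: 2**d neighbours, where d is the number of dimensions.
    if k is None:
        k = 2 ** points.shape[1]

    # Distance from p to every known point.
    dist = np.linalg.norm(points - p, axis=1)

    # If p coincides with a known point, return its value directly.
    if np.any(dist == 0):
        return values[np.argmin(dist)]

    # Take the k closest points, weighted by inverse distance
    # so that closer points count for more.
    nearest = np.argsort(dist)[:k]
    w = 1.0 / dist[nearest]
    return np.sum(w * values[nearest]) / np.sum(w)

# Example with a few rows from the question (columns: x, y, z, value).
data = np.array([
    [75,  1e0, 3.7594e-10, 1.0199e-08],
    [75,  3e0, 1.1278e-09, 3.0379e-08],
    [100, 1e0, 2.8216e-10, 2.0714e-08],
    [100, 3e0, 8.4641e-10, 6.1573e-08],
])
print(idw_interpolate(data[:, :3], data[:, 3], [80.0, 2.0, 5e-10]))
```

One caveat: with the question's data the columns have wildly different scales (x is around 100 while z is around 1e-9), so the raw Euclidean distance is dominated by x; normalizing each column before computing distances is advisable.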
Method 2
You minimize an error function over those points for some assumed equation. For example, you may assume that g(x, y, z) = ax + by + cz + d can describe your 4th column. You have to figure out what this function might look like and choose its form yourself (you can use an exponential function, a logarithm, polynomials, etc.). Then you define the error function e(a, b, c, d) as the sum of the squared differences between the real values (taken from your data) and the values of g for those a, b, c, d (it is the differences that are squared, not a, b, c, d). Squaring is optional, but it usually works better. Now all you have to do is minimize e, which means "find the values of a, b, c, d for which e(a, b, c, d) is as small as possible".
How do you do it? If your function is simple, you can differentiate it, find the zeros of its derivatives, evaluate e at those zeros, and choose the smallest. For a linear model like g above this is just ordinary least squares, which numpy solves directly; see the sketch below.
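A minimal sketch of that least-squares fit, assuming the data sits in a NumPy array with columns x, y, z, value (only a few rows from the question are copied in for brevity):

```python
import numpy as np

# Columns: x, y, z, value (a few rows from the question's data).
data = np.array([
    [75,  1e0, 3.7594e-10, 1.0199e-08],
    [75,  3e0, 1.1278e-09, 3.0379e-08],
    [100, 1e0, 2.8216e-10, 2.0714e-08],
    [100, 3e0, 8.4641e-10, 6.1573e-08],
])

xyz = data[:, :3]
f = data[:, 3]

# Design matrix for g(x, y, z) = a*x + b*y + c*z + d.
A = np.column_stack([xyz, np.ones(len(xyz))])

# lstsq minimizes sum((A @ coeffs - f)**2), i.e. the error function e.
coeffs, residuals, rank, _ = np.linalg.lstsq(A, f, rcond=None)
a, b, c, d = coeffs

# Evaluate the fitted model at a new (hypothetical) point.
print(a * 80 + b * 2.0 + c * 5e-10 + d)
```

For nonlinear model functions (exponentials, power laws, ...) scipy.optimize.curve_fit does the same kind of squared-error minimization numerically.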
The problem appears when your function is quite complicated. There is a family of methods called (meta)heuristics which are very useful for such tasks. You can read about heuristics such as evolutionary/genetic algorithms (those two are close, but not the same), particle swarm optimization, tabu search and simulated annealing. I won't describe them here; it is a topic for at least one computer science master's course.
What about libraries?
Ummm... I'm not really sure if there is something like that, but if there is, my guess is you'll find it in numpy or scipy. If not, it is quite doable to implement by hand, though you have to be cautious and test it really well (bugs in such tasks are terribly hard to find).
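For the record, scipy does have something for the interpolation side: scipy.interpolate.griddata (built on LinearNDInterpolator) interpolates scattered N-dimensional data. A minimal sketch on the question's data, treating the 4th column as the value to interpolate (the query point here is just the centroid of the samples):

```python
import numpy as np
from scipy.interpolate import griddata

# Columns: x, y, z, value -- the data set from the question.
data = np.array([
    [75,  1e0, 3.7594e-10, 1.0199e-08],
    [75,  3e0, 1.1278e-09, 3.0379e-08],
    [75,  1e1, 3.7593e-09, 1.0077e-07],
    [75,  3e1, 1.1278e-08, 3.0152e-07],
    [75,  1e2, 3.7593e-08, 1.0032e-06],
    [75,  3e2, 1.1278e-07, 3.0063e-06],
    [100, 1e0, 2.8216e-10, 2.0714e-08],
    [100, 3e0, 8.4641e-10, 6.1573e-08],
    [100, 1e1, 2.8214e-09, 2.0468e-07],
    [100, 3e1, 8.4604e-09, 5.4807e-07],
    [100, 1e2, 2.8197e-08, 1.6292e-06],
    [100, 3e2, 8.4587e-08, 4.4588e-06],
])

points = data[:, :3]   # (x, y, z) coordinates
values = data[:, 3]    # 4th column to interpolate

# Query point(s) as an (m, 3) array; here the centroid of the samples,
# which is guaranteed to lie inside the convex hull (linear interpolation
# returns NaN for points outside it).
query = points.mean(axis=0).reshape(1, 3)

# rescale=True normalizes the axes, which matters here because x is of
# order 1e2 while z is of order 1e-9.
print(griddata(points, values, query, method='linear', rescale=True))
```

Keep in mind that method='linear' only interpolates inside the convex hull of the samples; outside it you get NaN (or whatever fill_value you pass), and for extrapolation you'd fall back on something like Method 2.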
Upvotes: 1