francois rousseau

Reputation: 121

How to speed up python curve_fit over a 2D array?

I have to use the scipy curve_fit function over a large set of data (5,000,000 fits). So basically I've created a 2D array: the first dimension is the number of fits to perform, the second dimension is the number of points used for each fit.

t = np.array([0, 1, 2, 3, 4])

for d in np.ndindex(data.shape[0]):
  try:
    popt, pcov = curve_fit(func, t, np.squeeze(data[d,:]), p0=[1000,100])
  except RuntimeError:
    print("Error - curve_fit failed")

multiprocessing can be used to speed up the full process, but it is still quite slow. Is there a way to use curve_fit in a "vectorized" manner?

Upvotes: 8

Views: 7101

Answers (4)

Jw L

Reputation: 11

It seems that user Bhanuday Sharma is correct: you cannot greatly improve the performance of curve_fit itself or significantly reduce the computation time unless you write low-level code.

However, by using the multiprocessing module, in particular multiprocessing.Process(), the overall run can be sped up considerably, cutting the computation time by about 76% (tested on a dataset with 1,000,000+ rows to fit).

Since you do not provide the original dataset with 5,000,000 rows, I have uploaded a test file with 1,314,300 rows that can be used for the performance comparison (test.npy, available at https://www.mediafire.com/file/fa5nq6nycvawa4w/test.npy/file).

Besides, your predefined func is quite simple, which is rarely the case in practice. Suppose instead we have a quadratic function to fit, of the form f(x, y) = ax^2 + by^2 + cxy + dx + ey + f, and we want to fit it to 1,000,000+ rows of data.


Code part:

import numpy as np
import multiprocess as mp  # fork of the standard multiprocessing module that uses dill for pickling
from scipy.optimize import curve_fit
from timeit import default_timer as timer

oneArr = np.load('test.npy')
rowCnt = oneArr.shape[0]
xres, yres = 2, 2
xy = np.array([[-xres, yres], [0, yres], [xres, yres], [-xres, 0], [0, 0], [xres, 0], [-xres, -yres], [0, -yres], [xres, -yres]]).T
funcQuadratic = lambda xy, a, b, c, d, e, f: a * xy[0, :] ** 2 + b * xy[1, :] ** 2 + c * xy[0, :] * xy[1, :] + d * xy[0, :] + e * xy[1, :] + f

## Sequential for loop method
tic = timer()
result_scipy_for_loop = np.zeros((rowCnt, 6))
for row in range(rowCnt):
    result_scipy_for_loop[row, :] = curve_fit(f = funcQuadratic, xdata = xy, ydata = oneArr[row, :], p0 = (1, 1, 1, 1, 1, 1), method = 'lm')[0]
tac = timer()
print("result_scipy_for_loop is:", result_scipy_for_loop, "and its time usage is:", tac - tic, "seconds")

## Multiprocessing Process() function
def curve_fit_process(queue, rowRng):
    result = [curve_fit(f = funcQuadratic, xdata = xy, ydata = oneArr[idx, :], p0 = (1, 1, 1, 1, 1, 1), method = 'lm')[0] for idx in rowRng]
    queue.put(result)

q = mp.Queue()
tic = timer()
num_processes = 30
processes = []

# Split the row indices into num_processes roughly equal chunks.
# np.array_split also handles the case where rowCnt is not divisible by
# num_processes, without indexing past the end of the array.
chunks = np.array_split(np.arange(rowCnt), num_processes)

# Note: on platforms that spawn new interpreters (Windows/macOS), this block
# should be run under an `if __name__ == '__main__':` guard.
for chunk in chunks:
    process = mp.Process(target = curve_fit_process, args = (q, chunk, ))
    processes.append(process)
    process.start()
    print("process: ", process.name, '->', process.pid, "starts...")

# Drain the queue before joining; results arrive in whichever order the
# processes finish, so keep the chunk indices if you need the original row order.
ret = [q.get() for process in processes]
print("ret is:", ret)

for process in processes:
    process.join()

tac = timer()
print("mp.Process() function time usage is:", tac - tic, "seconds.")

In my case, it takes only 119.21885330599616 seconds with multiprocessing.Process(), compared to 491.4169644650028 seconds with the plain for loop.

Hope this is useful for someone in the future.

Upvotes: 1

Bhanuday Sharma

Reputation: 433

In my experience, you should supply the Jacobian to curve_fit whenever possible. It saves time by avoiding repeated calls to func to approximate the Jacobian by finite differences, and it can give you a significant speed boost, especially if you are dealing with a large number of fit parameters.
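
A minimal sketch of what that can look like, using a made-up two-parameter model purely for illustration (curve_fit accepts an analytic Jacobian through its jac argument, which should return an array of shape (len(xdata), n_params)):

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical two-parameter model, just for illustration.
def func(t, a, b):
    return a * t + b * np.exp(t)

# Analytic Jacobian of the model with respect to (a, b):
# one row per data point, one column per parameter.
def jac(t, a, b):
    return np.column_stack((t, np.exp(t)))

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = func(t, 1200.0, 80.0) + np.random.normal(scale=5.0, size=t.size)  # synthetic data

popt, pcov = curve_fit(func, t, y, p0=[1000, 100], jac=jac)

With the Jacobian supplied, the solver skips the extra finite-difference evaluations of func it would otherwise perform at every iteration.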

Upvotes: 3

Ed Smith

Reputation: 13206

curve_fit extends the functionality of scipy.optimize.leastsq, which is itself a wrapper for the underlying MINPACK lmdif and lmder Fortran routines. It looks like multi-threading is not possible; check out this link, which says:

The underlying Fortran 77 routines (MINPACK lmder.f and lmdif.f) are not reentrant, so the GIL cannot be released. (Thus no chance of parallel processing with threads.)

There is still an open ticket to develop this, but it does not look like it will be finished... You would either need to use a different library or write a wrapper/function in lower-level code. There are papers on implementations of parallel Levenberg-Marquardt algorithms.

Maybe there is another solution: use less data, or, as a rough estimate, randomly split your data into parts, curve fit each of the parts in a separate process (using multiprocessing), and take an average of the coefficients at the end.
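
A rough sketch of that last idea, with a made-up model, synthetic data, and an arbitrary number of parts (all of these are illustrative, not taken from the question):

import numpy as np
from scipy.optimize import curve_fit
from multiprocessing import Pool

# Illustrative model; substitute your own func and measurements.
def func(t, a, b):
    return a * t + b * np.exp(t)

def fit_chunk(chunk):
    # Fit one randomly drawn subset of the data and return its coefficients.
    t, y = chunk
    popt, _ = curve_fit(func, t, y, p0=[1000, 100])
    return popt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0, 4, 100_000)
    y = func(t, 1200.0, 80.0) + rng.normal(scale=5.0, size=t.size)

    # Randomly shuffle the points and split them into equal parts.
    n_parts = 8
    chunks = [(t[idx], y[idx]) for idx in np.array_split(rng.permutation(t.size), n_parts)]

    # Fit each part in its own process and average the coefficients.
    with Pool(n_parts) as pool:
        coeffs = pool.map(fit_chunk, chunks)
    print(np.mean(coeffs, axis=0))

Each part is fitted independently, so the averaged coefficients are only a rough estimate; this trades some accuracy for wall-clock time.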

Upvotes: 5

Daniel

Reputation: 12026

One way to speed it up is by adding in some prior knowledge to curve_fit.

If you know the range you expect your parameters to be in, and if you don't need precision down to the 100th significant figure, you can speed up your computations massively.

Here is an example, in which you'd be fitting for param1 and param2:

t = np.array([0, 1, 2, 3, 4])
def func(t, param1, param2):
  return param1*t + param2*np.exp(t)

for d in np.ndindex(data.shape[0]):
  try:
    popt, pcov = curve_fit(func, t, np.squeeze(data[d,:]), p0=[1000,100], 
                           bounds=([min_param1, min_param2],[max_param1, max_param2]),
                           ftol=0.5, xtol=0.5)
  except RuntimeError:
    print("Error - curve_fit failed")

Note the extra keyword arguments bounds, ftol and xtol. You can read about them here.

Upvotes: 7
