ktitimbo

Reputation: 83

Multiprocessing nested for loops

I'm currently trying to get a result out of the process shown below. However, it takes too long for the number of steps needed, and I would like to speed it up. How can I implement multiprocessing for this situation?

Within the class I am building, I have the following method:

    def class_corr(self,Delta,xi_q,Q,counter):
        to = self.t
        npts = 512
        x0 = 12
        dx =2*x0/npts
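        f1 = 0   # normalization accumulator (originally declared as norm)
        fn = 0   # correlation accumulator (originally declared as classic)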
        for n in range(0,npts+1):
            qi = -x0+n*dx
            for m in range(0,npts+1):
                pj = -x0+m*dx
                for k in range(0,npts+1):
                    xi_pk = -x0+k*dx
                    f1    += dx**3*wt(qi,pj,qo,po,to)*F(qi,xi_pk,Delta, Q)
                    fn += dx**3*wt(qi,pj,qo,po,to)*G(qi,xi_pk,xi_q,Delta,Q)
        if counter:
            return  [f1, fn/f1]
        return  fn/f1

Is it even reasonable to use multiprocessing?

So far, I have checked these:

  1. Multiprocessing nested python loops
  2. Python multiprocessing for dummies

but I haven't been able to apply either of them to my case or arrive at a solution.

Upvotes: 0

Views: 91

Answers (1)

user1245262

Reputation: 7505

Thinking about it, what you really have here is a dynamic programming style problem: you keep recalculating the same terms. For instance, you only need to calculate dx**3 once, yet you do it on the order of npts^3 times. Similarly, you only need to calculate each wt(qi,pj,qo,po,to) term once per (qi, pj) pair, but you do it about 2*npts times, twice in every iteration of the innermost loop.

Try something like:

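    def class_corr(self,Delta,xi_q,Q,counter):
        to = self.t
        npts = 512
        x0 = 12
        dx = 2*x0/npts
        dx3 = dx**3   # hoisted: computed once instead of in every iteration
        f1 = 0        # accumulators (declared as norm/classic in the question)
        fn = 0
        for n in range(0,npts+1):
            qi = -x0+n*dx
            for m in range(0,npts+1):
                pj = -x0+m*dx
                # wt does not depend on k, so evaluate it once per (qi, pj);
                # qo and po are assumed to be defined elsewhere, as in the question
                wt_curr = wt(qi,pj,qo,po,to)
                for k in range(0,npts+1):
                    xi_pk = -x0+k*dx
                    f1 += dx3*wt_curr*F(qi,xi_pk,Delta, Q)
                    fn += dx3*wt_curr*G(qi,xi_pk,xi_q,Delta,Q)
        if counter:
            return [f1, fn/f1]
        return fn/f1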

Additionally, you calculate F and G about npts times more often than you need to. Each appears to vary only with qi and xi_pk (xi_q, Delta and Q don't change within this method), yet both are re-evaluated for every value of pj. If you used some sort of 2-layer defaultdict to record the qi and xi_pk values for which you've already calculated F (or G), you would save a lot of unnecessary calls to F (or G).
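
As a rough illustration of the caching idea, here is a minimal sketch that stores F and G in plain nested lists indexed by the grid positions of qi and xi_pk, rather than in a defaultdict. The bodies of wt, F and G below are made-up stand-ins (the real ones are not shown in the question), and passing qo, po and to as arguments to a free function is also an assumption made only to keep the example self-contained.

```python
import math


# Hypothetical stand-ins for the question's wt, F and G (NOT the real functions).
def wt(qi, pj, qo, po, to):
    return math.exp(-((qi - qo) ** 2 + (pj - po) ** 2) / (1.0 + to))


def F(qi, xi_pk, Delta, Q):
    return math.exp(-Delta * xi_pk ** 2) * (1.0 + Q * qi ** 2)


def G(qi, xi_pk, xi_q, Delta, Q):
    return F(qi, xi_pk, Delta, Q) * math.cos(xi_q * xi_pk)


def class_corr_cached(Delta, xi_q, Q, qo, po, to, npts=512, x0=12):
    dx = 2 * x0 / npts
    dx3 = dx ** 3
    grid = [-x0 + i * dx for i in range(npts + 1)]

    # F and G depend only on (qi, xi_pk), so evaluate each pair exactly once.
    F_tab = [[F(qi, xi_pk, Delta, Q) for xi_pk in grid] for qi in grid]
    G_tab = [[G(qi, xi_pk, xi_q, Delta, Q) for xi_pk in grid] for qi in grid]

    f1 = 0.0
    fn = 0.0
    for n, qi in enumerate(grid):
        F_row, G_row = F_tab[n], G_tab[n]
        for pj in grid:
            wt_curr = wt(qi, pj, qo, po, to)  # once per (qi, pj)
            for k in range(npts + 1):
                # Table lookups replace the repeated calls to F and G.
                f1 += dx3 * wt_curr * F_row[k]
                fn += dx3 * wt_curr * G_row[k]
    return f1, fn / f1
```

The loop still runs (npts+1)^3 times, but the expensive function calls drop to roughly npts^2 each for wt, F and G. Indexing by grid position instead of by the float values of qi and xi_pk avoids relying on floats as dictionary keys.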

(PS - I know this wasn't the approach you were looking for, but I think it addresses the core of your problem: you're spending a lot of time recalculating terms you've already calculated.)
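
Taking the recalculation point one step further: in the code as posted, wt does not depend on xi_pk, and F and G do not depend on pj. Assuming all three are pure functions of the arguments shown, the triple sum separates for each qi into (sum over pj of wt) times (sum over xi_pk of F or G), which needs only about npts^2 evaluations in total. The sketch below uses made-up stand-ins for wt, F and G, and the free-function signature with qo, po and to as explicit arguments is an assumption for the sake of a self-contained example.

```python
import math


# Hypothetical stand-ins for the question's wt, F and G (NOT the real functions).
def wt(qi, pj, qo, po, to):
    return math.exp(-((qi - qo) ** 2 + (pj - po) ** 2) / (1.0 + to))


def F(qi, xi_pk, Delta, Q):
    return math.exp(-Delta * xi_pk ** 2) * (1.0 + Q * qi ** 2)


def G(qi, xi_pk, xi_q, Delta, Q):
    return F(qi, xi_pk, Delta, Q) * math.cos(xi_q * xi_pk)


def class_corr_factored(Delta, xi_q, Q, qo, po, to, npts=512, x0=12):
    dx = 2 * x0 / npts
    grid = [-x0 + i * dx for i in range(npts + 1)]

    f1 = 0.0
    fn = 0.0
    for qi in grid:
        # wt is the only factor that depends on pj, so sum it separately.
        w_sum = sum(wt(qi, pj, qo, po, to) for pj in grid)
        # F and G are the only factors that depend on xi_pk.
        F_sum = sum(F(qi, xi_pk, Delta, Q) for xi_pk in grid)
        G_sum = sum(G(qi, xi_pk, xi_q, Delta, Q) for xi_pk in grid)
        f1 += w_sum * F_sum
        fn += w_sum * G_sum

    # The constant dx**3 factor is applied once at the end.
    f1 *= dx ** 3
    fn *= dx ** 3
    return f1, fn / f1
```

Because the order of the floating-point additions changes, the result can differ from the triple loop in the last few digits. If the real wt, F or G turn out to depend on more than the arguments shown, this separation no longer holds.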

Upvotes: 2
