gabboshow

Reputation: 5579

Code optimization python

I wrote the function below to estimate orientation from a 3-axis accelerometer signal (X, Y, Z):

X.shape
Out[4]: (180000L,)
Y.shape
Out[4]: (180000L,)
Z.shape
Out[4]: (180000L,)

def estimate_orientation(self,X,Y,Z):

    sigIn=np.array([X,Y,Z]).T
    N=len(sigIn)
    sigOut=np.empty(shape=(N,3))
    sigOut[sigOut==0]=None
    i=0
    while i<N:
        sigOut[i,:] = np.arccos(sigIn[i,:]/np.linalg.norm(sigIn[i,:]))*180/math.pi
        i=i+1

    return sigOut

Executing this function on a signal of 180000 samples takes quite a while (~2.2 seconds). I know it is not written in a "pythonic" way. Could you help me optimize the execution time?

Thanks!

Upvotes: 3

Views: 172

Answers (1)

Divakar

Reputation: 221714

Starting approach

One approach, making use of broadcasting, would be like so -

np.arccos(sigIn/np.linalg.norm(sigIn,axis=1,keepdims=1))*180/np.pi
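To see why keepdims matters here, below is a minimal sketch on a tiny made-up array (the name `sig` and its values are only for illustration, not from the question). Keeping the reduced axis gives the norms shape (N, 1), which broadcasts across the three columns of the (N, 3) input, and the result should match the row-by-row loop:

```python
import numpy as np

# Hypothetical sample data: 4 accelerometer readings (x, y, z).
sig = np.array([[1.0, 0.0, 0.0],
                [0.0, 2.0, 0.0],
                [1.0, 1.0, 0.0],
                [1.0, 2.0, 2.0]])

# keepdims=True keeps the reduced axis, giving shape (4, 1) instead of (4,),
# so the division broadcasts row by row across the three columns.
norms = np.linalg.norm(sig, axis=1, keepdims=True)
angles = np.degrees(np.arccos(sig / norms))

# Reference: the same computation done one row at a time, as in the question.
loop = np.empty_like(sig)
for i in range(len(sig)):
    loop[i] = np.degrees(np.arccos(sig[i] / np.linalg.norm(sig[i])))

print(norms.shape)
print(np.allclose(angles, loop))
```

Without keepdims the norms would have shape (4,), and dividing a (4, 3) array by it would not line up row by row.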

Further optimization - I

We could use np.einsum to replace the np.linalg.norm part. Thus:

np.linalg.norm(sigIn,axis=1,keepdims=1)

could be replaced by:

np.sqrt(np.einsum('ij,ij->i',sigIn,sigIn))[:,None]
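As a quick sanity check of that replacement, here is a small sketch (the random test array `sig` is an assumption for illustration). The subscripts 'ij,ij->i' multiply the two arrays elementwise and sum over j, which is the squared length of each row, so its square root should agree with np.linalg.norm:

```python
import numpy as np

rng = np.random.default_rng(0)
sig = rng.random((1000, 3))  # made-up data standing in for sigIn

# 'ij,ij->i': elementwise product summed over the column index j,
# i.e. the squared Euclidean length of each row.
sq = np.einsum('ij,ij->i', sig, sig)

# [:, None] adds back the trailing axis so the result has shape (1000, 1),
# matching what keepdims=True gives for np.linalg.norm.
norms_einsum = np.sqrt(sq)[:, None]
norms_linalg = np.linalg.norm(sig, axis=1, keepdims=True)

print(np.allclose(norms_einsum, norms_linalg))
```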

Further optimization - II

A further boost could be brought in with the numexpr module, which works really well with huge arrays and with operations involving trigonometric functions. In our case that would be arccos. So, we will use the einsum part from the previous optimization section and then use arccos from numexpr on it.

Thus, the implementation would look something like this -

import numexpr as ne

pi_val = np.pi
s = np.sqrt(np.einsum('ij,ij->i',sigIn,sigIn))[:,None]
out = ne.evaluate('arccos(sigIn/s)*180/pi_val')

Runtime test

Approaches -

import math
import numpy as np
import numexpr as ne

def original_app(sigIn):
    N=len(sigIn)
    sigOut=np.empty(shape=(N,3))
    sigOut[sigOut==0]=None
    i=0
    while i<N:
        sigOut[i,:] = np.arccos(sigIn[i,:]/np.linalg.norm(sigIn[i,:]))*180/math.pi
        i=i+1
    return sigOut

def broadcasting_app(signIn):
    s = np.linalg.norm(signIn,axis=1,keepdims=1)
    return np.arccos(signIn/s)*180/np.pi

def einsum_app(signIn):
    s = np.sqrt(np.einsum('ij,ij->i',signIn,signIn))[:,None]
    return np.arccos(signIn/s)*180/np.pi

def numexpr_app(signIn):
    pi_val = np.pi
    s = np.sqrt(np.einsum('ij,ij->i',signIn,signIn))[:,None]
    return ne.evaluate('arccos(signIn/s)*180/pi_val')

Timings -

In [115]: a = np.random.rand(180000,3)

In [116]: %timeit original_app(a)
     ...: %timeit broadcasting_app(a)
     ...: %timeit einsum_app(a)
     ...: %timeit numexpr_app(a)
     ...: 
1 loops, best of 3: 1.38 s per loop
100 loops, best of 3: 15.4 ms per loop
100 loops, best of 3: 13.3 ms per loop
100 loops, best of 3: 4.85 ms per loop

In [117]: 1380/4.85 # Speedup number
Out[117]: 284.5360824742268

280x speedup there!
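The %timeit lines above are IPython magics. Outside IPython, a rough equivalent can be sketched with the standard timeit module. This is only an illustration: the repeat counts are arbitrary choices, the numbers will differ by machine, and the numexpr version is left out so the snippet needs only NumPy:

```python
import timeit
import numpy as np

def broadcasting_app(sig):
    s = np.linalg.norm(sig, axis=1, keepdims=True)
    return np.arccos(sig / s) * 180 / np.pi

def einsum_app(sig):
    s = np.sqrt(np.einsum('ij,ij->i', sig, sig))[:, None]
    return np.arccos(sig / s) * 180 / np.pi

a = np.random.rand(180000, 3)

# Check that the two versions agree before comparing speed.
same = np.allclose(broadcasting_app(a), einsum_app(a))

# Best of 3 repeats of 10 calls each, stored per call in milliseconds.
timings = {}
for f in (broadcasting_app, einsum_app):
    best = min(timeit.repeat(lambda: f(a), number=10, repeat=3))
    timings[f.__name__] = best / 10 * 1e3

for name, ms in timings.items():
    print(f"{name}: {ms:.2f} ms per call")
```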

Upvotes: 6
