hopieman

Reputation: 389

Speed up Python Loop append

Hello, I have a huge list of values, and I want to find every pattern of n consecutive values, like list[0:30], list[1:31], and so on. Within each pattern, I compare every value to the first one as a percentage change, like percentage_change(array[0],array[1]), percentage_change(array[0],array[2]), all the way to the end of the pattern. After this, I want to store all the 30-value patterns in an array of patterns, to compare against other values in the future.
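To make the comparison concrete, here is a minimal sketch of what such a percentage_change helper might look like. The question names the function but does not show it, so the signature and the sample values below are assumptions; the zero handling mirrors the placeholders used in the loop further down.

```python
def percentage_change(start, current):
    """Percent change from start to current, relative to |start| (assumed signature)."""
    try:
        change = (float(current) - start) / abs(start) * 100.0
    except ZeroDivisionError:
        # Same placeholder the question uses when the base value is zero.
        return 0.00000001
    # The question also replaces an exact 0.0 with a tiny non-zero value.
    return change if change != 0.0 else 0.000000001


# Hypothetical sample values, only for illustration.
values = [100.0, 110.0, 95.0, 100.0]
pattern = [percentage_change(values[0], v) for v in values[1:]]
print(pattern)
```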

To do so, I had to build a function. In this function, the pattern length of 30 values can be changed to anything I choose through the variable numberOfEntries. For each pattern, I take the next 10 outcomes (to average them) and store them in an array of outcomes at the same index.

#end point is the end of array
#inputs (array, numberOfEntries)
#outPut(list of Patterns, list of outcomes)

y=0
condition= numberOfEntries+1
#each pattern list
pattern=[]
#list of patterns
Patterns=[] 
#outcomes array
outcomes=[]



while (y<len(array)):
    i=1
    while(i<condition):

        #this is the percentage change function, inlined to gain speed. try is used because division by 0 is possible
        try:
            x = ((float(array[y-(numberOfEntries-i)])-array[y-numberOfEntries])/abs(array[y-numberOfEntries]))*100.00
            if x == 0.0:
                x=0.000000001
        except:
            x= 0.00000001
        i+=1
        pattern.append(x)
    #here are the outcomes
    outcomeRange = array[y+5:y+15]
    outcomes.append(outcomeRange)
    Patterns.append(pattern)
    #clean pattern array
    pattern=[]
    y+=1

Doing this on an 8559-value array, which is small compared to the amount of data I have, took 229.6792.

Is there a way to adapt this to multithreading, or another way to improve the speed?

EDIT:

To explain better, I have this ohlc data:

                     open      high       low     close      volume
TimeStamp                                                            
2016-08-20 15:50:00  0.003008  0.003008  0.002995  0.003000    6.351215
2016-08-20 15:55:00  0.003000  0.003008  0.003000  0.003008    6.692174
2016-08-20 16:00:00  0.003008  0.003009  0.002996  0.003001   10.813029
2016-08-20 16:05:00  0.003001  0.003000  0.002991  0.002991    4.368509
2016-08-20 16:10:00  0.002991  0.002993  0.002989  0.002990    6.662944
2016-08-20 16:15:00  0.002990  0.003015  0.002989  0.003015    8.495640

I extract this as

array=df['close'].values

Then I pass this array to the function, and it returns a list of lists. For this particular set of values, one of them looks like this:

[0.26, 0.03, -0.03, -0.04, 0.005]

These are the percent changes from each row to the beginning of the sample, and this is what I call a pattern. I can choose how many entries a pattern has.
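As a rough illustration of the definition above, this sketch computes one pattern from the six close values in the table, using the first close as the base. The variable names are assumptions, and the printed figures need not match the list quoted above, which appears to be approximate.

```python
# Close prices copied from the OHLC sample above.
closes = [0.003000, 0.003008, 0.003001, 0.002991, 0.002990, 0.003015]

base = closes[0]
# Percent change of every later close relative to the first one.
pattern = [(c - base) / abs(base) * 100.0 for c in closes[1:]]

for c, p in zip(closes[1:], pattern):
    print(f"{c:.6f} -> {p:+.4f}%")
```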

Hope I'm more clear now...

Upvotes: 1

Views: 1499

Answers (1)

Jean-François Fabre

Reputation: 140246

First, I would turn the inner while loop into a for loop: range() then takes care of incrementing i, which is typically faster than a manual i += 1.

for i in range(1,condition):
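A small sketch to check that the two loop forms visit the same values of i, with an optional timing comparison. The function names are hypothetical, and the timings depend on the machine and Python version, so no particular speedup is implied.

```python
import timeit


def indices_while(condition):
    """Collect i with a manual counter, as in the question."""
    out = []
    i = 1
    while i < condition:
        out.append(i)
        i += 1
    return out


def indices_for(condition):
    """Collect i with range(), as suggested here."""
    out = []
    for i in range(1, condition):
        out.append(i)
    return out


condition = 31  # numberOfEntries + 1 for 30-value patterns
same = indices_while(condition) == indices_for(condition)
print("same indices:", same)
print("while:", timeit.timeit(lambda: indices_while(condition), number=200))
print("for:  ", timeit.timeit(lambda: indices_for(condition), number=200))
```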

Now, since y doesn't change within your inner loop, you can optimize your computation from:

x = ((float(array[y-(numberOfEntries-i)])-array[y-numberOfEntries])/abs(array[y-numberOfEntries]))*100.00

to:

x = (float(array[y-(numberOfEntries-i)])-array[y-numberOfEntries]) * z

where z is precomputed once per value of y, just before the inner loop, as:

    z = 100.00 / abs(array[y-numberOfEntries])

Why?

  • first, z is precomputed, so abs and the array lookup for the base value are not repeated on every iteration
  • second, z already contains the inverse of the divisor, so you can multiply by it. Multiplication is generally cheaper than division.
  • third, a division by zero can no longer happen inside the loop, since you are no longer dividing there. The ZeroDivisionError can only occur when computing z, outside the loop, and has to be handled accordingly (wrap the whole z + loop block in try/except and set the results to 0.00000001 when it occurs; it should be equivalent)
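The points above can be checked with a small sketch that computes one pattern both ways and compares the results. The sample values and the position y are assumptions chosen for illustration.

```python
import math

# Hypothetical sample data and window position.
array = [0.003000, 0.003008, 0.003001, 0.002991, 0.002990, 0.003015]
numberOfEntries = 5
y = 5  # the window's base value sits at index y - numberOfEntries

base = array[y - numberOfEntries]

# Original form: abs() and a division for every element.
divided = [
    ((float(array[y - (numberOfEntries - i)]) - base) / abs(base)) * 100.00
    for i in range(1, numberOfEntries + 1)
]

# Hoisted form: z is computed once, then only a multiplication per element.
z = 100.00 / abs(base)
multiplied = [
    (float(array[y - (numberOfEntries - i)]) - base) * z
    for i in range(1, numberOfEntries + 1)
]

# The two forms agree up to floating-point rounding.
agree = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(divided, multiplied))
print(agree)
```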

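So your inner loop could be:

try:
    # float() gives a plain Python float, so a zero base raises ZeroDivisionError
    z = 100.00 / abs(float(array[y-numberOfEntries]))
    for i in range(1,condition):
        x = (float(array[y-(numberOfEntries-i)])-array[y-numberOfEntries]) * z
        pattern.append(x)
except ZeroDivisionError:
    # base value is zero: every entry of this pattern gets the placeholder
    pattern = [0.00000001] * numberOfEntries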
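Putting the suggestions together, the whole computation might be sketched as a function like the one below. This is not the asker's exact code: the function name, parameter names, and placeholder constants are assumptions, and the outer loop starts at number_of_entries so the base index never goes negative (in the question's loop, small values of y wrap around to the end of the array).

```python
ZERO_DIVISION_PLACEHOLDER = 0.00000001  # used when the base value is zero
ZERO_CHANGE_PLACEHOLDER = 0.000000001   # used when the change is exactly 0.0


def build_patterns(array, number_of_entries):
    """Return (patterns, outcomes) for every full window in array."""
    patterns = []
    outcomes = []
    for y in range(number_of_entries, len(array)):
        # float() also turns NumPy scalars into plain Python floats.
        base = float(array[y - number_of_entries])
        window = array[y - number_of_entries + 1 : y + 1]
        try:
            z = 100.0 / abs(base)  # computed once per window
        except ZeroDivisionError:
            pattern = [ZERO_DIVISION_PLACEHOLDER] * number_of_entries
        else:
            pattern = []
            for value in window:
                x = (float(value) - base) * z
                pattern.append(x if x != 0.0 else ZERO_CHANGE_PLACEHOLDER)
        patterns.append(pattern)
        # Same outcome slice as in the question.
        outcomes.append(array[y + 5 : y + 15])
    return patterns, outcomes


patterns, outcomes = build_patterns([1.0, 2.0, 4.0, 8.0], 2)
print(patterns)
```

Whether this is fast enough for much larger arrays would need to be measured; the sketch only removes the repeated work discussed above.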

Upvotes: 2
