Leo
Leo

Reputation: 1168

Use apply function to simpify nested for loop?

I have a dataframe with a column of values (column 'One'). For each pair of values given by (product(sma1, sma2)), I want to:

The code below shows my approach. Is it possible to do the same thing removing the nested for loop (with apply maybe)? would that be faster / better?

from itertools import product 
import pandas as pd 


data = pd.DataFrame( [3,2,5,8,5,12,7,8,9,10,11,12,13,14,17,19,15,18] ,columns=['One'])
sma1 = range(2, 5)  
sma2 = range(5, 8)
results=pd.DataFrame()

for SMA1, SMA2 in product(sma1, sma2): #for each pair SMA1, SMA2, perfomr the following
    data['SMA1'] = data['One'].rolling(SMA1).mean()
    data['SMA2'] = data['One'].rolling(SMA2).mean()
    
    data.dropna(inplace=True)
    
    for i in range( len(data)):
        #Condition A:
        if data.iloc[i, 1] > data.iloc[i, 2] and data.iloc[i-1, 1] < data.iloc[i-1,2]: 
          #Buy
          price = data.iloc[i, 0]
        #Condition B
        elif data.iloc[i, 1] < data.iloc[i, 2] and data.iloc[i-1, 1] > data.iloc[i-1,2]:
          #Sell
          price = data.iloc[i, 0] 
        #if neither condition is true, dont append anything 
        else: 
          continue
          
        results = results.append(pd.DataFrame(
              {'SMA1': SMA1,
               'SMA2': SMA2,
               'price': price,
               
              },
               index=[0]), ignore_index = True)

    
results

Out:

    SMA1 SMA2   price
0   2    5      5
1   2    5      8
2   2    5      9
3   2    5      18

EIDT from Hugolmn's answer I see the issue with using [i-1]. Howver with your approach I am not getting the right answer, I have simpliefied the probelm to just finding if the conditions A, B are satisfied for 2 columns.

data = pd.DataFrame({'col1': [3,2,6,8,6,11], 'col2': [3,3,5,8,5,12]})

#columns with shifted data
data['col1_shift']=data['col1'].shift(1)
data['col2_shift']=data['col2'].shift(1)

#condition A  
data['Con1_col1>col2'] = data['col1']>data['col2']
data['Con1_col1<col2_shift'] = data['col1_shift'] < data['col2_shift']
data['ConA'] = data['Con1_col1>col2'] & data['Con1_col1<col2_shift']

#condition B
data['Con2_col1<col2'] = data['col1']<data['col2']
data['Con2_col1>col2_shift'] = data['col1_shift'] > data['col2_shift']
data['ConB'] = data['Con2_col1<col2'] & data['Con2_col1>col2_shift']


# data['part1']= ((data.col1 - data.col2) > 0)
data['Hugolmn_method']= ((data.col1 - data.col2) >= 0).diff() > 0
data['expected']= data['ConA'] | data['ConB']

data

Out:

   col1  col2  col1_shift  col2_shift  Con1_col1>col2  Con1_col1<col2_shift  \
0     3     3         NaN         NaN           False                 False   
1     2     3         3.0         3.0           False                 False   
2     6     5         2.0         3.0            True                  True   
3     8     8         6.0         5.0           False                 False   
4     6     5         8.0         8.0            True                 False   
5    11    12         6.0         5.0           False                 False   

    ConA  Con2_col1<col2  Con2_col1>col2_shift   ConB  Hugolmn_method  \
0  False           False                 False  False           False   
1  False            True                 False  False            True   
2   True           False                 False  False            True   
3  False           False                  True  False           False
4  False           False                 False  False           False
5  False            True                  True   True            True   

   expected  
0     False  
1     False  
2      True  
3     False  
4     False  
5      True  

Upvotes: 0

Views: 98

Answers (1)

Hugolmn
Hugolmn

Reputation: 1560

I inspected your code and come up with these comments:

  • You don't need a loop to iterate over data
  • You can append a dict to a dataframe directly
  • You're result is wrong for price=5, it was comparing the index 0 with the index -1, so between the first and last row of data. This is probably not an expected behavior

Instead, I would suggest you to use .shift() to do tests with previous rows.

data.iloc[i, 1] > data.iloc[i, 2] and data.iloc[i-1, 1] < data.iloc[i-1,2]

# would become
(data.SMA1 > data.SMA2) & (data.SMA1.shift() < data.SMA2.shift())

Upvotes: 0

Related Questions