How to iterate over previous rows to compare values in a Pandas DataFrame

Question

I have a pandas DataFrame like this:

import pandas as pd
raw_data = [{'Date': '1-10-19', 'Price':7, 'Check': 0}, 
            {'Date': '2-10-19','Price':8.5, 'Check': 0}, 
            {'Date': '3-10-19','Price':9, 'Check': 1}, 
            {'Date': '4-10-19','Price':50, 'Check': 1}, 
            {'Date': '5-10-19','Price':80, 'Check': 1}, 
            {'Date': '6-10-19','Price':100, 'Check': 1}]
df = pd.DataFrame(raw_data)
df = df.set_index('Date')  # set_index returns a new frame, so assign it back

This is what it looks like:

         Price  Check
Date
1-10-19    7.0      0
2-10-19    8.5      0
3-10-19    9.0      1
4-10-19   50.0      1
5-10-19   80.0      1
6-10-19  100.0      1

For each row where 'Check' is 1, I want to count how many rows back I have to go to reach a row whose price is less than 10% of that row's price. For example, for the 6th row, where the price is 100, I want to iterate over the previous rows until the price is less than 10 (10% of 100). In this case that is 3 rows prior, where the price is 9. I then want to save the results in a new column.

The final result would look like this:

         Price  Check  Rows_till_small
Date
1-10-19    7.0      0              NaN
2-10-19    8.5      0              NaN
3-10-19    9.0      1              NaN
4-10-19   50.0      1              NaN
5-10-19   80.0      1                4
6-10-19  100.0      1                3

I've thought a lot about how I could do this with some kind of rolling function, but I don't think it's possible. I've also thought about iterating through the entire DataFrame using iterrows or itertuples, but I can't think of a way to do it that isn't extremely inefficient.

Artiom Kozyrev · Accepted Answer

You can solve the issue the following way:

import pandas as pd
raw_data = [{'Date': '1-10-19', 'Price': 7, 'Check': 0},
            {'Date': '2-10-19', 'Price': 8.5, 'Check': 0},
            {'Date': '3-10-19', 'Price': 9, 'Check': 1},
            {'Date': '4-10-19', 'Price': 50, 'Check': 1},
            {'Date': '5-10-19', 'Price': 80, 'Check': 1},
            {'Date': '6-10-19', 'Price': 100, 'Check': 1}]
df = pd.DataFrame(raw_data)

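prices = df['Price'].to_numpy()  # plain arrays make positional lookups cheap
checks = df['Check'].to_numpy()
new_column = [None] * len(df)  # one slot per row; rows without a match stay None

for i in range(len(df)):
    if checks[i] == 1:
        percent_10 = prices[i] * 0.1  # threshold: 10% of the current row's price
        # walk backwards over the earlier rows, nearest first
        for j in range(i - 1, -1, -1):
            if prices[j] < percent_10:
                new_column[i] = i - j  # rows back to the first smaller price
                break

df['Rows_till_small'] = new_column  # add new column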

print(df)
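
If the nested loop turns out to be too slow, the same comparison can be done with NumPy broadcasting. This is only a sketch and not something from the question: the helper name rows_till_small and its parameters are made up for illustration, and it builds an n x n boolean matrix, so it trades memory for speed and is probably only reasonable for frames of up to a few thousand rows.

```python
import numpy as np
import pandas as pd


def rows_till_small(df, price_col="Price", check_col="Check", frac=0.1):
    """Hypothetical helper: for each checked row, distance back to the nearest
    earlier row whose price is below frac * the current row's price."""
    prices = df[price_col].to_numpy(dtype=float)
    checks = df[check_col].to_numpy() == 1
    n = len(prices)
    if n == 0:  # nothing to compare in an empty frame
        return pd.Series(dtype=float, index=df.index, name="Rows_till_small")
    idx = np.arange(n)
    # below[i, j] is True when row j is earlier than row i
    # and prices[j] < frac * prices[i]
    below = (prices[None, :] < frac * prices[:, None]) & (idx[None, :] < idx[:, None])
    # position of the latest qualifying earlier row, or -1 if there is none
    last = np.where(below, idx[None, :], -1).max(axis=1)
    # keep the distance only for checked rows that have a match
    out = np.where(checks & (last >= 0), idx - last, np.nan)
    return pd.Series(out, index=df.index, name="Rows_till_small")


raw_data = [{'Date': '1-10-19', 'Price': 7, 'Check': 0},
            {'Date': '2-10-19', 'Price': 8.5, 'Check': 0},
            {'Date': '3-10-19', 'Price': 9, 'Check': 1},
            {'Date': '4-10-19', 'Price': 50, 'Check': 1},
            {'Date': '5-10-19', 'Price': 80, 'Check': 1},
            {'Date': '6-10-19', 'Price': 100, 'Check': 1}]
df = pd.DataFrame(raw_data).set_index('Date')
result = rows_till_small(df)
df['Rows_till_small'] = result
print(df)
```

On the sample data this should leave the first four rows as NaN and give 4 and 3 for the last two rows, the same as the loop above. Because it works on positions rather than labels, it does not matter whether 'Date' is the index or a regular column.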

Hope the answer is useful for you, feel free to ask questions.
