TeoK

Reputation: 501

Apply my function with an if...else condition to a DataFrame column

My input:

   frame  user1  user2  sum_result
0      0      0      0           0
1      1      0      0           0
2      2      0      1           1
3      3      1      1           2
4      4      1      0           1
5      5      0      0           0

I want to apply my function with a condition to `sum_result`.

My conditions: if sum_result is 2, return 'ICV'; if sum_result is 1 and the frame is within 10 frames of a frame where sum_result is 2 (that is, the frame number of the row with sum_result=2 minus the frame number of the row with sum_result=1 is less than or equal to 10), return 'ReviewNG'; if sum_result is 0, return 'Other'.

For example, this is the output I expect:

   frame  user1  user2  sum_result    Result
0      0      0      0           0     Other
1      1      0      0           0     Other
2      2      0      1           1  ReviewNG
3      3      1      1           2       ICV
4      4      1      0           1  ReviewNG
5      5      0      0           0     Other

This is my code:

def result_func(row):
    for i in range(0,len(df)):
        if row==2:
            return('|ICV')
        elif row==1 & (df['frame'][i]-df.loc[df['sum_result']==2,'frame'].iloc[0]<=10 | df['frame'][i]-df.loc[df['sum_result']==2,'frame'].iloc[-1]<=10):
            return('ReviewNG-ICV')
        elif row==0:
            return('Other')
        else:
            return ""

and this is how I apply it to df:

df['result']=df['sum_result'].apply(lambda row: result_func(row))  

But I get an error:

IndexError: single positional indexer is out-of-bounds

I understand that the error occurs when no row in my df has sum_result=2. How can I fix my function?

Upvotes: 0

Views: 475

Answers (2)

Rinshan Kolayil

Reputation: 1139

def result_func(row):
    if row['sum_result'] == 2:
        return "ICV"
    elif row['sum_result'] == 1:
        new_frame = df.loc[df['sum_result']==2,'frame']
        if not new_frame.empty and (row['frame']-new_frame.iloc[0] <=10 or row['frame']-new_frame.iloc[-1] <=10):
            return('ReviewNG-ICV')
    elif row['sum_result'] == 0:
        return "Other"
    return "OTHER UNDEFINED VALUES"
df['result']=df[['frame','sum_result']].apply(result_func,axis=1)

If you do not want to recompute new_frame for every row, you can compute it once and pass it to the function through the args parameter of apply:

def result_func(row,new_frame):
    if row['sum_result'] == 2:
        return "ICV"
    elif row['sum_result'] == 1:
        if not new_frame.empty and (row['frame']-new_frame.iloc[0] <=10 or row['frame']-new_frame.iloc[-1] <=10):
            return('ReviewNG-ICV')
    elif row['sum_result'] == 0:
        return "Other"
    return "OTHER UNDEFINED VALUES"
new_frame = df.loc[df['sum_result']==2,'frame']
df['result']=df[['frame','sum_result']].apply(result_func,args=(new_frame,),axis=1)

Output

from tabulate import tabulate
print(tabulate(df, headers='keys', tablefmt='psql'))

+----+---------+---------+---------+--------------+--------------+
|    |   frame |   user1 |   user2 |   sum_result | result       |
|----+---------+---------+---------+--------------+--------------|
|  0 |       0 |       0 |       0 |            0 | Other        |
|  1 |       1 |       0 |       0 |            0 | Other        |
|  2 |     100 |       0 |       1 |            1 | test         |
|  3 |      88 |       1 |       1 |            2 | ICV          |
|  4 |       4 |       1 |       0 |            1 | ReviewNG-ICV |
|  5 |       5 |       0 |       0 |            0 | Other        |
|  6 |      18 |       1 |       1 |            2 | ICV          |
+----+---------+---------+---------+--------------+--------------+
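The same rules can probably also be written without a row-wise apply. The following is only a sketch, not part of the code above: it assumes NumPy is available, rebuilds the sample frame from the table by hand, and uses the hypothetical names icv_frames and near_icv. It keeps the same comparisons against the first and last frame with sum_result == 2.

```python
import numpy as np
import pandas as pd

# Sample data copied from the output table above
df = pd.DataFrame({
    "frame": [0, 1, 100, 88, 4, 5, 18],
    "user1": [0, 0, 0, 1, 1, 0, 1],
    "user2": [0, 0, 1, 1, 0, 0, 1],
})
df["sum_result"] = df["user1"] + df["user2"]

# Frames of the rows where sum_result == 2 (may be empty)
icv_frames = df.loc[df["sum_result"] == 2, "frame"]

if icv_frames.empty:
    # No ICV row at all: no row can be close to one
    near_icv = pd.Series(False, index=df.index)
else:
    # Same test as in result_func, evaluated for the whole column at once
    near_icv = ((df["frame"] - icv_frames.iloc[0] <= 10)
                | (df["frame"] - icv_frames.iloc[-1] <= 10))

conditions = [
    df["sum_result"] == 2,
    (df["sum_result"] == 1) & near_icv,
    df["sum_result"] == 0,
]
choices = ["ICV", "ReviewNG-ICV", "Other"]

# The first matching condition wins; everything else gets the default label
df["result"] = np.select(conditions, choices, default="OTHER UNDEFINED VALUES")
print(df)
```

With the element-wise operators & and | each comparison needs its own parentheses, which is why near_icv is built from two bracketed parts.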

Hope it helps

Upvotes: 1

Zephyr

Reputation: 12524

If no row in the dataframe meets the condition sum_result=2, the series df.loc[df['sum_result']==2,'frame'] is empty. In that case you cannot access its first or last element with df.loc[df['sum_result']==2,'frame'].iloc[0] or df.loc[df['sum_result']==2,'frame'].iloc[-1]. This is what triggers your IndexError.
So, first of all, check whether df.loc[df['sum_result']==2,'frame'] is empty with:

if df.loc[df['sum_result']==2,'frame'].empty:
    ...
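To see the failure in isolation, here is a minimal sketch. It assumes a small made-up frame in which no row has sum_result equal to 2; the names icv_frames and error_message are only illustrative.

```python
import pandas as pd

# Made-up data: sum_result never reaches 2
df = pd.DataFrame({"frame": [0, 1, 2], "sum_result": [0, 1, 0]})

# The selection is valid, but it contains no rows
icv_frames = df.loc[df["sum_result"] == 2, "frame"]

try:
    icv_frames.iloc[0]  # positional access on an empty Series
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(icv_frames.empty)  # whether the selection is empty
print(error_message)     # the text carried by the IndexError
```

Checking .empty before calling .iloc[0] or .iloc[-1] avoids the exception.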

An example based on your code could be:

import pandas as pd

df = pd.read_csv('data.csv')

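def result_func(row):
    # Frames of the rows where sum_result == 2 (may be empty)
    icv_frames = df.loc[df['sum_result']==2,'frame']
    if row['sum_result']==2:
        return 'ICV'
    elif row['sum_result']==1:
        if icv_frames.empty:
            return 'No sum_result==2'
        # Use `or` with one pair of parentheses per comparison:
        # `|` binds tighter than `<=` and would change the meaning
        elif (row['frame']-icv_frames.iloc[0]<=10) or (row['frame']-icv_frames.iloc[-1]<=10):
            return 'ReviewNG-ICV'
        else:
            return 'To be defined'
    elif row['sum_result']==0:
        return 'Other'
    else:
        return ""

# axis=1 passes each row to the function, so it sees both frame and sum_result
df['result']=df.apply(result_func, axis=1)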
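A side note on the condition in the question: `&` and `|` bind more tightly than `==` and `<=`, so an expression such as `a<=10 | b<=10` is not read as two comparisons joined by "or". The sketch below uses the made-up scalars a and b to show the difference.

```python
a, b = 1, 5

# Parsed as the chained comparison a <= (10 | b) <= 10.
# 10 | 5 is 15, so this is 1 <= 15 <= 10.
bitwise_version = a <= 10 | b <= 10

# What was presumably intended: two separate comparisons joined by `or`
logical_version = (a <= 10) or (b <= 10)

print(bitwise_version)  # False
print(logical_version)  # True
```

For scalar values inside a function, `and` / `or` are the safer choice; `&` / `|` are meant for element-wise work on whole Series, and there each comparison needs its own parentheses.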

Upvotes: 1
