ArieAI

Reputation: 494

function does not work correctly with pandas.apply(lambda)

I have a function that takes two strings and returns an output.

I would like to apply it to my pandas DataFrame using pandas' apply function (with a lambda).

The function runs correctly for some inputs, but then fails on one of my checks. I double-checked that both inputs in the failing example are still strings, and when I call the function with those same strings outside pandas (manually), it produces the expected output.

To be clear, apply with a lambda runs fine for several rows until it fails on that particular one, which works when I test it outside pandas.

Here is a simplified example (the values in the DataFrame do not matter here).

import pandas as pd

list1 = ['a', 'b', 'c']
list2 = ['d', 'e', 'f']

def calculate_test(b, e):
    # Both values must be present in their respective lists
    if (b not in list1) or (e not in list2):
        raise ValueError("this should not happen!")
    else:
        return True

data = [['a', 'd'], ['b', 'e'], ['c', 'f']]
df = pd.DataFrame(data, columns=['first', 'second'])
# calculate_test('b','e')  # True

df['should_all_be_true'] = df.apply(lambda row: calculate_test(row['first'], row['second']), axis=1)  # ValueError raised!

I suspect the error is in my "if" statement, but I can't spot it.

Upvotes: 0

Views: 931

Answers (1)

Muhammad Ali

Reputation: 489

Your function raises a ValueError whenever a value from the first or second column is not found in list1 or list2. I changed the if branch to return False instead, which I hope works for you.

import pandas as pd

list1 = ['a', 'b', 'c']
list2 = ['d', 'e', 'f']

def calculate_test(b, e):
    if (b not in list1) or (e not in list2):
        return False
#         raise ValueError("this should not happen!")
    else:
        return True

# The extra row ['g','h'] is not in the lists, so it yields False
data = [['a', 'd'], ['b', 'e'], ['c', 'f'], ['g', 'h']]
df = pd.DataFrame(data, columns=['first', 'second'])
# calculate_test('b','e')  # True
df
df['should_all_be_true'] = df.apply(lambda row: calculate_test(row['first'], row['second']), axis=1)  # no ValueError; unknown values give False
df
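
If all you need is the True/False column, the same membership check can probably be written without apply. This is only a sketch under that assumption: Series.isin tests each column against its list and & combines the two boolean masks. The column name both_known is made up for illustration.

```python
import pandas as pd

list1 = ['a', 'b', 'c']
list2 = ['d', 'e', 'f']

data = [['a', 'd'], ['b', 'e'], ['c', 'f'], ['g', 'h']]
df = pd.DataFrame(data, columns=['first', 'second'])

# True only where both values appear in their respective lists
# ('both_known' is a hypothetical column name)
df['both_known'] = df['first'].isin(list1) & df['second'].isin(list2)
```

Unlike the apply version, this never raises, so it only fits if a False flag is an acceptable result for unknown values.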

Alternatively, if the function must keep raising ValueError, wrap the apply call in a try/except block:

try:
    df['should_all_be_true'] = df.apply(lambda row: calculate_test(row['first'], row['second']),axis=1)  # ValueError raised!
except Exception as e:
    print(e)
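
The try/except above stops at the first failure and does not say which row caused it. As a rough sketch, you could loop over the rows yourself and record the repr() of the values that fail, since repr() makes invisible differences such as stray whitespace visible. The helper find_failing_rows and the sample value 'b ' (with a trailing space) are made up for illustration; I am only guessing that something like this is what happens in your real data.

```python
import pandas as pd

list1 = ['a', 'b', 'c']
list2 = ['d', 'e', 'f']

def calculate_test(b, e):
    if (b not in list1) or (e not in list2):
        raise ValueError("this should not happen!")
    return True

def find_failing_rows(frame):
    """Hypothetical helper: return (index, repr(first), repr(second)) for each row that raises."""
    failures = []
    for idx, row in frame.iterrows():
        try:
            calculate_test(row['first'], row['second'])
        except ValueError:
            # repr() shows quotes and whitespace exactly as stored
            failures.append((idx, repr(row['first']), repr(row['second'])))
    return failures

# Made-up sample data: 'b ' has a trailing space and is therefore not in list1
df = pd.DataFrame([['a', 'd'], ['b ', 'e'], ['c', 'f']], columns=['first', 'second'])
failures = find_failing_rows(df)
print(failures)
```

Comparing the printed repr() of a failing value with the string you typed by hand should show whether the two are really identical.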

Upvotes: 1
