Reputation: 5395
Say I have the following DataFrame (it is only for illustration, not the actual problem I need to solve):
import pandas as pd
df = pd.DataFrame({"id": [1, 1, 1, 2, 2, 2],
                   "purchase": [True, False, False, False, True, True],
                   "prod": ["Apple", "Pear", "Banana"] * 2})
id purchase prod
----+-----+--------+
1 True Apple
1 False Pear
1 False Banana
2 False Apple
2 True Pear
2 True Banana
and a function that returns only the purchased products:
def get_prod_purch(df):
    """
    Get products
    """
    x = df["purchase"]
    return df.loc[x]
If I run this with a groupby, it works perfectly:
df.groupby("id").apply(get_prod_purch)
#Result
      id  purchase    prod
id
1  0   1      True   Apple
2  4   2      True    Pear
   5   2      True  Banana
But if I try to run it directly on the DataFrame, it fails:
df.apply(get_prod_purch)
#KeyError: 'purchase'
df.apply(get_prod_purch,axis=1)
#KeyError: True
Is there a way to run such a function on the whole DataFrame rather than on a groupby, i.e.:
df.apply(some_function)
#Result
id purchase prod
----+-----+--------+
1 True Apple
2 True Pear
2 True Banana
Upvotes: 1
Views: 37
Reputation: 863791
Use DataFrame.pipe, because the function needs to be applied to the whole DataFrame:
print (df.pipe(get_prod_purch))
id purchase prod
0 1 True Apple
4 2 True Pear
5 2 True Banana
Or pass the DataFrame to the function directly:
print (get_prod_purch(df))
id purchase prod
0 1 True Apple
4 2 True Pear
5 2 True Banana
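If the function needs extra parameters, pipe forwards any additional positional and keyword arguments to it. A minimal sketch, assuming a hypothetical helper filter_by_flag (not part of the original post) that takes the name of the boolean column:

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "purchase": [True, False, False, False, True, True],
    "prod": ["Apple", "Pear", "Banana"] * 2,
})


def filter_by_flag(frame, flag_col):
    """Hypothetical helper: keep the rows where the boolean column flag_col is True."""
    return frame.loc[frame[flag_col]]


# pipe calls filter_by_flag(df, "purchase"): the whole DataFrame is the first argument
purchased = df.pipe(filter_by_flag, "purchase")

# Keyword arguments are forwarded in the same way
same_result = df.pipe(filter_by_flag, flag_col="purchase")

print(purchased)
```

Both calls should be equivalent to calling filter_by_flag(df, "purchase") directly; pipe is mainly convenient when chaining several such steps.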
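If you use DataFrame.apply, the function runs once per column (the default, axis=0) or once per row (axis=1), so it receives a Series each time, not the whole DataFrame.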
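To see the difference, the sketch below records the type of object each method hands to the function. The helper describe_arg is hypothetical and exists only for this illustration. With apply the function gets a Series, which is why df["purchase"] fails in the question: a column Series has no "purchase" label, and a row Series returns the scalar True, which .loc then cannot look up.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "purchase": [True, False, False, False, True, True],
    "prod": ["Apple", "Pear", "Banana"] * 2,
})


def describe_arg(obj):
    """Hypothetical helper: return the class name of whatever was passed in."""
    return type(obj).__name__


# apply with the default axis=0 calls the function once per column
col_types = df.apply(describe_arg)

# apply with axis=1 calls the function once per row
row_types = df.apply(describe_arg, axis=1)

# pipe calls the function exactly once, with the whole DataFrame
whole = df.pipe(describe_arg)

print(col_types)
print(row_types)
print(whole)
```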
Upvotes: 1