Reputation: 5367
I am finding the indexes of some values above certain cutoffs in a pandas DataFrame. So far I have achieved that using a series of lambda functions:
data.apply([lambda v:v[v>=0.25].idxmin(),
lambda v:v[v>=0.25].idxmin(),
lambda v:v[v>=0.50].idxmin(),
lambda v:v[v>=0.75].idxmin(),
lambda v:v[v>=0.90].idxmin()])
I have tried to parametrize the lambda function over an arbitrary list of cutoff values. However, if I use the following, the results are not correct: all the lambda functions have the same name, and essentially only the last one is present in the DataFrame returned by apply. How can I parametrize these lambdas correctly?
cutoff_values=[25,50,100]
agg_list=[lambda v,c:v[v>=(float(c)/100.0)].idxmin() for c in cutoff_values]
data.apply(agg_list)
What would be a better, more Pythonic and pandas-idiomatic approach?
Upvotes: 0
Views: 692
Reputation: 4239
You can loop over the cutoffs inside a single lambda:
df = pd.DataFrame(data={'col':[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]})
df = df[['col']].apply(lambda x: [x[x >= (float(c) / 100.0)].idxmin() for c in cutoff_values])
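For reference, here is a self-contained sketch of the same idea that returns a labelled result instead of a bare list. The helper name `cutoff_indexes` and the cutoff list `[25, 50, 90]` are my own assumptions, not from the question. I picked cutoffs that the sample data actually reaches, since a cutoff above every value leaves an empty selection and `idxmin` then raises an error.

```python
import pandas as pd

df = pd.DataFrame({"col": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]})

# Assumed cutoffs (percent); each one must be reached by some value in the data.
cutoff_values = [25, 50, 90]


def cutoff_indexes(col):
    """Hypothetical helper: for each cutoff, the index label of the
    smallest value in `col` that is at or above that cutoff."""
    return pd.Series({c: col[col >= c / 100.0].idxmin() for c in cutoff_values})


# One column per input column, one row per cutoff.
result = df.apply(cutoff_indexes)
print(result)
```

Returning a `pd.Series` keyed by cutoff means the rows of the result carry the cutoff values as labels, so there is no dependence on function names at all.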
Upvotes: 1
Reputation: 863801
For me, nested lambda functions work, like this:
q = lambda c: lambda x: x[x>=c].idxmin()
cutoff_values=[25,50,90]
print (data.apply([q((float(c)/100.0)) for c in cutoff_values]))
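If you also want readable row labels, a variant of the same factory idea is to use a named inner function and set its `__name__`, which pandas uses to label the rows when `apply` gets a list of functions. This is only a sketch: the names `first_at_or_above` and `finder`, the label format, and the sample `data` are assumptions of mine, not from the question.

```python
import pandas as pd


def first_at_or_above(cutoff):
    """Hypothetical factory: build a function bound to one cutoff value."""
    def finder(col):
        # Index label of the smallest value in `col` that is >= cutoff.
        return col[col >= cutoff].idxmin()
    # Give each function a distinct name so the result rows are distinguishable.
    finder.__name__ = f"cutoff_{cutoff:.2f}"
    return finder


# Made-up sample data for illustration only.
data = pd.DataFrame({"a": [0.1, 0.3, 0.6, 0.95], "b": [0.2, 0.55, 0.8, 0.92]})

cutoff_values = [25, 50, 90]
result = data.apply([first_at_or_above(c / 100.0) for c in cutoff_values])
print(result)
```

Because `cutoff` is an argument of the outer function, each inner function keeps its own value, which is the same mechanism the nested lambda relies on.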
Upvotes: 3