Reputation: 5361
Is there a clean way to filter a pandas Series using a custom function that takes both the index and the value as input?
Here is a piece of code that achieves what I want to do:
import pandas as pd

series = pd.Series({"id5": 88, "id3": 40})

def custom(k, v):
    if k == "id5":
        return v > 20
    else:
        return v > 50

filtered_indexes = []
filtered_values = []
# Series.iteritems() was removed in pandas 2.0; items() is the equivalent.
for k, v in series.items():
    if custom(k, v):
        filtered_indexes.append(k)
        filtered_values.append(v)
filtered_series = pd.Series(data=filtered_values, index=filtered_indexes)
My question is: can the same be achieved more cleanly and/or more efficiently, with syntax like
series.filter(lambda x: custom(x.index, x.value))
Upvotes: 2
Views: 4548
Reputation: 164843
You can vectorise your logic as below. This avoids slow Python-level
loops (such as a lambda called once per element) and may also make your code cleaner.
res = series[((series.index == 'id5') & (series > 20)) |
             ((series.index != 'id5') & (series > 50))]
Result:
id5 88
dtype: int64
For readability, you may wish to separate the Boolean criteria:
c1 = ((series.index == 'id5') & (series > 20))
c2 = ((series.index != 'id5') & (series > 50))
res = series[c1 | c2]
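If more labels need their own threshold, the same vectorised idea might be sketched with a per-label threshold Series instead of one Boolean criterion per label. The names thresholds, default and limits are hypothetical and only serve this illustration:

```python
import pandas as pd

series = pd.Series({"id5": 88, "id3": 40})

# Hypothetical per-label thresholds; labels not listed fall back to `default`.
thresholds = {"id5": 20}
default = 50

# One threshold per index label, aligned with the series' own index.
limits = pd.Series(
    [thresholds.get(k, default) for k in series.index],
    index=series.index,
)

# Single element-wise comparison; keeps rows whose value exceeds their limit.
res = series[series > limits]
print(res)
```

Building limits still loops over the labels once, but the comparison itself stays vectorised.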
Upvotes: 1
Reputation: 863751
The problem is that Series.apply has no access to the index, and filter (see DataFrame.filter) only selects by index labels, so it cannot take a function of both index and value.
It is possible, but you need to create a DataFrame first:
s = series[series.to_frame().apply(lambda x: custom(x.name, x), axis=1).squeeze()]
print (s)
id5 88
dtype: int64
Or use groupby with filter:
s = series.groupby(level=0).filter(lambda x: custom(x.name, x).iloc[0])
print (s)
id5 88
dtype: int64
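Another option, if the custom function should stay exactly as written in the question, is to build a Boolean mask from the (index, value) pairs and index the Series with it. This is only a sketch; it still runs a Python-level loop, so it is shorter than the original code rather than faster:

```python
import pandas as pd

series = pd.Series({"id5": 88, "id3": 40})

def custom(k, v):
    # Same rule as in the question: "id5" uses the lower threshold.
    if k == "id5":
        return v > 20
    return v > 50

# One boolean per (label, value) pair, in the same order as the series.
mask = [custom(k, v) for k, v in series.items()]

# A list of booleans of matching length works as a row mask.
filtered = series[mask]
print(filtered)
```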
Upvotes: 1