Reputation: 2183
I have the following example dataframe:
import pandas as pd

d = {'target': [1, 2, 4, 3, 6, 5]}
df = pd.DataFrame(data=d)
df
Output:
   target
0       1
1       2
2       4
3       3
4       6
5       5
I need a function, find_index_of_first_hit(value), that does the following:
1. Compares value with the elements of the column target.
2. Finds the first element that is >= value.
3. Returns the index of the dataframe row for that very first match.
Example:
find_index_of_first_hit(3)
This should return 2, the index of the row whose target value is 4, because 4 is the first value in the column that is >= the input value 3.
The original dataframe is fairly large, so performance is important and I would like to avoid a for loop.
How can I write such a function in Python?
Upvotes: 1
Views: 555
Reputation: 23099
Use a greater-than-or-equal comparison, .ge, with idxmax.
You'll find you rarely need to write any functions for Pandas (unless you need to package up reusable code snippets) as most things are available in the API.
index = df.ge(3).idxmax()
target 2
dtype: int64
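As a minimal sketch, assuming the df from the question, the difference between calling this on the whole DataFrame and on the single column looks roughly like this. The variable names per_column, first_hit and no_match are only illustrative. Note the caveat in the last step: when nothing matches, the mask is all False and idxmax still returns the first index label.

```python
import pandas as pd

df = pd.DataFrame({'target': [1, 2, 4, 3, 6, 5]})

# DataFrame form: one result per column, returned as a Series
per_column = df.ge(3).idxmax()

# Series form: a single scalar index label
first_hit = df['target'].ge(3).idxmax()
print(first_hit)  # 2

# Caveat: with no match the mask is all False, so idxmax returns
# the first index label instead of signalling a miss
no_match = df['target'].ge(30).idxmax()
print(no_match)  # 0
```

So if the input value can exceed every element in the column, a separate check for "no match" is needed.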
Upvotes: 1
Reputation: 862581
Use Series.idxmax, and check with Series.any in an if-else whether any matching value exists:
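def find_index_of_first_hit(val):
    # Boolean mask: True where target >= val
    a = df['target'].ge(val)
    # idxmax gives the label of the first True; return -1 if there is none
    return a.idxmax() if a.any() else -1

print(find_index_of_first_hit(3))
2
print(find_index_of_first_hit(30))
-1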
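A possible variant, sketched under some assumptions: it takes the Series as an argument instead of relying on a global df, and does the comparison on the underlying NumPy array. The name first_hit_label and the default parameter are hypothetical, not part of pandas. Whether this is faster than the Series version on a given dataset would need to be measured.

```python
import numpy as np
import pandas as pd

def first_hit_label(series, val, default=-1):
    """Return the index label of the first element >= val, or default."""
    mask = series.to_numpy() >= val   # boolean NumPy array
    if not mask.any():                # no element satisfies the condition
        return default
    pos = int(np.argmax(mask))        # position of the first True
    return series.index[pos]          # translate position to index label

df = pd.DataFrame({'target': [1, 2, 4, 3, 6, 5]})
print(first_hit_label(df['target'], 3))   # 2
print(first_hit_label(df['target'], 30))  # -1
```

Returning series.index[pos] rather than pos keeps the result correct when the index is not the default 0..n-1 range.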
Upvotes: 3