Ragnar

Reputation: 2690

Find the first index in a DataFrame matching one of two (or more) values

I have a pandas DataFrame and want the index of the first occurrence of "V1" or "V2", if possible without scanning the whole DataFrame. Can I make it stop at the first match?

I started with i = df[(df["track"] == "V1") | (df["track"] == "V2")].iloc[0], but that gives me the full row rather than its index, after first building the whole list of matches.

Upvotes: 0

Views: 81

Answers (5)

Andy L.

Reputation: 25259

You can use where and first_valid_index to handle the case where neither V1 nor V2 is found. In that case, first_valid_index returns None.

idx = df.where(df.track.isin(['V1', 'V2'])).first_valid_index()
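A minimal sketch of how this behaves on a small made-up frame. The data and the helper name first_match_label are assumptions for illustration, and where is applied to the track column alone here rather than to the whole frame:

```python
import pandas as pd

def first_match_label(df, values):
    """Return the index label of the first row whose 'track' is in values, else None."""
    mask = df["track"].isin(values)
    # Non-matching entries become NaN, so first_valid_index skips over them.
    return df["track"].where(mask).first_valid_index()

# Hypothetical sample data with a non-default index.
df = pd.DataFrame({"track": ["A", "V2", "B", "V1"]}, index=[10, 20, 30, 40])

found = first_match_label(df, ["V1", "V2"])   # label of the first matching row
missing = first_match_label(df, ["V9"])       # no match at all
print(found, missing)
```

Note that the mask is still computed over the whole column, so this handles the no-match case but does not stop early.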

Upvotes: 1

ansev

Reputation: 30930

Use Series.isin with DataFrame.index or Series.index:

df[df.track.isin(['V1', 'V2'])].index[0]

or, using a callable:

df.track.loc[lambda x: x.isin(['V1', 'V2'])].index[0]
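One thing worth checking with .index[0] is what happens when nothing matches: indexing an empty Index raises IndexError. A small sketch on made-up data (the frame and variable names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"track": ["A", "V2", "B", "V1"]})

# Labels of all matching rows; the first one is the answer.
matches = df.index[df["track"].isin(["V1", "V2"])]
first = matches[0]

# With no matching rows the filtered index is empty and [0] fails.
no_matches = df.index[df["track"].isin(["V9"])]
try:
    no_matches[0]
    raised = False
except IndexError:
    raised = True

print(first, raised)
```

If a missing value is possible, guarding with a length check or a try/except around the lookup avoids the exception.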

Upvotes: 1

jezrael

Reputation: 863216

Use a slightly modified version of this solution: test membership with the in operator, so the loop runs only up to the first match, as you require:

from numba import njit

@njit
def get_first_index_nb(A, k):
    for i in range(len(A)):
        if A[i] in k:
            return i
    return None

# pandas 0.24+
idx = get_first_index_nb(df.track.to_numpy(), ['V1', 'V2'])
# older pandas versions
# idx = get_first_index_nb(df.track.values, ['V1', 'V2'])
print(idx)

A solution with Series.idxmax, using an if-else statement and Series.any to cover the case where no value matches. Note that it tests all values rather than stopping at the first match:

m = df.track.isin(['V1', 'V2'])
idx = m.idxmax() if m.any() else None

Or:

idx = next(iter(df.index[df.track.isin(['V1', 'V2'])]), None)
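To see why the Series.any guard matters, here is a short sketch on a made-up frame with no matching rows (the data and variable names are assumptions for illustration). On an all-False mask, idxmax still returns the first label, which would look like a match:

```python
import pandas as pd

# Hypothetical data in which neither V1 nor V2 occurs.
df = pd.DataFrame({"track": ["A", "B", "C"]}, index=["x", "y", "z"])
m = df["track"].isin(["V1", "V2"])  # every entry is False

# All values tie at False, so idxmax reports the first label.
unguarded = m.idxmax()

# The guard turns the no-match case into None.
guarded = m.idxmax() if m.any() else None

# The next/iter variant yields None because the filtered index is empty.
lazy = next(iter(df.index[m]), None)

print(unguarded, guarded, lazy)
```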

Upvotes: 2

High-Octane

Reputation: 1112

Use the same code with the change below.

From this

i = df[(df["track"] == "V1")  | (df["track"] == "V2")].iloc[0]

to

df[(df["track"] == "V1")  | (df["track"] == "V2")].head(1).iloc[0]

Upvotes: 0

yatu

Reputation: 88276

Looks like you want:

df.track.isin(['V1', 'V2']).idxmax()

If you want to stop at the first match, here's one way using a generator expression:

match = {'V1', 'V2'}
next((ix for ix, i in enumerate(df.track.values) if i in match), None)
# 1
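Because enumerate counts positions, the result is a position in the column rather than an index label. The two coincide for a default RangeIndex, but not otherwise. A small sketch, assuming a made-up frame with a custom index, that maps the position back to its label:

```python
import pandas as pd

# Hypothetical data with a non-default index.
df = pd.DataFrame({"track": ["A", "V2", "B", "V1"]}, index=[10, 20, 30, 40])
match = {"V1", "V2"}

# Stops at the first hit; pos is a 0-based position, or None if nothing matches.
pos = next((ix for ix, i in enumerate(df["track"].to_numpy()) if i in match), None)

# Translate the position into the corresponding index label.
label = df.index[pos] if pos is not None else None

print(pos, label)
```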

Upvotes: 2
