Reputation: 357
Say I have the following DataFrame:
df = pd.DataFrame({'a': [12, 34, -45], 'b':[-24, 36, 48], 'c':[28, -14, 68]})
df:
a b c
12 -24 28
34 36 -14
-45 48 68
I want to return the 1-based position (index + 1) of the first column that contains a negative number in each row. For the example, that would produce:
a b c first_neg_col
12 -24 28 2
34 36 -14 3
-45 48 68 1
I have a way of achieving this:
def first_negval(val_list):
    # Return the 1-based position of the first negative value (None if there is none)
    for idx, val in enumerate(val_list):
        if val < 0:
            return idx + 1

df['first_neg_col'] = df[:].values.tolist()
df.first_neg_col = df['first_neg_col'].apply(lambda x: first_negval(x))
But this seems cumbersome/inefficient. I was wondering if there was a more vectorized approach / some way of using list comprehension?
Upvotes: 2
Views: 680
Reputation: 862611
If every row always contains at least one negative value, use numpy.argmax on the boolean mask of values less than 0 to get the position of the first negative value:
df['first_neg_col'] = np.argmax(df.lt(0).to_numpy(), axis=1) + 1
print (df)
a b c first_neg_col
0 12 -24 28 2
1 34 36 -14 3
2 -45 48 68 1
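This works because np.argmax returns the index of the first occurrence of the maximum, and in a boolean array True is the maximum, so each row yields the position of its first True. A minimal sketch, using a made-up `mask` array for illustration, that also shows what happens to a row with no True values:

```python
import numpy as np

# Illustrative boolean mask (not from the question): True marks a negative value
mask = np.array([
    [False, True, False],   # first True at position 1
    [True, False, True],    # first True at position 0
    [False, False, False],  # no True at all
])

# argmax picks the first maximum in each row
first_true = np.argmax(mask, axis=1)
print(first_true)  # [1 0 0]
```

The last row returns 0 as well, which is indistinguishable from a row whose first column is negative, so rows without any negative value need separate handling.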
In general, it is necessary to test whether each row contains at least one negative value and to set the result to 0 when it does not, using numpy.where with DataFrame.any:
df = pd.DataFrame({'a': [12, 34, -45, 1], 'b':[-24, 36, 48, 8], 'c':[28, -14, 68, 8]})
m = df.lt(0)
df['first_neg_col'] = np.where(m.any(axis=1), np.argmax(m.to_numpy(), axis=1) + 1, 0)
print (df)
a b c first_neg_col
0 12 -24 28 2
1 34 36 -14 3
2 -45 48 68 1
3 1 8 8 0
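If this is needed in more than one place, the same logic could be wrapped in a small function. The sketch below is only one possible arrangement: the name `first_negative_position` and the `cols` list are hypothetical, and the list comprehension at the end is there purely as a plain-Python cross-check of the vectorized result.

```python
import numpy as np
import pandas as pd

def first_negative_position(frame: pd.DataFrame) -> np.ndarray:
    """Return the 1-based position of the first negative column per row, or 0 if none."""
    mask = frame.lt(0).to_numpy()
    # Rows without any negative value get 0 instead of argmax's misleading 0 + 1
    return np.where(mask.any(axis=1), mask.argmax(axis=1) + 1, 0)

df = pd.DataFrame({'a': [12, 34, -45, 1], 'b': [-24, 36, 48, 8], 'c': [28, -14, 68, 8]})
cols = ['a', 'b', 'c']  # value columns only, so the result column stays out of the mask
df['first_neg_col'] = first_negative_position(df[cols])

# Plain-Python cross-check using a list comprehension
expected = [next((i + 1 for i, v in enumerate(row) if v < 0), 0)
            for row in df[cols].to_numpy().tolist()]
print(df['first_neg_col'].tolist() == expected)  # True
```

The list comprehension gives the same answer and is fine for small frames; the NumPy version avoids a Python-level loop over rows.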
Upvotes: 2