Reputation: 3608
I have:
pd.DataFrame({'col1':['a','b','c','d'],'col2':["","",3,4],'col3':["",3,4,6],'col4':[3,4,5,6]})
I want: to find the column name/index location from each row of the leftmost column that has data
Upvotes: 3
Views: 614
Reputation: 863281
If missing values are emty strings compare not equal empty string for all columns without first by DataFrame.iloc
and use DataFrame.idxmax
:
df['new'] = df.iloc[:, 1:].ne('').idxmax(axis=1)
Or if missing valueas are empty strings or NaN
s:
df['new'] = df.iloc[:, 1:].replace('',np.nan).notna().idxmax(axis=1)
Or if NaNs:
df['new'] = df.iloc[:, 1:].notna().idxmax(axis=1)
print (df)
col1 col2 col3 col4 new
0 a 3 col4
1 b 3 4 col3
2 c 3 4 5 col2
3 d 4 6 6 col2
If possible all values are empty then is selected first value, you need extrac mask for catch it:
df = pd.DataFrame({'col1':['a','b','c','d'],'col2':["","",3,""],
'col3':["",3,4,""],'col4':[3,4,5,""]})
# df['new'] = df.iloc[:, 1:].ne('').idxmax(axis=1)
mask = df.iloc[:, 1:].ne('')
df['new'] = mask.idxmax(axis=1).where(mask.any(axis=1), 'no match')
print (df)
col1 col2 col3 col4 new
0 a 3 col4
1 b 3 4 col3
2 c 3 4 5 col2
3 d no match
Upvotes: 3