frank
frank

Reputation: 3608

how to find leftmost column with data pandas

I have:

pd.DataFrame({'col1':['a','b','c','d'],'col2':["","",3,4],'col3':["",3,4,6],'col4':[3,4,5,6]})

enter image description here

I want: to find the column name/index location from each row of the leftmost column that has data

enter image description here

Upvotes: 3

Views: 614

Answers (1)

jezrael
jezrael

Reputation: 863281

If missing values are emty strings compare not equal empty string for all columns without first by DataFrame.iloc and use DataFrame.idxmax:

df['new'] = df.iloc[:, 1:].ne('').idxmax(axis=1)

Or if missing valueas are empty strings or NaNs:

df['new'] = df.iloc[:, 1:].replace('',np.nan).notna().idxmax(axis=1)

Or if NaNs:

df['new'] = df.iloc[:, 1:].notna().idxmax(axis=1)

print (df)
  col1 col2 col3  col4   new
0    a               3  col4
1    b         3     4  col3
2    c    3    4     5  col2
3    d    4    6     6  col2

If possible all values are empty then is selected first value, you need extrac mask for catch it:

df = pd.DataFrame({'col1':['a','b','c','d'],'col2':["","",3,""],
                    'col3':["",3,4,""],'col4':[3,4,5,""]})

# df['new'] = df.iloc[:, 1:].ne('').idxmax(axis=1)

mask = df.iloc[:, 1:].ne('')
df['new'] = mask.idxmax(axis=1).where(mask.any(axis=1), 'no match')
print (df)
  col1 col2 col3 col4       new
0    a              3      col4
1    b         3    4      col3
2    c    3    4    5      col2
3    d                 no match
    

Upvotes: 3

Related Questions