Reputation: 941
Given this df:
Name i j k
A 1 0 3
B 0 5 4
C 0 0 4
D 0 5
My goal is to add in a column "Final" that takes value in an order of i j k:
Name i j k Final
A 1 0 3 1
B 0 5 4 5
C 0 0 4 4
D 0 5 <-- this one is tricky. We do count the null for j column here.
Here is my attempt: df['Final'] = df[['i', 'j', 'k'].bfill(axis=1).iloc[:, 0]. This doesn't work since it always takes the value of column 1. Any help would be appreciated. :)
Many thanks!
Upvotes: 0
Views: 58
Reputation: 28303
Using pandas.Series.nonzero the solution can be expressed succicntly.
df['Final'] = df.apply(lambda x: x.iloc[x.nonzero()[0][0]], axis=1)
How this works:
nonzero()
returns the indices of elements that are not zero (and will match np.nan as well).
We take the first index location and return the value at that location to construct the Final
Column.
We apply this on the dataframe using axis=1
to apply it row by row.
A benefit of this approach is that it does not depend on naming individual columns ['i', 'j', 'k']
Upvotes: 1
Reputation: 353359
If by "taking values in column order", you mean "taking the first non-zero value in each row, or zero if all values are zero", you could use DataFrame.lookup
after doing a boolean comparison:
In [113]: df["final"] = df.lookup(df.index,(df[["i","j","k"]] != 0).idxmax(axis=1))
In [114]: df
Out[114]:
Name i j k final
0 A 1 0.0 3 1.0
1 B 0 5.0 4 5.0
2 C 0 0.0 4 4.0
3 D 0 NaN 5 NaN
where first we compare everything with zero:
In [115]: df[["i","j","k"]] != 0
Out[115]:
i j k
0 True False True
1 False True True
2 False False True
3 False True True
and then we use idxmax
to find the first True (or the first False if you have a row of zeroes):
In [116]: (df[["i","j","k"]] != 0).idxmax(axis=1)
Out[116]:
0 i
1 j
2 k
3 j
dtype: object
Upvotes: 2
Reputation: 323326
Is this what you need ?
df['Final']=df[['i', 'j', 'k']].mask((df=='')|(df==0)).bfill(axis=1).iloc[:, 0][(df!='').all(1)]
df
Out[1290]:
Name i j k Final
0 A 1 0 3 1.0
1 B 0 5 4 5.0
2 C 0 0 4 4.0
3 D 0 5 NaN
Upvotes: 1