Reputation: 3598
Given an example dataframe df
:
df = (pd.DataFrame.from_dict({'a':[np.nan, 10, 20],
'b':[np.nan, np.nan, 20],
'c':[np.nan, 30, np.nan],
'c':[40, np.nan, np.nan],
'col':['b','c','a']})
)
a b c col
0 NaN NaN 40.0 b
1 10.0 NaN NaN c
2 20.0 20.0 NaN a
I want to fillna(0)
at each row of df
only selected column according to value stored in df.col
.
The output should be:
a b c col
0 NaN 0.0 40.0 b
1 10.0 NaN 0.0 c
2 20.0 20.0 NaN a
I've found an iterative solution:
for index, row in df.iterrows():
df.loc[index, row.col] = np.nanmax([df.loc[index, row.col],0])
but it is slow, and hard to use is chaining operations (assign
method).
Is there another, maybe vectorized way or apply
version of getting desired result?
Upvotes: 1
Views: 99
Reputation: 195553
You can use .transform()
with np.where
:
print(
df.transform(
lambda x: np.where(x.isna() & (x.index == x["col"]), 0, x),
axis=1,
)
)
Prints:
a b c col
0 NaN 0 40 b
1 10 NaN 0 c
2 20 20 NaN a
Upvotes: 2
Reputation: 42926
df = df.set_index('col').stack(dropna=False)
m1 = df.index.get_level_values(0) == df.index.get_level_values(1)
m2 = df.isna()
df.loc[m1 & m2] = 0
df = df.unstack(level=1).reset_index()
We can melt the dataframe, then find the rows which we need to fill the NaN. Finally pivot our dataframe back in the format we want:
m = df.melt(id_vars='col')
m['value'] = np.where(
m['col'] == m['variable'],
m['value'].fillna(0),
m['value']
)
df = m.pivot_table(
index='col',
columns='variable',
values='value'
).reindex(df['col'])
df = df.reset_index().rename_axis(columns=None)
Output
col a b c
0 b NaN 0.00 40.00
1 c 10.00 NaN 0.00
2 a 20.00 20.00 NaN
Upvotes: 2