Reputation: 673
I have the following DataFrame:
df:
A   B_x  B_y  C_x  C_y
R1  0    3    6    7
R2  NaN  4    8    9
R3  2    5    NaN  2
I want to replace the NaN values in the columns ending in _x with the values from the corresponding _y columns.
I cannot use absolute names like B_x and B_y, since the column names are derived dynamically by earlier code that I have no control over.
Right now I am using the following:
ens_prefix is a variable that holds the column name with the _x suffix
ens_prefix_1 is a variable that holds the column name with the _y suffix
df[ens_prefix].fillna(df[ens_prefix_1], inplace=True) # replace values of NAN from _x column with _y
df = df.filter(regex=r'.*(?<!_y)$') # remove columns with _y suffix
df.columns = df.columns.str.rstrip('_x') # strip suffix at the right end only.
Expected output:
   A    B    C
0  R1  0.0  6.0
1  R2  4.0  8.0
2  R3  2.0  2.0
Upvotes: 2
Views: 219
Reputation: 71689
We can iterate over all the columns in df, find each column that ends with _x, then fill the NaN values in that column from the corresponding column with the _y suffix:
for col in df.columns[:]:
    if col.endswith('_x'):
        base = col[:-2]  # drop only the trailing '_x'; rstrip('_x') would strip a character set
        df[base] = df.pop(col).fillna(df.pop(base + '_y'))
Result:
# print(df)
   A   B  C
0  R1  0  6
1  R2  4  8
2  R3  2  2
Upvotes: 0
Reputation: 863031
First, I prefer to use replace on the column names instead of strip, because strip also removes all trailing x and y characters, not only the ones after _ but also the ones before it.
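To illustrate the difference, here is a small sketch. The names max_x and x_x are made up for the demonstration and do not come from the question; slicing off the last two characters is just one way to remove an exact suffix.

```python
# Hypothetical column names chosen to expose the problem.
cols = ['B_x', 'max_x', 'x_x']

# str.rstrip treats its argument as a set of characters, not as a suffix,
# so it keeps removing trailing '_' and 'x' characters until another one appears.
stripped = [c.rstrip('_x') for c in cols]

# Slicing removes exactly the two trailing characters of the '_x' suffix.
sliced = [c[:-2] if c.endswith('_x') else c for c in cols]

print(stripped)
print(sliced)
```

Only B_x survives rstrip intact here; the other two names lose characters that belong to the base name.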
The solution uses DataFrame.fillna: rename the columns by replacing _x with an empty string, fill from the columns selected by a trailing _y (renamed the same way), and remove the _y columns in the last step:
df = (df.rename(columns = lambda x: x.replace('_x', ''))
        .fillna(df.filter(regex='_y$')
                  .rename(columns = lambda x: x.replace('_y', '')))
        .filter(regex=r'.*(?<!_y)$'))
print (df)
   A    B    C
0  R1  0.0  6.0
1  R2  4.0  8.0
2  R3  2.0  2.0
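For anyone who wants to try this end to end, here is a self-contained sketch. The sample frame is reconstructed from the table in the question, and the names left, right and out are only illustrative. It is a variation on the chain above that anchors the suffix removal with re.sub, so that a _x or _y in the middle of a name would be left alone.

```python
import re

import numpy as np
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    'A': ['R1', 'R2', 'R3'],
    'B_x': [0, np.nan, 2],
    'B_y': [3, 4, 5],
    'C_x': [6, 8, np.nan],
    'C_y': [7, 9, 2],
})

# Everything that is not a _y column, with a trailing _x removed from the names.
left = (df.filter(regex=r'.*(?<!_y)$')
          .rename(columns=lambda c: re.sub(r'_x$', '', c)))

# Only the _y columns, with the trailing _y removed so the labels match `left`.
right = (df.filter(regex=r'_y$')
           .rename(columns=lambda c: re.sub(r'_y$', '', c)))

# fillna with a DataFrame aligns on index and column labels,
# so B is filled from the old B_y and C from the old C_y.
out = left.fillna(right)
print(out)
```

Column A has no counterpart in right, so it passes through unchanged.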
Upvotes: 2