Reputation: 377
I have a DataFrame:
ID P_1 P_2
1 NaN NaN
2 124 342
3 NaN 234
4 123 NaN
5 2345 500
I want to make a new column titled P_3 such that:
ID P_1 P_2 P_3
1 NaN NaN NaN
2 124 342 342
3 NaN 234 234
4 123 NaN 123
5 2345 500 500
My conditions are:
if P_1 is NaN, then P_3 = P_2
if P_1 is not NaN and P_2 is not NaN, then P_3 = P_2
if P_2 is NaN, then P_3 = P_1
I tried the following code:
conditions = [
    (df['P_1'] == float('NaN')),
    (df['P_1'] != float('NaN')) & (df['P_2'] != float('NaN')),
    (df['P_1'] != float('NaN')) & (df['P_2'] == float('NaN'))
]
values = [df['P_2'], df['P_2'], df['P_1']]
df['P_3'] = np.select(conditions, values)
But it gives me the following error:
Length of values does not match length of index
Upvotes: 1
Views: 62
Reputation: 113955
Another approach:
In [93]: df
Out[93]:
p1 p2
0 NaN NaN
1 124.0 342.0
2 NaN 234.0
3 123.0 NaN
4 2345.0 500.0
In [94]: df['p3'] = df.p2
In [95]: df
Out[95]:
p1 p2 p3
0 NaN NaN NaN
1 124.0 342.0 342.0
2 NaN 234.0 234.0
3 123.0 NaN NaN
4 2345.0 500.0 500.0
In [96]: df.loc[df.p3.isna(), 'p3'] = df[df.p3.isna()]['p1']
In [97]: df
Out[97]:
p1 p2 p3
0 NaN NaN NaN
1 124.0 342.0 342.0
2 NaN 234.0 234.0
3 123.0 NaN 123.0
4 2345.0 500.0 500.0
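For anyone who wants to run this outside an interactive session, here is a minimal sketch of the same two steps as a script. The frame is rebuilt by hand from the sample data, and the names `missing` and `p3_alt` are just illustrative. The last line shows `fillna` with a Series argument, which should give the same result in one step.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (column names as in this answer).
df = pd.DataFrame({
    "p1": [np.nan, 124, np.nan, 123, 2345],
    "p2": [np.nan, 342, 234, np.nan, 500],
})

# Step 1: start from p2.
df["p3"] = df["p2"]

# Step 2: where p3 is still missing, take the value from p1.
missing = df["p3"].isna()
df.loc[missing, "p3"] = df.loc[missing, "p1"]

# One-step alternative: fill the gaps in p2 with the aligned values of p1.
p3_alt = df["p2"].fillna(df["p1"])
```

Rows where both columns are missing stay NaN in either version.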
Upvotes: 2
Reputation: 120409
In summary, your three conditions reduce to a single rule:
P_3 = P_2 if P_2 is not NaN else P_1
combine_first: updates null elements with the value in the same location in other (ref: pandas docs).
>>> df["P_2"].combine_first(df["P_1"])
ID
1 NaN
2 342.0
3 234.0
4 123.0
5 500.0
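To tie this back to the question, here is a small self-contained sketch, assuming the sample data is indexed by ID as in the output above. It assigns the `combine_first` result to `P_3`, and it also shows why the original `np.select` conditions can never match: NaN does not compare equal to anything, itself included, so the tests need `isna()` / `notna()` instead of `==` / `!=`. The names `never_equal` and `p3_select` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with ID as the index (an assumption).
df = pd.DataFrame(
    {
        "P_1": [np.nan, 124, np.nan, 123, 2345],
        "P_2": [np.nan, 342, 234, np.nan, 500],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="ID"),
)

# Comparing against NaN with == is False on every row, even the NaN rows.
never_equal = (df["P_1"] == float("NaN")).any()

# Prefer P_2, fall back to P_1 where P_2 is missing.
df["P_3"] = df["P_2"].combine_first(df["P_1"])

# The np.select route, rewritten with notna(); the first matching condition wins.
conditions = [df["P_2"].notna(), df["P_1"].notna()]
choices = [df["P_2"], df["P_1"]]
p3_select = np.select(conditions, choices, default=np.nan)
```

Both routes leave NaN where P_1 and P_2 are both missing. `combine_first` is shorter, while `np.select` may be easier to extend if more columns or rules are added later.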
Upvotes: 2